Show HN: Memweave CLI – search your AI agent's memory from the shell
hackernews
#claude
#letta
#mem0
#mnemora
#local-memory
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
Memweave is a zero-infrastructure Python library that makes an AI agent's memory persistent and searchable by storing it as Markdown files indexed in SQLite. It combines BM25 keyword search with semantic vector search, so it finds not only exact matches but also conceptually related content. Every memory is stored as a plain text file that you can edit or version-control directly, and embeddings are managed efficiently through caching.
Full text
**Agent memory you can read, search, and `git diff`.**

memweave is a zero-infrastructure, async-first Python library that gives AI agents persistent, searchable memory — stored as plain Markdown files and indexed by SQLite. No external services. No black-box databases. Every memory is a file you can open, edit, grep, and version-control.

- 📄 **Human-readable by design.** Memories live in plain `.md` files on disk. Open them in your editor, inspect them in your terminal, or `git diff` what your agent learned between runs.
- 🔍 **Hybrid search out of the box.** Combines BM25 keyword ranking (FTS5) with semantic vector search (sqlite-vec) and merges them — so "PostgreSQL JSONB" finds both exact matches and conceptually related content.
- ⚡ **Zero LLM calls on core operations.** Writing and searching memories never touches an LLM. Embeddings are cached by content hash — compute once, reuse forever.
- 🌐 **Works completely offline.** If your embedding API is down, memweave falls back to pure keyword search. It never crashes; it degrades gracefully.
- 💸 **Zero server cost, zero setup.** The entire memory store is a single SQLite file on disk — no vector database to provision, no cloud service to pay for, no Docker container to manage.
- 🔌 **Pluggable at every layer.** Swap in a custom search strategy, add a post-processing step, or bring your own embedding provider via a single protocol.
- 📅 **Memories age naturally.** Recent knowledge ranks above stale context automatically — no manual cleanup, no ever-growing noise. Foundational facts stay exempt.
- 🎯 **No redundant results.** MMR re-ranking ensures the top results cover different aspects of your query — not the same fact repeated from five slightly different chunks.

**Contents**

- Quickstart Guide
- How it works
- Core concepts
- CLI
- Usage examples
- Configuring memweave
- API reference
- Contributing
- License

**Quickstart**

```bash
pip install memweave
```

Set an embedding provider (or skip to use keyword-only mode):

```bash
export OPENAI_API_KEY=sk-...
```

```python
import asyncio
from pathlib import Path

from memweave import MemWeave, MemoryConfig


async def main():
    async with MemWeave(MemoryConfig(workspace_dir=".")) as mem:
        # Write a memory file, then index it
        memory_file = Path("memory/preferences.md")
        memory_file.parent.mkdir(exist_ok=True)
        memory_file.write_text("The user prefers dark mode and concise answers.")
        await mem.add(memory_file)

        # Search across all memories.
        # min_score=0.0 ensures results surface in a small corpus;
        # in production the default 0.35 threshold filters low-confidence matches.
        results = await mem.search("What is the user preference?", min_score=0.0)
        for r in results:
            print(f"[{r.score:.2f}] {r.snippet} ← {r.path}:{r.start_line}")


asyncio.run(main())
```

Memories are plain Markdown files in `memory/`. Inspect them any time:

```bash
cat memory/*.md
```

Each result includes a relevance score and the exact file and line it came from:

```
[0.35] The user prefers dark mode and concise answers.  ← memory/preferences.md:1
```
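Each `SearchResult` carries the fields used in the quickstart (`score`, `snippet`, `path`, `start_line`), which is enough to fold retrieved memories into an agent's prompt. A minimal sketch; the `build_context` helper and its character budget are illustrative, not part of memweave:

```python
import asyncio

from memweave import MemWeave, MemoryConfig


def build_context(results, max_chars: int = 2000) -> str:
    """Assemble retrieved snippets into a prompt section, in the ranked order returned."""
    lines, used = [], 0
    for r in results:
        line = f"- ({r.score:.2f}) {r.snippet}  [{r.path}:{r.start_line}]"
        if used + len(line) > max_chars:
            break
        lines.append(line)
        used += len(line)
    return "Relevant memories:\n" + "\n".join(lines)


async def main():
    async with MemWeave(MemoryConfig(workspace_dir=".")) as mem:
        results = await mem.search("user preferences")
        print(build_context(results))


asyncio.run(main())
```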
**How it works**

memweave separates storage from search:

```
┌──────────────────────────────────────────────────────────────┐
│ SOURCE OF TRUTH (Markdown files)                             │
│   memory/MEMORY.md          ← evergreen knowledge            │
│   memory/2026-03-21.md      ← daily logs                     │
│   memory/agents/coder/      ← agent-scoped namespace         │
└───────────────────────┬──────────────────────────────────────┘
                        │ chunking → hashing → embedding
┌───────────────────────▼──────────────────────────────────────┐
│ DERIVED INDEX (SQLite)                                        │
│   chunks            — text + metadata                         │
│   chunks_fts        — FTS5 full-text index (BM25)             │
│   chunks_vec        — sqlite-vec SIMD index (cosine)          │
│   embedding_cache   — hash → vector (skip re-embedding)       │
│   files             — SHA-256 change detection                │
└───────────────────────┬──────────────────────────────────────┘
                        │ hybrid merge → post-processing
                        ▼
               list[SearchResult]
```

**Write path** — `await mem.add(path)` takes any Markdown file you've written — dated, evergreen, agent-scoped, or session — chunks it, checks the embedding cache (hash lookup), calls the embedding API only on a miss, and inserts into both the FTS5 and vector tables. No LLM involved.

**Search path** — `await mem.search(query)` embeds the query, runs vector search and keyword search in parallel, merges scores (`0.7 × vector + 0.3 × BM25`), applies post-processors (threshold → temporal decay → MMR), and returns ranked results.

The SQLite index is a derived cache — always rebuildable from the Markdown files. This means:

- You can edit memories directly in your editor and re-index with `await mem.index()`.
- `git diff memory/` shows exactly what an agent learned between commits.
- Losing the database is not data loss. Losing the files is.

**Core concepts**

| File | Behaviour |
|---|---|
| `memory/MEMORY.md` | Evergreen — never decays, write-protected during `flush()` |
| `memory/2026-03-21.md` | Dated — subject to temporal decay (older memories rank lower) |
| `memory/researcher_agent/` | Agent-scoped — isolated namespace per agent |
| `memory/episodes/known-facts.md` | Evergreen — non-dated file in a subdirectory, always full score |
| `memory/sessions/2026-04-01.md` | Dated — subdirectory dated file, decays by filename date |

Evergreen files hold foundational facts that should always surface at full score. Dated files accumulate daily learning and fade naturally — recent memories rank higher.

Every file gets a source label derived from its path — the immediate subdirectory under `memory/` becomes the label:

| File path | `source` |
|---|---|
| `memory/notes.md` | `"memory"` |
| `memory/sessions/2026-04-03.md` | `"sessions"` |
| `memory/researcher_agent/findings.md` | `"researcher_agent"` |
| Outside `memory/` | `"external"` |

Pass `source_filter="researcher_agent"` to `search()` to scope results exclusively to that namespace. Only the first path component counts — `memory/researcher_agent/sub/x.md` has source `"researcher_agent"`, not `"sub"`.
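The two points above, re-indexing hand-edited files with `mem.index()` and scoping a query with `source_filter`, compose naturally. A short sketch (the query text and file layout are illustrative):

```python
import asyncio

from memweave import MemWeave, MemoryConfig


async def main():
    async with MemWeave(MemoryConfig(workspace_dir=".")) as mem:
        # Suppose memory/researcher_agent/findings.md was edited by hand;
        # re-scan the workspace so the derived SQLite index catches up.
        await mem.index()

        # Restrict results to the researcher agent's namespace.
        results = await mem.search(
            "open questions about the dataset",
            source_filter="researcher_agent",
        )
        for r in results:
            print(f"[{r.score:.2f}] {r.path}:{r.start_line}  {r.snippet}")


asyncio.run(main())
```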
Every `mem.search(query)` call moves through five fixed stages in order:

```
                            query
                              │
                 ┌────────────┴────────────┐
                 │                         │
        FTS5 BM25 (keyword)    sqlite-vec ANN (semantic)
        exact term matching        cosine similarity
                 │                         │
                 └────────────┬────────────┘
                              │ weighted merge
                              │ score = 0.7 × vector + 0.3 × BM25
                              │
                     ┌────────▼────────┐
                     │ ScoreThreshold  │  drop results below min_score (default 0.35)
                     └────────┬────────┘
                              │
                     ┌────────▼────────┐
                     │ TemporalDecay   │  multiply score by exp(−λ × age_days)
                     │ (opt-in)        │  evergreen files exempt
                     └────────┬────────┘
                              │
                     ┌────────▼────────┐
                     │ MMR Reranker    │  reorder for relevance + diversity
                     │ (opt-in)        │  λ × relevance − (1−λ) × similarity_to_selected
                     └────────┬────────┘
                              │
                     ┌────────▼────────┐
                     │ Custom          │  your own PostProcessor(s)
                     │ processors      │  via mem.register_postprocessor()
                     └────────┬────────┘
                              │
                      list[SearchResult]
```

**Stage 1 — Hybrid merge.** Both backends run against the same query. FTS5 BM25 catches exact keyword matches (error codes, config values, proper names). sqlite-vec cosine catches semantically related content even when no keyword overlaps. Scores are normalised and merged: `0.7 × vector_score + 0.3 × bm25_score`. Weights are tunable via `HybridConfig`.

**Stage 2 — Score threshold.** Drops any result whose merged score is below `min_score` (default `0.35`). Acts as a noise gate — prevents low-confidence matches from entering the post-processing stages. Always active; override per-call with `mem.search(query, min_score=0.5)`.

**Stage 3 — Temporal decay (opt-in).** Multiplies each result's score by an exponential factor based on the age of its source file. Recent memories rank higher; old ones fade naturally. Evergreen files are exempt and always surface at full score. See Temporal decay below.

**Stage 4 — MMR re-ranking (opt-in).** Reorders the remaining results to balance relevance against diversity. Prevents the top results from being near-duplicates of each other. See MMR re-ranking below.

**Stage 5 — Custom processors.** Any processors registered with `mem.register_postprocessor()` run last, in registration order. Each receives the output of the previous stage and can filter, reorder, or rescore freely.
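As a Stage 5 example: the excerpt names `mem.register_postprocessor()` but doesn't show the `PostProcessor` interface, so the sketch below assumes a processor exposes a method that takes and returns the result list. Treat the method name and signature as placeholders and check memweave's API reference for the real protocol.

```python
class DropScratchNotes:
    """Hypothetical post-processor: hide results that come from a scratch namespace.

    The `process` method name and its signature are assumptions about memweave's
    PostProcessor protocol, not documented API.
    """

    def process(self, results):
        # Each result exposes .path (as used in the quickstart); filter on it.
        return [r for r in results if "scratch" not in str(r.path)]


# Registration itself is documented above; only the processor's shape is assumed:
# mem.register_postprocessor(DropScratchNotes())
```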
**Temporal decay**

Agents accumulate knowledge over time — but not all knowledge ages equally. A decision made yesterday should outrank one made six months ago when both are semantically relevant. Without decay, a stale debugging note from last quarter can surface above this morning's architecture decision simply because it embeds well.

Temporal decay solves this by multiplying each result's score by a factor that shrinks the older the source file is. The score is never zeroed out — old memories still surface, they just rank lower than recent ones.

How the formula works:

```
λ              = ln(2) / half_life_days
multiplier     = exp(−λ × age_days)
decayed_score  = original_score × multiplier
```

At `age_days = 0` the multiplier is `1.0` — no change. At `age_days = half_life_days` it is exactly `0.5`. The curve is smooth and continuous, so a file that is two half-lives old scores at `0.25×`, three half-lives at `0.125×`, and so on.

With the default `half_life_days=30`:

| File age | Multiplier | Effect on a 0.80 score |
|---|---|---|
| Today | 1.00 | 0.80 (unchanged) |
| 30 days | 0.50 | 0.40 |
| 60 days | 0.25 | 0.20 |
| 90 days | 0.13 | 0.10 |

How age is determined, by file category:

| File | Age source | Decays? |
|---|---|---|
| `memory/MEMORY.md`, `memory/architecture.md` (any non-dated file directly under `memory/`) | — | No — evergreen, always full score |
| `memory/agents/notes.md` (non-dated file in any `memory/` subdirectory) | — | No — evergreen, same rule as root non-dated files |
| `memory/2026-03-21.md` (dated daily log) | Date parsed from filename | Yes |
| `memory/sessions/2026-03-21.md` (dated file in any `memory/` subdirectory) | Date parsed from filename | Yes — same rule as root dated files |

Evergreen files hold foundational facts — stack choices, hard constraints, permanent preferences — that should always surface at full score regardless of when they were written. Daily logs capture evolving context and fade naturally as new sessions add fresher knowledge.

Enabling temporal decay:

```python
from memweave import MemWeave
from memweave.config import MemoryConfig, QueryConfig, TemporalDecayConfig

config = MemoryConfig(
    query=QueryConfig(
        temporal_decay=TemporalDecayConfig(
            enabled=True,
            half_life_days=30.0,  # score halves every 30 days; tune to your workflow
        ),
    ),
)

async with MemWeave(config) as mem:
    results = await mem.search("database choice")
    # results from last week will rank above results from last quarter
    # results from memory/MEMORY.md are exempt and always surface at full score
```

Tune `half_life_days` to your workflow: `7` for fast-moving projects where week-old context is already stale, `90` for research or documentation repositories where knowledge stays relevant for months.

**MMR re-ranking**

Without diversity control, the top results from a hybrid search are often near-duplicates — multiple chunks from the same file, or different phrasings of the same fact. An agent loading all of them into its context window wastes tokens and misses other relevant but different memories.

MMR (Maximal Marginal Relevance) reorders results after scoring to balance how relevant a result is against how similar it is to results already selected. At each step it picks the candidate that maximises:

```
MMR score = λ × relevance − (1−λ) × max_similarity_to_already_selected
```

Similarity is computed as Jaccard overlap between the token sets of the candidate and each already-selected result. This means two chunks that share many of the same words — even from different files — are treated as redundant, and the second one is pushed down in favour of something genuinely different.

The `lambda_param` dial:

| `lambda_param` | Behaviour |
|---|---|
| 1.0 | Pure relevance — identical to no MMR (no-op) |
| 0.7 | Default — strong relevance bias, light diversity push |
| 0.5 | Equal weight — relevance and diversity balanced |
| 0.0 | Pure diversity — maximally novel results, relevance ignored |

Enabling MMR:

```python
from memweave import MemWeave
from memweave.config import MemoryConfig, QueryConfig, MMRConfig

config = MemoryConfig(
    query=QueryConfig(
        mmr=MMRConfig(
            enabled=True,
            lambda_param=0.7,  # 0 = max diversity, 1 = max relevance
        ),
    ),
)

async with MemWeave(config) as mem:
    results = await mem.search("deployment steps")
    # top results will cover different aspects of deployment
    # rather than returning the same facts from multiple angles

    # override λ per-call without touching the config
    diverse = await mem.search("deployment steps", mmr_lambda=0.3)
```

MMR runs after temporal decay, so the diversity pass operates on already age-adjusted scores — the reranker sees a realistic picture of which results actually matter before deciding what is redundant.
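To make the selection rule concrete, here is a toy re-implementation of the greedy MMR loop described above, using Jaccard token overlap as the similarity measure. It illustrates the formula only and is not memweave's internal code:

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two snippets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def mmr_rerank(candidates: list[tuple[str, float]], lam: float = 0.7) -> list[tuple[str, float]]:
    """Greedily pick the item maximising lam*relevance - (1-lam)*max similarity to selected."""
    selected: list[tuple[str, float]] = []
    remaining = list(candidates)
    while remaining:
        def mmr_score(item: tuple[str, float]) -> float:
            text, relevance = item
            redundancy = max((jaccard(text, chosen) for chosen, _ in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


chunks = [
    ("deploy with `make release` then tag the commit", 0.82),
    ("deploy with `make release` and then tag the release commit", 0.81),  # near-duplicate
    ("rollbacks: revert the tag and redeploy the previous image", 0.64),
]
# With lam=0.7 the near-duplicate drops below the rollback note.
for text, rel in mmr_rerank(chunks, lam=0.7):
    print(f"{rel:.2f}  {text}")
```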
**CLI**

`pip install memweave` registers a `memweave` binary alongside the Python library. Every command is a thin shell over the same MemWeave public methods, so anything you can do from Python you can do from a terminal, a shell script, or a CI step — without writing a single line of Python.

This is particularly useful for:

- **Inspecting agent memory without opening a Python REPL** — browse what's indexed, check scores, read snippets directly in the terminal.
- **Shell scripts and CI pipelines** — index a workspace after a build, search for a known fact and fail the pipeline if it isn't there, or export results as JSON for downstream tools.
- **Agents that orchestrate subprocesses** — an LLM running a bash tool can call `memweave search` and parse the JSON output without embedding the library (see the sketch at the end of this section).

All commands accept `--workspace` / `-w` to point at any directory and `--embedding-model` to override the model. Every command that produces structured output accepts `--json` for machine-readable output.

**`memweave index`** — Scan the workspace for `.md` files and embed any that have changed since the last run. Uses SHA-256 hashing to skip unchanged files — fast on large workspaces.

```bash
# Index a workspace
memweave index --workspace ./my_project --embedding-model text-embedding-3-small

# Force re-embed everything regardless of hash
memweave index --workspace ./my_project --embedding-model text-embedding-3-small --force
```

**`memweave add`** — Index a single file immediately. Useful after writing a new memory file that you want available in search right away, without scanning the whole workspace. The path is resolved from your current working directory (like any shell command), not from `--workspace`. So if your workspace is at `./my_project`, run from its parent:

```bash
# Run from the parent of my_project/
memweave add my_project/memory/2026-04-25.md --workspace ./my_project --embedding-model text-embedding-3-small

# Or cd into the workspace first, then the path is relative to CWD
cd my_project
memweave add memory/infrastructure.md --workspace . --embedding-model text-embedding-3-small

# Force re-index even if the file hasn't changed
memweave add my_project/memory/architecture.md --workspace ./my_project --embedding-model text-embedding-3-small --force
```

**`memweave files`** — List every file currently tracked in the index with its source label, chunk count, and whether it is evergreen.

```bash
# Filter to a specific source namespace
memweave files --workspace ./my_project --source sessions

# Machine-readable output
memweave files --workspace ./my_project --json

# Table view
memweave files --workspace ./my_project
```

Example output (truncated):

```
Path             Source    Chunks    Evergreen
memory/2026-0
```
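For the subprocess-orchestration case mentioned above, a minimal Python sketch. `--workspace` and `--json` are documented flags, but the positional query argument and the JSON schema are assumptions to verify against `memweave search --help`:

```python
import json
import subprocess

# Hypothetical invocation: the query is passed as a positional argument here.
proc = subprocess.run(
    ["memweave", "search", "postgres jsonb decision", "--workspace", "./my_project", "--json"],
    capture_output=True,
    text=True,
    check=True,
)

# The README excerpt doesn't show the JSON schema, so inspect it before relying on fields.
results = json.loads(proc.stdout)
print(json.dumps(results, indent=2))
```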
This analysis was produced by the Genesis Park editorial team with the help of AI. The original post is available via the source link.