Show HN: ClawMem – Open-source agent memory with SOTA local GPU retrieval

hackernews | 📦 open source
#ai agents #claude #mcp server #local gpu #memory retrieval #open source #hardware/semiconductors
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

ClawMem is an open-source local memory system for AI agents such as Claude Code. It combines recent retrieval-augmentation research with on-device GPU inference to automatically manage project documents and decisions, with no API keys required. It provides multi-signal retrieval (BM25, vector search, and cross-encoder reranking) and metadata that evolves through user feedback. It integrates via Claude Code hooks or an MCP server, supports Linux, macOS, and WSL2, and lets all AI runtimes share a single memory vault.

Full text

On-device memory for Claude Code and AI agents. Retrieval-augmented search, hooks, and an MCP server in a single local system. No API keys, no cloud dependencies.

ClawMem fuses recent research into a retrieval-augmented memory layer that agents actually use. The hybrid architecture combines QMD-derived multi-signal retrieval (BM25 + vector search + reciprocal rank fusion + query expansion + cross-encoder reranking), SAME-inspired composite scoring (recency decay, confidence, content-type half-lives, co-activation reinforcement), MAGMA-style intent classification with multi-graph traversal (semantic, temporal, and causal beam search), and A-MEM self-evolving memory notes that enrich documents with keywords, tags, and causal links between entries. Pattern extraction from Engram adds deduplication windows, frequency-based durability scoring, and temporal navigation.

It integrates via Claude Code hooks, an MCP server (works with any MCP-compatible client, including OpenClaw), or a native OpenClaw ContextEngine plugin. All paths write to the same local SQLite vault: a decision captured during a Claude Code session shows up immediately when an OpenClaw agent picks up the same project. TypeScript on Bun. MIT License.

ClawMem turns your markdown notes, project docs, and research dumps into persistent memory for AI coding agents.
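The reciprocal rank fusion (RRF) step named above is the glue between the BM25 and vector channels. A minimal TypeScript sketch of the technique follows; the function name `rrfFuse` and the constant k = 60 are illustrative assumptions, not ClawMem's actual API:

```typescript
type Ranked = { id: string };

// RRF: score(d) = sum over result lists of 1 / (k + rank(d)),
// where rank is 1-based. k = 60 is the value commonly used in the literature.
function rrfFuse(lists: Ranked[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((doc, i) => {
      scores.set(doc.id, (scores.get(doc.id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: fuse a BM25 ranking with a vector-search ranking.
const fused = rrfFuse([
  [{ id: "a" }, { id: "b" }], // BM25 order
  [{ id: "b" }, { id: "c" }], // vector order
]);
```

Because RRF only looks at ranks, it needs no score normalization across channels, which is why it is a popular fusion choice for hybrid retrieval.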
It automatically:

- Surfaces relevant context on every prompt (context-surfacing hook)
- Bootstraps sessions with your profile, latest handoff, recent decisions, and stale notes
- Captures decisions from session transcripts using a local GGUF observer model
- Generates handoffs at session end so the next session can pick up where you left off
- Learns what matters via a feedback loop that boosts referenced notes and decays unused ones
- Guards against prompt injection in surfaced content
- Classifies query intent (WHY / WHEN / ENTITY / WHAT) to weight search strategies
- Traverses multi-graphs (semantic, temporal, causal) via adaptive beam search
- Evolves memory metadata as new documents create or refine connections
- Infers causal relationships between facts extracted from session observations
- Detects contradictions between new and prior decisions, auto-decaying superseded ones
- Scores document quality using structure, keywords, and metadata richness signals
- Boosts co-accessed documents: notes frequently surfaced together get retrieval reinforcement
- Decomposes complex queries into typed retrieval clauses (BM25/vector/graph) for multi-topic questions
- Cleans stale embeddings automatically before embed runs, removing orphans from deleted or changed documents
- Transaction-safe indexing: a crash mid-index leaves zero partial state (atomic commit with rollback)
- Deduplicates hook-generated observations within a 30-minute window using normalized content hashing, preventing memory bloat from repeated hook output
- Navigates temporal neighborhoods around any document via the timeline tool: progressive disclosure from search to chronological context to full content
- Boosts frequently revised memories: documents with higher revision counts get a durability signal in composite scoring (capped at 10%)
- Supports a pin/snooze lifecycle for persistent boosts and temporary suppression
- Manages document lifecycle: policy-driven archival sweeps with restore capability
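The 30-minute observation dedup mentioned above can be sketched in a few lines: normalize the content, hash it, and drop repeats seen inside the window. This is a minimal sketch assuming SHA-256 hashing and an in-memory last-seen map; all names (`shouldStore`, `normalize`) are hypothetical, not ClawMem's real internals:

```typescript
import { createHash } from "node:crypto";

const WINDOW_MS = 30 * 60 * 1000; // the 30-minute dedup window
const seen = new Map<string, number>(); // content hash -> last-seen timestamp

// Normalize so that trivial whitespace/case differences hash identically.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Returns true if the observation should be stored, false if it is a
// duplicate seen within the window. Always refreshes the last-seen time.
function shouldStore(observation: string, now: number): boolean {
  const hash = createHash("sha256").update(normalize(observation)).digest("hex");
  const last = seen.get(hash);
  seen.set(hash, now);
  return last === undefined || now - last > WINDOW_MS;
}

const t0 = Date.now();
const first = shouldStore("Build  passed", t0);          // new -> store
const repeat = shouldStore("build passed", t0 + 1_000);  // normalized dup -> skip
const later = shouldStore("build passed", t0 + 31 * 60 * 1000); // window expired
```

Normalizing before hashing is what keeps hook output that differs only in whitespace or casing from bloating memory.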
- Auto-routes queries via memory_retrieve: classifies intent and dispatches to the optimal search backend
- Syncs project issues from Beads issue trackers into searchable memory

Runs fully local with no API keys and no cloud services. Integrates via Claude Code hooks and MCP tools, or as an OpenClaw ContextEngine plugin. Both modes share the same vault for cross-runtime memory. Works with any MCP-compatible client.

- Entity resolution + co-occurrence graph: LLM entity extraction with quality filters, type-agnostic canonical resolution within compatibility buckets (extensible type vocabulary), IDF-based entity edge scoring, co-occurrence tracking, and entity graph traversal for ENTITY intent queries
- MPFP graph retrieval: Multi-Path Fact Propagation with meta-path patterns per intent, a hop-synchronized edge cache, and Forward Push with α=0.15 teleport probability; replaces single-beam traversal for causal/entity/temporal queries
- Temporal query extraction: regex-based date-range extraction from natural-language queries ("last week", "March 2026"), wired as WHERE filters into BM25 and vector search
- 4-way parallel retrieval: temporal proximity and entity graph channels added as parallel RRF legs in the query tool (Tier 3 only), alongside the existing BM25 + vector channels
- 3-tier consolidation: facts to observations (auto-generated, with proof_count and trend enum) to mental models; a background worker synthesizes clusters of related observations into consolidated patterns
- Observation invalidation: soft invalidation (invalidated_at/invalidated_by/superseded_by columns); observations with confidence ≤ 0.2 after contradiction are filtered fro

This analysis was produced by the Genesis Park editorial team with AI assistance. The original post is available via the source link.
