Xmemory: Benchmarking Structured AI Memory Against RAG and Hybrid RAG

hackernews | 📰 News
#machine-learning/research
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

The paper points out that existing AI memory approaches fall short of what production environments require, such as exact facts and state management. To address this, it proposes schema-grounded memory, introducing an iterative extraction process that separates object detection from field detection at write time and passes each record through validation. Instead of reasoning at retrieval time, this approach interprets information at write time, making constrained queries over verified records possible.
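The read-path contrast can be illustrated with a minimal sketch. The records, field names, and values below are hypothetical, not from the paper; the point is that once records are verified at write time, reads become plain filters and aggregations rather than repeated model inference over retrieved prose.

```python
# Hypothetical verified records produced by a schema-grounded write path.
records = [
    {"type": "subscription", "user": "alice", "status": "active", "seats": 5},
    {"type": "subscription", "user": "bob", "status": "cancelled", "seats": 2},
]

# Constrained queries: exact facts and aggregation come from the records
# themselves, not from asking a model to re-read stored text.
active = [r for r in records if r["status"] == "active"]
total_seats = sum(r["seats"] for r in active)

# Negative queries and explicit unknowns are answerable too: an absent
# record means "not known", not "have the model guess".
has_carol = any(r["user"] == "carol" for r in records)
```

With RAG-style recall, each of these questions would require embedding a query, retrieving prose, and trusting the model's interpretation; here they are deterministic operations over structured state.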

Full Text

Computer Science > Artificial Intelligence

Title: From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction

Abstract: Persistent AI memory is often reduced to a retrieval problem: store prior interactions as text, embed them, and ask the model to recover relevant context later. This design is useful for thematic recall, but it is mismatched to the kinds of memory that agents need in production: exact facts, current state, updates and deletions, aggregation, relations, negative queries, and explicit unknowns. These operations require memory to behave less like search and more like a system of record. This paper argues that reliable external AI memory must be schema-grounded. Schemas define what must be remembered, what may be ignored, and which values must never be inferred. We present an iterative, schema-aware write path that decomposes memory ingestion into object detection, field detection, and field-value extraction, with validation gates, local retries, and stateful prompt control. The result shifts interpretation from the read path to the write path: reads become constrained queries over verified records rather than repeated inference over retrieved prose. We evaluate this design on structured extraction and end-to-end memory benchmarks. On the extraction benchmark, the judge-in-the-loop configuration reaches 90.42% object-level accuracy and 62.67% output accuracy, above all tested frontier structured-output baselines. On our end-to-end memory benchmark, xmemory reaches 97.10% F1, compared with 80.16%-87.24% across the third-party baselines. On the application-level task, xmemory reaches 95.2% accuracy, outperforming specialised memory systems, code-generated Markdown harnesses, and customer-facing frontier-model application harnesses.
The results show that, for memory workloads requiring stable facts and stateful computation, architecture matters more than retrieval scale or model strength alone.
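The write path described in the abstract, decomposing ingestion into detection and extraction steps behind a validation gate with local retries, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `Schema`, `validate`, `ingest`, and the `extract` callable are all assumed names, and `extract` stands in for the LLM-backed detection/extraction stages.

```python
# Hypothetical sketch of an iterative, schema-aware write path.
from dataclasses import dataclass


@dataclass
class Schema:
    """Defines what must be remembered and which values are legal."""
    required: set   # field names that must be present
    allowed: dict   # field name -> validator callable


def validate(record: dict, schema: Schema):
    """Validation gate: every required field present, every value legal.

    Values that fail validation are rejected rather than inferred,
    mirroring the paper's 'never inferred' constraint.
    """
    missing = schema.required - record.keys()
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    for name, value in record.items():
        check = schema.allowed.get(name)
        if check and not check(value):
            return False, f"invalid value for {name!r}: {value!r}"
    return True, "ok"


def ingest(text: str, schema: Schema, extract, max_retries: int = 2):
    """Iterative write path: extract, validate, locally retry on failure.

    `extract(text, feedback)` represents the object/field/value
    extraction stages; `feedback` carries the validator's error back
    into the prompt so the retry is targeted, not a blind re-run.
    """
    feedback = None
    for _ in range(max_retries + 1):
        record = extract(text, feedback)
        ok, feedback = validate(record, schema)
        if ok:
            return record  # verified record, safe for constrained reads
    raise ValueError(f"extraction failed validation: {feedback}")
```

The key design point the abstract emphasizes is visible in the loop: interpretation happens once, at write time, and only records that pass the gate are ever stored, so the read path never has to re-infer meaning from prose.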

This analysis was written by the Genesis Park editorial team with the assistance of AI. The original article can be found via the source link.
