I'm building an AI agent that learns from every task

hackernews | April 14, 2026, 00:56 | 📦 Open Source

#ai models #anthropic #claude #gemini #llama #openai

Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

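Nexus is an open-source autonomous AI agent that learns from every task it performs, so it becomes more capable and cheaper to run over time. Its architecture is based on dual-process theory from cognitive science and dynamically balances fast skill execution against slow reasoning. In the README's example figures, by the 100th task the cost per task falls from $0.15 to $0.04 (about 73%) and the run time from 3 minutes to 45 seconds (75%). Nexus supports multiple LLM providers, including Anthropic and OpenAI, and ships with security and governance systems so that it can be self-hosted safely.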
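To make the fast/slow trade-off concrete, here is a minimal sketch of how a dual-process router could choose between the two paths. It is not taken from the Nexus codebase: the `Task`, `Skill`, and `RouterConfig` types, the `route` function, and all threshold values are hypothetical, and the risk and complexity scores are assumed to be estimated elsewhere.

```typescript
// Hypothetical sketch of a System 1 / System 2 router.
// None of these names or thresholds come from the Nexus source.
type Route = "system1" | "system2";

interface Skill {
  name: string;
  confidence: number; // 0..1, e.g. a statistical score of past success
}

interface Task {
  description: string;
  risk: number;       // 0..1, assumed to be estimated upstream
  complexity: number; // 0..1, assumed to be estimated upstream
}

interface RouterConfig {
  minConfidence: number; // lowest skill confidence allowed on the fast path
  maxRisk: number;       // tasks riskier than this always get full reasoning
  maxComplexity: number; // tasks more complex than this always get full reasoning
}

// Illustrative defaults only.
const defaults: RouterConfig = { minConfidence: 0.8, maxRisk: 0.3, maxComplexity: 0.5 };

// Take the fast path only when a trusted skill matches and the task is low-stakes;
// everything else falls back to full LLM reasoning.
function route(task: Task, match: Skill | undefined, cfg: RouterConfig = defaults): Route {
  if (!match) return "system2";
  if (task.risk > cfg.maxRisk || task.complexity > cfg.maxComplexity) return "system2";
  return match.confidence >= cfg.minConfidence ? "system1" : "system2";
}
```

Under these assumptions, a low-risk task with a well-tested matching skill is routed to `"system1"`, while an unmatched or high-risk task always goes to `"system2"`. Defaulting to the slow path whenever anything is uncertain is a deliberate design choice in this sketch: a wrong fast answer costs more than an unnecessary slow one.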

Body

The AI agent that gets smarter and cheaper over time. Fully open-source. Self-hosted. Every LLM provider. Powered by Bun.

Nexus is an autonomous AI agent that learns from every task. Unlike coding copilots tethered to an IDE or chatbot wrappers around a single API, Nexus uses a dual-process architecture inspired by cognitive science to continuously improve performance and reduce costs over time.

Quick start:

    # 1. Clone and install
    git clone https://github.com/your-org/nexus.git
    cd nexus
    bun install

    # 2. Run the setup wizard (recommended)
    bun run dev setup

    # Or configure manually
    echo "ANTHROPIC_API_KEY=sk-..." > .env

    # 3. Run
    bun run dev

The setup wizard will guide you through provider selection, API key configuration, model selection, and budget settings.

Documentation:

- Getting Started: Installation and quickstart
- Configuration: Configuration options
- Skills System: How Nexus learns from tasks
- Modes: Create specialized agents
- Memory System: Persistent knowledge base
- Tools: Available tools
- Architecture: System architecture
- CLI Reference: CLI commands
- FAQ: Frequently asked questions

Inspired by Kahneman's "Thinking, Fast and Slow":

- System 1: Fast, automatic skill execution (60-80% cheaper)
- System 2: Slow, deliberate full LLM reasoning
- Router assesses task risk and complexity automatically

Autonomous skill creation and improvement:

- Skill Creation: Learns from task trajectories
- Skill Mutation: Self-mutates on failure
- Wilson Confidence: Statistical skill evaluation
- Auto Retirement: Removes underperforming skills

Cost and time per task as skills accumulate:

- Task #1: Full reasoning → $0.15, 3 minutes
- Task #100: Skill match → $0.04, 45 seconds
- Task #1000: Internalized → $0.01, 10 seconds

Supported LLM providers:

- Anthropic (Claude)
- OpenAI (GPT)
- Google Gemini
- Ollama (local, free)
- OpenRouter (200+ models)
- Zero SDK dependencies: direct HTTP calls

Drop a .md file in modes/ to create a specialized agent:

- Coding: Software development
- Research: Analysis and investigation
- Code Review: Structured code review
- DevOps: Infrastructure and deployment
- Writing: Content creation

Security and governance:

- Prompt firewall (12 injection patterns)
- Permission system (path + tool-level)
- Audit logger (immutable trail)
- Behavioral monitoring (anomaly detection)
- Dynamic supervision (HITL approval)

Memory:

- Wiki knowledge base with FTS5 search
- Semantic memory with vector embeddings
- Episodic memory for task outcomes
- User modeling (preferences, patterns)
- Cross-session recall

Architecture:

    ┌─────────────────────────────────────────────────┐
    │  CLI (Interactive REPL · Bun Runtime)           │
    ├─────────────────────────────────────────────────┤
    │  Intelligence Layer                             │
    │  ├── System 1/2 Dual-Process Router             │
    │  ├── Skill Store (Wilson Confidence)            │
    │  ├── Experience Learner (Reflect + Evolve)      │
    │  ├── Mode Manager (Zero-Code Specialization)    │
    │  └── Memory Manager (Wiki + Semantic)           │
    ├─────────────────────────────────────────────────┤
    │  Governance Layer                               │
    │  ├── Permission Guard                           │
    │  ├── Policy Engine                              │
    │  ├── Approval Queue                             │
    │  ├── Budget Store                               │
    │  ├── Audit Logger                               │
    │  └── Behavioral Monitor                         │
    ├─────────────────────────────────────────────────┤
    │  Middleware Pipeline                            │
    │  ├── Timing · Prompt Firewall · Budget Enforcer │
    │  ├── Permission · Network · Supervision         │
    │  ├── Memory Context · Artifact Tracker          │
    │  └── Tool Compactor · Output Scanner · Logger   │
    ├─────────────────────────────────────────────────┤
    │  Agent Core (Tool Dispatch + LLM Loop)          │
    ├─────────────────────────────────────────────────┤
    │  Provider Abstraction (Zero SDK Dependencies)   │
    │  ├── Anthropic · OpenAI · Google · Ollama       │
    │  └── OpenRouter                                 │
    ├─────────────────────────────────────────────────┤
    │  Runtime Layer                                  │
    │  ├── MCP Manager · Cron Scheduler               │
    │  └── Sandbox Manager                            │
    └─────────────────────────────────────────────────┘

Environment variables:

| Variable | Default | Description |
|---|---|---|
| NEXUS_MODEL | anthropic:claude-sonnet-4-20250514 | Model to use |
| NEXUS_BUDGET | 2.0 | Budget per session in USD |
| NEXUS_HOME | .nexus/ | Directory for data |
| ANTHROPIC_API_KEY | (none) | Anthropic API key |
| OPENAI_API_KEY | (none) | OpenAI API key |
| GOOGLE_API_KEY | (none) | Google API key |
| OPENROUTER_API_KEY | (none) | OpenRouter API key |

Setup and diagnostics:

    bun run dev setup

Interactive wizard for:

- Provider selection
- API key configuration
- Model selection
- Budget setting
- Skill installation

    bun run dev doctor

Check configuration and diagnose issues.

Interactive commands:

| Command | Description |
|---|---|
| /help | Show available commands |
| /clear | Clear conversation history |
| /model | Show current model |
| /skills | List learned skills |
| /modes | List available modes |
| /mode | Switch to a mode |
| /stats | Show routing & learning stats |
| /wiki recall | Search wiki memory |
| /tools | List available tools |
| /exit | Exit Nexus |

Project structure:

    nexus/
    ├── packages/
    │   ├── core/           # Agent loop, middleware, tools, types
    │   ├── providers/      # Multi-provider LLM abstraction
    │   ├── intelligence/   # Skills, router, learner, modes
    │   ├── governance/     # Security, permissions, audit
    │   ├── protocols/      # MCP, A2A, Agent Cards
    │   └── runtime/        # Cron, sandbox, scheduling
    ├── apps/
    │   ├── cli/            # Interactive CLI
    │   └── web/            # Web UI (planned)
    ├── docs/               # Documentation
    ├── modes/              # Zero-code modes
    │   ├── coding.md
    │   ├── research.md
    │   ├── code-review.md
    │   ├── devops.md
    │   └── writing.md
    └── .nexus/             # Runtime data
        ├── skills/         # Learned skills
        ├── memory/         # Semantic and episodic memory
        ├── wiki/           # Persistent knowledge base
        ├── audit/          # Audit logs
        ├── sessions/       # Session transcripts
        ├── cron/           # Scheduled jobs
        └── governance/     # Permissions, approvals, budgets

Contributions are welcome! Please read our contributing guidelines before submitting PRs.

MIT License. See LICENSE for details.

Acknowledgments:

- Inspired by cognitive science (Kahneman's dual-process theory)
- Compatible with agentskills.io format
- Built on Bun for performance
- Uses Model Context Protocol for extensibility

Links:

- Documentation
- GitHub Issues
- Discord (coming soon)
- Twitter (coming soon)
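The README lists "Wilson Confidence" and "Auto Retirement" among the skill features without showing how they are computed. One plausible reading is the lower bound of the Wilson score interval applied to each skill's success rate; the sketch below illustrates that idea. The `SkillStats` type, the function names, the `minTrials` and `threshold` defaults, and the retirement rule itself are assumptions for illustration, not Nexus's actual implementation.

```typescript
// Hypothetical sketch: scoring a skill with the Wilson score interval.
// Names, defaults, and the retirement rule are illustrative assumptions.
interface SkillStats {
  successes: number; // tasks where the skill produced an accepted result
  trials: number;    // total tasks where the skill was used
}

// Lower bound of the Wilson score interval for a binomial proportion.
// z = 1.96 corresponds to roughly 95% confidence.
function wilsonLowerBound(successes: number, trials: number, z: number = 1.96): number {
  if (trials === 0) return 0; // no evidence yet, so no confidence
  const p = successes / trials;
  const z2 = z * z;
  const centre = p + z2 / (2 * trials);
  const margin = z * Math.sqrt((p * (1 - p) + z2 / (4 * trials)) / trials);
  return (centre - margin) / (1 + z2 / trials);
}

// Assumed retirement rule: retire a skill once it has enough trials and its
// lower bound still falls below the threshold.
function shouldRetire(stats: SkillStats, minTrials: number = 10, threshold: number = 0.5): boolean {
  if (stats.trials < minTrials) return false; // too little data to judge
  return wilsonLowerBound(stats.successes, stats.trials) < threshold;
}
```

The lower bound penalizes small samples: a skill with 9 successes in 10 uses scores about 0.60, while one with 90 successes in 100 scores about 0.83, even though both have a 90% raw success rate. That makes it a reasonable basis for deciding when a newly learned skill has earned enough trust for the fast path.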


This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available through the source link.
