Claude Code's Self-Evolution Technique – v3 Verification Complete
hackernews
💼 Business
#claude
#claude code
#tip
#v3 verification
#dev tips
#design patterns
#self-evolution
Summary
Verification of v3 of the Self-Evolving Skill for Claude Code has been completed successfully. The update reportedly aims at autonomous improvement of code-generation capability and higher accuracy. With verification wrapped up, stable rollout of the related features is expected, with follow-up performance evaluations to come.
Why It Matters
Developer perspective
Under review
Researcher perspective
Under review
Business perspective
Under review
Full Text
A design pattern for Claude Code Skills that improve through use — growing more accurate and efficient over time, without bloating.

> **Note** — Academic positioning: this pattern corresponds to *Inter-test-time Context Evolution with Text-Feedback Governance* in the self-evolving agent literature. See Gao et al. (2026), "A Survey of Self-Evolving Agents."

Traditional Skills are static — an author packages them once, users invoke them repeatedly, and knowledge never grows. But in domains like database investigation, codebase analysis, and business system integration, an AI continuously discovers valuable domain knowledge during use — table relationships, query patterns, business rules, data characteristics. Without a way to persist this knowledge, every new session starts from zero, wasting both effort and context window.

Is this pattern right for your use case? Ask two questions:

- Will domain knowledge grow through use?
- Does that growth have a natural ceiling?

If both answers are yes, this pattern fits.

```
skill-name/
├── SKILL.md              # Trigger conditions + governance protocol
├── scripts/              # Execution tools
│   ├── core/             # Computation layer (decay model)
│   │   ├── formulas.py   # Atomic formulas
│   │   ├── models.py     # Composite models + config
│   │   └── parser.py     # Decay tag parser
│   ├── decay_engine.py   # CLI: init / scan / feedback / reset / inject / search
│   └── *.py              # Domain-specific tools
└── references/           # Living knowledge base (AI-maintained)
    ├── _index.md         # Routing table
    └── *.md              # Topic files with decay-tagged entries
```

The reference implementation is a Self-Evolving Skill for MySQL database investigation. Install it to see the pattern in action on your own database.

Prerequisites:

- An AI coding agent (Claude Code, Cursor, Windsurf, Codex, etc.)
- Node.js ≥ 18 and Python ≥ 3.8

```
pip install pymysql
```

macOS / Linux:

```
npx skills add 191341025/Self-Evolving-Skill --skill db-investigator
```

Windows:

```
npx skills add 191341025/Self-Evolving-Skill --skill db-investigator --copy -y
```

`--copy` bypasses Windows symlink permission issues; `-y` skips interactive agent selection.

Run setup.py from the installed skill directory:

```
# Find your agent's skill path (one of these will exist):
#   .claude/skills/  .cursor/skills/  .windsurf/skills/  .continue/skills/
python /skills/db-investigator/scripts/setup.py
```

The interactive wizard collects your MySQL connection details, tests the connection, and initializes the knowledge system.

Tip: Or just start a conversation and ask a database question — if unconfigured, the skill will tell you exactly what to run.

Start a Claude Code conversation and ask any database question. The skill activates automatically and begins evolving its domain knowledge through use.

The Five Gates governance protocol is the core of the pattern. It prevents the knowledge base from degrading into noise.

Gate 1 — VALUE
Q: Can this knowledge be reused across sessions?
→ One-time result (e.g., "query returned 42 rows at 3pm") → REJECT
→ Reusable pattern or stable fact → PASS

Gate 2 — ALIGNMENT
Q: Does this contradict existing knowledge?
→ Contradiction found → CORRECT the existing entry (don't append)
→ Consistent → PASS

Gate 3 — REDUNDANCY
Q: Does this already exist, possibly worded differently?
→ Exists → MERGE into existing entry, or skip
→ Doesn't exist → PASS

Gate 4 — FRESHNESS (write)
Classify knowledge type and attach decay metadata:
→ Tag the entry with its confirmation time and initial confidence C0=1.0
→ Six types: schema | business_rule | tool_experience | query_pattern | data_range | data_snapshot
→ High-decay types (data_range, data_snapshot): prefer rejection

Gate 4 — FRESHNESS (read)
Run a confidence scan before using knowledge:
→ The tool computes C(t) based on time elapsed and feedback history
→ TRUST (C ≥ 0.8): use directly
→ VERIFY (0.5 ≤ C < 0.8): use but flag for verification
→ REVALIDATE (C < 0.5): verify with tools first

Gate 4 — FRESHNESS (feedback)
After operations that used knowledge:
→ Success → record positive feedback (slows future decay)
→ Failure → record negative feedback (accelerates decay)
→ After revalidation passes → reset to fresh state

Gate 5 — PLACEMENT
Q: Which file does this belong in? Which memory layer?
→ Existing topic → add to that file
→ New topic → only create a new file if 3+ related entries exist; update _index.md

The most common outcome of the Five Gates is: do nothing. Most interactions don't produce knowledge worth storing. The protocol's primary job is to reject, not to accept.

| Capability | Mechanism |
|---|---|
| Add knowledge | Must pass all five gates |
| Correct errors | Gate 2 detects contradictions; fix in place |
| Deduplicate | Gate 3 merges rather than appends |
| Expire stale data | Gate 4 confidence decay model; tool-computed freshness with Bayesian feedback |
| Maintain structure | Gate 5 + scaling rules control file granularity |

| Level | Loaded when | Content | Change frequency |
|---|---|---|---|
| Level 1: frontmatter | Always, in system prompt | "When to use this Skill, how to behave" | Rarely changes |
| Level 2: body | When Claude judges the task is relevant | "Which to
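The FRESHNESS gate's confidence scan can be sketched as a small decay model. This is a minimal illustration, not the skill's actual `scripts/core/formulas.py`: the half-life table, feedback multipliers, and function names here are assumptions; only the six knowledge types and the TRUST / VERIFY / REVALIDATE thresholds come from the protocol above.

```python
from datetime import datetime, timedelta

# Hypothetical half-lives (days) per knowledge type: stable schema
# facts decay slowly, point-in-time snapshots decay fast. The real
# skill's constants may differ.
HALF_LIFE_DAYS = {
    "schema": 365,
    "business_rule": 180,
    "tool_experience": 120,
    "query_pattern": 90,
    "data_range": 14,
    "data_snapshot": 3,
}

def confidence(knowledge_type, confirmed_at, now, c0=1.0,
               positive_feedback=0, negative_feedback=0):
    """Confidence C(t) in [0, 1] for a decay-tagged entry.

    Positive feedback stretches the half-life (slows decay);
    negative feedback shrinks it (accelerates decay). The
    multipliers below are illustrative assumptions.
    """
    half_life = HALF_LIFE_DAYS[knowledge_type]
    half_life *= 1.5 ** positive_feedback   # success slows decay
    half_life /= 2.0 ** negative_feedback   # failure speeds decay
    age_days = (now - confirmed_at).total_seconds() / 86400
    return c0 * 0.5 ** (age_days / half_life)

def action(c):
    """Map confidence to the read-gate policy."""
    if c >= 0.8:
        return "TRUST"        # use directly
    if c >= 0.5:
        return "VERIFY"       # use but flag for verification
    return "REVALIDATE"       # verify with tools first

now = datetime(2025, 1, 1)
month_old = now - timedelta(days=30)
print(action(confidence("schema", month_old, now)))         # → TRUST
print(action(confidence("data_snapshot", month_old, now)))  # → REVALIDATE
```

The split by knowledge type is what lets the same gate keep a year-old table schema in TRUST while pushing a month-old row-count snapshot straight to REVALIDATE.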