Show HN: Aura – Desktop AI Orchestration IDE with a Planner/Worker Architecture
hackernews
🔧 Dev Tools
#ai models
#anthropic
#claude
#gemini
#gpt-4
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
Aura, a desktop AI orchestration IDE, mimics pair programming: it understands your whole project and edits code or proposes changes. It supports multiple AI backends, including OpenAI and DeepSeek, and improves safety with a diff approval step before any change is applied. Its two-agent system - a Planner that handles planning and a Worker that executes the actual code changes - automates everything from writing a technical spec to editing and validating files.
Full Text
**Desktop AI Orchestration IDE - pair programming with full workspace awareness.**

Aura is a desktop chat application where you talk to an AI agent that reads your project, searches your codebase, proposes changes, and applies them with diff approval. It supports DeepSeek, OpenAI, Anthropic, Google Gemini, and OpenRouter as AI backends, with a local Ollama vision model for screenshot preprocessing. Built with PySide6 (Qt for Python).

Demo: a full Planner -> Worker cycle - spec writing, dispatch, code editing with diff approval, and auto-commit.

Contents:

- Screenshots
- Features
- Planner / Worker Architecture
- Comprehensive Tools Suite
- Diff Approval & Backups
- Git Integration
- Web Research
- Terminal Commands
- Vision Preprocessing
- Dynamic / Self-Extending Tools
- Sandbox Execution
- Hardware-Tethered API Key Encryption
- Codebase Index (BM25 Semantic Search)
- Session Cost Tracking
- Thinking Modes
- Custom System Prompts
- Separate Worker Temperature
- Read-Only Mode
- Auto-Dispatch & Auto-Approve
- Conversation Persistence
- Keyboard Shortcuts & Slash Commands
- Cross-Platform
- Supported Providers
- Installation
- First Launch Checklist
- Usage
- Configuration
- Safety Model
- Known Limitations
- Architecture
- Project Structure
- Development
- Dependencies
- License

### Screenshots

Left: main interface with a three-pane layout - workspace tree, chat view, and worker activity panel. Right: diff approval dialog - every file change is reviewed before being applied.

### Planner / Worker Architecture

Aura uses a two-agent system inspired by pair programming:

- **Planner** - Reads targeted parts of your codebase, asks clarifying questions when needed, gives a short plan summary, and writes a precise technical specification for the Worker.
- **Worker** - Executes the specification with read/write filesystem access. It reads target files, applies edits, runs validation commands, and reports back a summary.

Both agents can use different models and different reasoning depths from the same provider.
For example, use a fast/cheap model for the Planner and a more capable model for the Worker. The Planner is tuned for speed: it keeps visible planning brief and puts the important implementation detail into the dispatch_to_worker spec. The Spec Edit dialog lets you modify that spec before handing it to the Worker, giving you full control over what gets implemented and how.

### Comprehensive Tools Suite

The AI has a rich set of tools, all sandboxed to your workspace root (the AI cannot escape the project directory). Tools are grouped by category.

Read & search:

| Tool | Description |
|---|---|
| read_file | Read a UTF-8 text file from the workspace (capped at 200 KB) |
| list_directory | List files and subdirectories (hides .git, __pycache__, .venv, node_modules) |
| glob | Recursively find files matching a glob pattern (capped at 200 results) |
| read_file_outline | Read a file's structural outline - class names, function signatures, imports - via AST (Python) or heuristics (other languages) |
| grep_search | Search file contents with string or regex matching |
| find_usages | Find all usages of a symbol across the workspace using word-boundary matching - safe for refactoring |
| search_codebase | BM25 semantic search - ranks entire files by relevance to a natural-language query. Uses a local inverted index over up to 1500 files (128 KB each, 30+ extensions). Perfect for rediscovering files/functions when conversation context has been pruned. |

Write & edit:

| Tool | Description |
|---|---|
| write_file | Create or overwrite a file. Triggers diff approval and automatic backup. |
| edit_file | Surgically replace code via a Search Block (context + target code). Uses fuzzy matching - minor whitespace, indentation, or newline differences are tolerated. Triggers diff approval and backup. |
| edit_symbol | AST-based structured editing for Python files. Replace a named function, class, or method by specifying its name - finds the symbol in the parsed AST and replaces its entire definition. No whitespace-matching issues. Supports function, class, and method (with class_name). |

Git:

| Tool | Description |
|---|---|
| git_status | Show working tree status - current branch, remote tracking info, staged/unstaged/untracked files |
| git_diff | Show diff of unstaged or staged changes |
| git_log | Show recent commit history (with optional file filter) |
| git_show | Show the full diff and metadata for a specific commit |
| git_log_file | Show commit history for a single file, following renames |
| git_branch_list | List all local branches with tracking information |
| git_stash_list | List all stashes |
| git_stash_show | Show the diff of a specific stash |

Web research:

| Tool | Description |
|---|---|
| web_search | Search the web via Tavily. Returns top results with snippets. |
| web_fetch | Fetch and parse the content of a specific URL using BeautifulSoup |
| run_research | Dispatches a background sub-agent that autonomously searches the web (Tavily) and scrapes pages (BeautifulSoup) to produce a synthesized report. Ideal for looking up documentation, debugging unfamiliar errors, or researching libraries. |

Terminal:

| Tool | Description |
|---|---|
| run_terminal_command | Execute shell commands in your workspace with real-time streaming output, cancellation support, and configurable timeout. The AI is instructed to run linters, type checkers, and test suites after making changes. |

Progress tracking:

| Tool | Description |
|---|---|
| update_todo_list | Maintains a live progress tracker with pending -> active -> done statuses. Displayed in the Worker Activity panel for real-time visibility. |

Agent handoff:

| Tool | Description |
|---|---|
| dispatch_to_worker | Planner-only. Hands off a spec to the Worker for execution. Only available when Planner/Worker mode is enabled. |

The conversation loop includes a circuit breaker that detects when the same tool call produces the identical failure output 3 or more times consecutively. A warning is injected into the tool result, alerting the AI that it is likely in a loop - preventing infinite retry cycles.
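The circuit-breaker behaviour described above (flagging a tool call that keeps returning an identical failure) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aura's actual implementation; the class and method names are invented:

```python
class CircuitBreaker:
    """Detects consecutive repeats of the same (tool, output) pair,
    as the README describes for Aura's conversation loop.
    Illustrative sketch only - names and threshold are assumptions."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.last: tuple[str, str] | None = None  # previous (tool, output)
        self.count = 0                             # consecutive repeats

    def record(self, tool_name: str, output: str) -> bool:
        """Register one tool result; return True once the identical
        result has occurred `threshold` or more times in a row."""
        key = (tool_name, output)
        if key == self.last:
            self.count += 1
        else:
            self.last = key
            self.count = 1
        return self.count >= self.threshold
```

When `record()` returns True, a warning string would be appended to the tool result so the model sees it is likely looping.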
### Diff Approval & Backups

Every write_file, edit_file, or edit_symbol call triggers a diff approval dialog before any bytes touch disk:

- **Approve** - Apply this change
- **Reject** - Skip this change
- **Approve All** - Approve this and all subsequent writes in this turn
- **Reject All** - Reject this and all further writes

Before any write, existing files are automatically backed up to .aura/backups// in your workspace. Every backup carries an ISO-8601 timestamp, so you can always recover previous versions.

### Git Integration

If your workspace is a git repository, Aura provides deep integration:

- **Auto-commit** - After the Worker completes a set of file changes, Aura stages and commits them with an AI-generated commit message derived from the dispatch goal and Worker summary.
- **/undo slash command** - Soft-resets HEAD~1, reverting the last commit while keeping changes in the working directory.
- **git_init** - Initialises a new git repository in the workspace if one doesn't exist.
- **Snapshot / Restore** - snapshot() creates a lightweight checkpoint commit; restore_to_snapshot() returns to it. Useful for checkpointing experimental changes.
- **Full git tool access** - Both the Planner and Worker can inspect repository state before and after changes using the complete git tool suite (status, diff, log, show, branch, stash).
- **Automatic .gitignore** - .aura/ is automatically added to .gitignore on startup.

### Web Research

The run_research tool dispatches a background sub-agent that:

- Generates search queries from your question
- Searches the web via Tavily
- Fetches and parses the most relevant pages with BeautifulSoup
- Produces a synthesised report

Perfect for looking up documentation, debugging unfamiliar error messages, or researching third-party libraries without leaving your IDE.
### Terminal Commands

run_terminal_command executes shell commands in your workspace directory with:

- **Real-time streaming output** - See output as it's produced, not just when the command finishes
- **Cancellation support** - Stop long-running commands mid-execution
- **Timeout** - Configurable max execution time

Unlike a regular terminal, the AI is instructed to run linters, type checkers, and test suites after making changes - closing the loop between edit and validation.

### Vision Preprocessing

Paste screenshots (Ctrl+V) or drag-and-drop images into the chat. The input panel handles both clipboard paste and file drag-and-drop. A local Ollama vision model (llama3.2-vision) describes images in detail so the AI can reason about visual content - error dialogs, UI glitches, diagrams, and more. Additionally, providers/models that natively support vision (GPT-4o, Claude, Gemini Flash) can receive images directly, bypassing local preprocessing.

### Sandbox Execution

Aura supports three execution modes for terminal commands and dynamic tools:

| Mode | Description |
|---|---|
| host | Run directly on the host (no isolation). Default. |
| docker | Run inside a Docker container with resource limits: 2 GB memory, 2 CPU cores, PID limit of 200, all Linux capabilities dropped, no-new-privileges enabled. Dynamic tools run with a read-only root filesystem; terminal commands run read-write. No access to the host filesystem outside the workspace. Network is enabled for terminal commands but disabled for dynamic tools. |
| wasm | Reserved for future WASM runtime. |

Configurable via the Settings dialog. Aura checks Docker availability at startup and falls back gracefully if not found.

### Hardware-Tethered API Key Encryption

API keys are never stored in config.json. Instead:

- **Environment variables take precedence** - standard DEEPSEEK_API_KEY, OPENAI_API_KEY, etc.
- **Encrypted storage** - Keys can be stored on disk at ~/.config/Aura/keys.json, encrypted with Fernet (symmetric encryption) using a machine-derived key (MAC address + username). File permissions are set to 0o600.
- **Auto-migration** - Legacy plaintext keys are automatically migrated to encrypted form on first access.

Keys stored via the Settings dialog (gear icon) are encrypted immediately. The Settings status indicator shows green when a key is found, red when missing.

### Codebase Index (BM25 Semantic Search)

The search_codebase tool builds and queries a local BM25 inverted index over your workspace files:

- Up to 1,500 files, each up to 128 KB
- 30+ file extensions covered: Python, JavaScript, TypeScript, Rust, Go, Java, C/C++, Ruby, PHP, Swift, Kotlin, Scala, YAML, JSON, TOML, Markdown, HTML, CSS, SQL, Lua, Zig, and more
- Ranks files by semantic relevance to a natural-language query - not keyword matching

This is especially valuable when conversation context has been pruned and the AI needs to rediscover where a particular function or class lives.

### Session Cost Tracking

The status bar displays live token usage and estimated cost:

- Cache hit tokens / cache miss tokens / output tokens
- Estimated USD cost using embedded pricing tables per model
- Resets per conversation session

Pricing is tracked per-model using rates in aura/config.py - both for built-in models and dynamically fetched ones (especially via OpenRouter, which returns real-time pricing).

### Thinking Modes

Choose reasoning depth for each agent independently:

| Mode | Description |
|---|---|
| Off | Standard response - no extended reasoning |
| High | Extended reasoning - the model spends more compute before responding |
| Max | Maximum reasoning - best for complex architectural decisions or tricky bugs |

Configure separately for Planner, Worker, and Single (non-Planner/Worker) modes. Works with models that support extended thinking (e.g., DeepSeek R1, Claude Sonnet). Defaults are intentionally split for responsiveness and quality:

- Planner Thinking defaults to Off so planning and spec creation feel snappy.
- Worker Thinking defaults to High so implementation work gets more reasoning budget.
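The BM25 ranking that search_codebase performs can be illustrated with a small self-contained scorer. This is the textbook Okapi BM25 formula, not Aura's code; its tokenizer, parameter values, and on-disk index format are unknown:

```python
import math
import re
from collections import Counter


def bm25_rank(docs: dict[str, str], query: str,
              k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank document names by Okapi BM25 relevance to `query`.
    k1 and b are the usual default parameters; a real index would
    precompute term statistics instead of rescanning every query."""
    tokenize = lambda s: re.findall(r"\w+", s.lower())
    tfs = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n = len(docs)
    avgdl = sum(sum(tf.values()) for tf in tfs.values()) / n  # avg doc length
    scores: dict[str, float] = {}
    for name, tf in tfs.items():
        dl = sum(tf.values())
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for t in tfs.values() if term in t)  # document frequency
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            f = tf[term]  # term frequency in this document (0 if absent)
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avgdl))
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)
```

The inverted-index part of the real tool would map each term to the files containing it, so only candidate files are scored.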
### Custom System Prompts

Configure separate system prompts via the Settings dialog:

- **System Prompt** - Used in Single mode (default conversation)
- **Planner System Prompt** - Used for the Planner agent
- **Worker System Prompt** - Used for the Worker agent

Tailor each agent's behaviour, style, constraints, and persona to your workflow.

### Separate Worker Temperature

- Worker temperature: defaults to 0.1 - deterministic and consistent when applying code changes
- Planner / Single temperature: defaults to 0.7 - creative when reasoning about architecture

Both are configurable in the Settings dialog (range 0.0-2.0).

### Read-Only Mode

Toggle the Read-Only button in the toolbar to strip all write tools. The AI can still read, search, and advise, but cannot modify any files. Perfect for:

- Code review sessions
- Exploring an unfamiliar codebase
- Asking questions without risk of unintended modifications

### Auto-Dispatch & Auto-Approve

Optional settings for faster workflows:

- **Auto-Dispatch** - Skips the spec review dialog; the spec is dispatched to the Worker immediately after the Planner writes it.
- **Auto-Approve** - Skips the diff approval dialog; all file writes are applied automatically.

Toggle these from the toolbar or Settings dialog. In the toolbar, a blue/bold label means the toggle is enabled; a dim label means it is disabled. Use with caution - these are best for trusted, low-risk changes.

### Conversation Persistence

- Chats are saved to .aura/conversations/ as JSON files
- Restore last session on launch (configurable)
- Open past conversations from the toolbar
- Start fresh at any time via the "New Conversation" button

### Keyboard Shortcuts & Slash Commands

| Shortcut | Action |
|---|---|
| Ctrl+Enter | Send message |
| Ctrl+V (in editor) | Paste image from clipboard |

| Command | Description |
|---|---|
| /undo | Soft-resets the last git commit (requires a git repo). Quickly revert the AI's last change. |

### Cross-Platform

Aura runs on Windows, macOS, and Linux via PySide6 (Qt for Python). The same interface, the same features, everywhere.

### Supported Providers

Aura supports five AI providers.
You choose one per session via the toolbar dropdown, then select any model from that provider's catalogue. The Planner and Worker always use the same provider but can be assigned different models and thinking modes.

| Provider | Base URL | Env Var |
|---|---|---|
| DeepSeek | https://api.deepseek.com | DEEPSEEK_API_KEY |
| OpenAI | https://api.openai.com/v1 | OPENAI_API_KEY |
| Google Gemini | https://generativelanguage.googleapis.com/v1beta/openai/ | GEMINI_API_KEY |
| Anthropic | https://api.anthropic.com/v1 | ANTHROPIC_API_KEY |
| OpenRouter | https://openrouter.ai/api/v1 | OPENROUTER_API_KEY |

Aura can dynamically fetch models from provider APIs:

- **OpenRouter** - Returns the full model catalogue with real-time pricing per model. Models are automatically added to the selection dropdown with up-to-date pricing.
- **DeepSeek, OpenAI, Google** - Uses the OpenAI-compatible /models endpoint. Pricing for recognised models is drawn from the embedded pricing tables; unknown models default to $0.

Fetched models are cached to disk (~/.config/Aura/models_cache.json) and reloaded on startup, so you don't need to fetch every launch.

Tip: Model availability and pricing change frequently. Run a fetch (via the provider dropdown menu) to refresh the catalogue. For the latest pricing, also check each provider's official documentation.

### Installation

- Python 3.10 or later
- An API key for at least one supported provider (see API Key Setup)
- (Optional) Ollama running locally with llama3.2-vision for screenshot preprocessing
- (Optional) Git for auto-commit and /undo support
- (Optional) Docker
This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.