Show HN: King Louie – Desktop AI with 20 agent tools, no cloud required
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

King Louie, an open-source, cross-platform AI chat desktop app, has been released. It supports 14+ LLM providers, including OpenAI, Anthropic, Google Gemini, Groq, Mistral, and Ollama, and offers rule-based routing driven by keywords, regular expressions, and slash commands, as well as AI-powered automatic model selection. Agents can use 20+ built-in tools such as Bash, Git, WebSearch, Browser, SpawnAgent, and BackgroundTask to edit files, search the web, automate browsers, spawn sub-agents, and run background tasks, and Anthropic caching can cut input token costs by 50–90%. It also supports Telegram, Discord, and Slack integrations, automatic system app detection, a workflow engine, cron scheduling, and voice TTS, among other features.

Full Text
An open-source, cross-platform AI chat desktop app. Bring your own API keys. Chat with any LLM. Run agents. Connect bots to Telegram, Discord, and Slack.

Just pick your platform and run the installer. That's it.

| Platform | Link | What to do |
|---|---|---|
| Windows | Download .exe installer | Run the .exe → click "Install" → done |
| macOS | Download .dmg | Open the .dmg → drag to Applications → done |
| Linux | Download .AppImage | chmod +x the file → double-click → done |

Don't see your platform? Check the all downloads page for .deb, other architectures, and older versions. On first launch, the onboarding wizard walks you through selecting a provider and entering your API key.

- Multi-Provider LLM Support — OpenAI, Anthropic, Google Gemini, Groq, Mistral, Ollama (local), OpenRouter, x.AI, DeepSeek, Qwen, Together, Fireworks, and Cohere
- Smart LLM Routing — Rule-based dynamic model selection routes messages to different providers based on keywords, regex patterns, or slash-command prefixes
- LLM-Powered Model Router — AI-driven model selection that automatically picks the best provider/model per task based on message content, cost, speed, and quality preferences
- Prompt Caching — Anthropic cache_control blocks on system prompts reduce input token costs by 50–90% on multi-turn conversations, with cache-aware cost tracking
- Extended Thinking — Claude 3.7+ models can use extended thinking with configurable budget tokens for deeper reasoning on complex tasks
- Agentic Tool Use — Agents can execute shell commands, read/write/edit files, search the web, automate browsers, and more
- Agent Streaming — Real-time token-by-token streaming during agent loop iterations instead of waiting for the full response
- Workflow Engine — Durable, multi-step workflow execution with pause/resume, parallel task execution, dependency ordering, and persistent state across sessions
- Planner Agent — Decomposes high-level goals ("build a REST API with auth and tests") into structured task graphs that the workflow engine executes automatically
- Dynamic Sub-Agents — Agents can spawn specialized sub-agents mid-execution to handle subtasks with different models, tools, or system prompts
- Background Tasks — Spawn agent tasks that run asynchronously in the background while the main conversation continues, with output logging and status checking
- Worktree Isolation — Background tasks can run in isolated git worktrees to prevent file conflicts with the main workspace
- Advisor Mode — Optional second-model code review that automatically reviews agent-generated code changes for bugs, security issues, and performance problems
- Multi-Agent Orchestration — Run agents in parallel, serial, or dependency-based workflows
- 20+ Built-in Tools — Bash, Read, Write, Edit, MultiEdit, Glob, Grep, Git, WebSearch, WebFetch, Browser, ToolSearch, BackgroundTask, TaskStatus, SpawnAgent, and more
- MultiEdit — Batch edit multiple files in a single tool call with cascading failure isolation and per-file content caching
- Deferred Tool Loading — Core tools load inline; others are deferred behind a ToolSearch meta-tool that supports keyword search and exact selection — stabilizes prompts for caching
- MCP Support — Model Context Protocol client with stdio transport connects to any MCP server, automatically registering its tools into the tool registry
- Git Safety Guards — Blocks --amend, --force, --no-verify, interactive flags, git add ., and sensitive file patterns (.env, credentials, keys)
- Structured Diffs — Edit and Write tools generate unified diffs with line stats, displayed as colored diff blocks in the UI
- Git Context Injection — Current branch, working tree status, and recent commits are automatically injected into the system prompt
- API-Native Context Compaction — Clears old tool result content when approaching token limits (Anthropic provider), with embedding-based fallback for other providers
- Semantic Context Assembly — Dynamic per-turn tool and prompt section selection via embeddings, cutting token overhead by 30–60%
- Result Persistence — Oversized tool results are persisted to disk with a preview marker; the model can Read the full file if needed
- System App Discovery — Auto-detects installed desktop applications (Excel, Photoshop, VS Code, etc.) so agents use local software instead of generating content via LLM
- Extensible Skill System — Install, remove, enable, and pin custom skill plugins
- Mesh Networking — Secure peer-to-peer communication between King Louie instances across machines
- Channel Integrations — Bridge conversations to Telegram, Discord, and Slack bots
- Cron Scheduling — Schedule recurring or one-time agent tasks with cron expressions
- Semantic Memory — Embedding-based memory with hot/warm/cold tiering and recall
- Voice / TTS — System TTS or ElevenLabs for voice responses
- Webhooks — HTTP endpoints for external automation triggers
- Command Palette — Ctrl+K opens a searchable command palette for quick access to all actions, commands, and settings
- Keyboard Shortcuts — Ctrl+N (new chat), Ctrl+L (clear input), Ctrl+, (settings), Ctrl+Shift+E (export)
- Chat Search — Real-time search box in the sidebar filters chats by title and preview text
- Thinking Indicator — Animated "Thinking..." appears while waiting for the LLM to respond, replaced when streaming begins
- Agent Progress Bar — Shows iteration count, current tool, and elapsed time during agent execution
- Copy Code Button — One-click copy button on code blocks with language badge and "Copied!" feedback
- Diff Display — Edit/Write tool results render as syntax-highlighted colored diffs instead of raw JSON
- Retry on Error — Failed messages show a "Retry" button to resend without retyping
- Markdown Export — Export conversations as readable Markdown with collapsible tool results
- Welcome Card — First-run quick-start tips for new users (API key setup, agent mode, commands)
- Dark Theme UI — Two-pane chat interface with syntax highlighting, markdown rendering, and image attachments
- Cross-Platform Builds — Windows (NSIS), macOS (DMG), and Linux (AppImage/DEB) via GitHub Actions

To run from source:

```
npm install
npm start
```

| Provider | Models | Local |
|---|---|---|
| OpenAI | GPT-4o, GPT-5, o1, o3-mini, etc. | No |
| Anthropic | Claude Sonnet 4, Opus, Haiku, etc. | No |
| Google Gemini | Gemini 2.0 Flash, 2.5 Pro, etc. | No |
| Groq | Llama, Mixtral (ultra-fast inference) | No |
| Mistral | Mistral Large, etc. | No |
| Ollama | Any Ollama-hosted model | Yes |
| OpenRouter | Multi-provider router | No |
| x.AI | Grok 3, Grok 3 Mini | No |
| DeepSeek | DeepSeek Chat | No |
| Qwen | Qwen Plus | No |
| Together AI | Llama, open-source models | No |
| Fireworks AI | Llama, fast inference | No |
| Cohere | Command R+ | No |

Configure providers and API keys in Settings.

King Louie can automatically route messages to different LLM providers based on configurable rules. Instead of manually switching providers, define rules once and let the router pick the best model for each task.
- Go to Settings > Smart Routing
- Toggle Enable smart routing on
- Add rules — each rule has a condition (what to match) and a target (which provider/model to use)
- Rules are evaluated in priority order; the first match wins
- If no rule matches, the standard inference tier is used as a fallback

| Type | Description | Example |
|---|---|---|
| Keyword | Case-insensitive substring match (comma-separated, OR logic) | documentation, write docs |
| Regex | Regular expression test against the message | \b(refactor\|redesign)\b |
| Prefix | Slash-command at the start of the message (prefix is stripped before sending to the LLM) | /code |

| Rule Name | Condition | Target |
|---|---|---|
| Design with Claude | Keywords: design, architect, plan feature | Anthropic / claude-sonnet-4 |
| Docs with GPT | Keywords: documentation, write docs, readme | OpenAI / gpt-4o-mini |
| Code prefix | Prefix: /code | OpenAI / gpt-4o |
| Agent-only coding | Keywords: implement, build (agent mode only) | Anthropic / claude-sonnet-4 |

With these rules, typing "design a new auth system" automatically routes to Claude, while "write docs for the API" goes to GPT-4o-mini. Typing /code implement a parser routes to GPT-4o with the /code prefix stripped from the prompt.

- Priority — Reorder rules with up/down arrows; lower position = higher priority
- Enabled — Toggle individual rules on/off without deleting them
- Agent mode only — Rule only applies when agent mode is active

Beyond rule-based routing, King Louie can use AI to automatically select the best model for each task.
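The evaluation order described above (priority order, first match wins, prefix stripped before sending) can be sketched as follows. This is a minimal illustration; the `Rule` type and `route` helper are assumptions, not King Louie's actual code.

```typescript
// Illustrative rule shape: a condition plus a provider/model target.
type Rule = {
  name: string;
  condition:
    | { kind: "keyword"; keywords: string[] }  // case-insensitive, OR logic
    | { kind: "regex"; pattern: RegExp }
    | { kind: "prefix"; prefix: string };      // stripped before the LLM sees it
  target: { provider: string; model: string };
};

// Rules are checked in priority order; the first match wins.
// Returning null means "fall back to the standard inference tier".
function route(rules: Rule[], message: string):
    { target: Rule["target"]; prompt: string } | null {
  for (const rule of rules) {
    const c = rule.condition;
    if (c.kind === "keyword" &&
        c.keywords.some(k => message.toLowerCase().includes(k.toLowerCase()))) {
      return { target: rule.target, prompt: message };
    }
    if (c.kind === "regex" && c.pattern.test(message)) {
      return { target: rule.target, prompt: message };
    }
    if (c.kind === "prefix" && message.startsWith(c.prefix)) {
      // Strip the slash-command prefix from the prompt.
      return { target: rule.target, prompt: message.slice(c.prefix.length).trim() };
    }
  }
  return null;
}

const rules: Rule[] = [
  { name: "Design with Claude",
    condition: { kind: "keyword", keywords: ["design", "architect"] },
    target: { provider: "anthropic", model: "claude-sonnet-4" } },
  { name: "Code prefix",
    condition: { kind: "prefix", prefix: "/code" },
    target: { provider: "openai", model: "gpt-4o" } },
];

console.log(route(rules, "/code implement a parser")?.prompt); // "implement a parser"
```

Because the list is scanned top to bottom, reordering rules in the UI directly changes which one fires first.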
- Go to Settings > Workflows
- Enable LLM-Powered Routing
- Configure your preferences:
  - Cost Sensitivity — Low (prefer quality), Medium, or High (prefer cheap)
  - Speed Priority — Low, Medium, or High (prefer fast)
  - Quality Priority — Low, Medium, or High
- A fast, cheap classifier model analyzes each message and picks the best provider/model from your configured providers

The router maintains a cache of recent classifications to avoid redundant API calls. It falls back to rule-based routing or tier defaults if classification fails.

| Approach | Best For |
|---|---|
| Tier-based | Simple setups — one model for everything |
| Rule-based (Smart Routing) | Predictable patterns — always route /code to GPT-4o |
| LLM-powered | Dynamic workloads — let AI decide based on task content |

All three can coexist: LLM routing is tried first, then rule-based, then tier defaults.

King Louie includes a durable workflow engine for executing complex, multi-step goals. Instead of manually breaking work into prompts, describe the outcome you want and let the system figure out the steps.

- Go to Settings > Workflows
- Enter a goal (e.g., "Build a REST API for user authentication with tests and documentation")
- Click Plan & Execute

The system:

- Runs the Planner Agent to decompose the goal into a structured task graph
- Creates a durable workflow with dependency ordering and parallel groups
- Executes tasks through the appropriate agents (code-writer, code-explorer, main)
- Streams progress events to the UI in real-time

| Status | Meaning |
|---|---|
| pending | Created but not started |
| running | Tasks are being executed |
| paused | Execution suspended — can be resumed |
| completed | All tasks finished successfully |
| failed | A critical task failed |
| cancelled | Manually cancelled by user |

Workflows persist to disk and survive app restarts. A workflow that was running when the app closed will resume as paused on next launch.
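The three-layer coexistence described above (LLM routing first, then rules, then tier defaults) amounts to a fallback chain. A minimal sketch, where all function names and the injected classifiers are illustrative placeholders:

```typescript
type Target = { provider: string; model: string };

// Try LLM classification first; if it is disabled, returns nothing, or
// throws, fall through to rule-based routing; finally use the tier default.
function pickModel(
  message: string,
  llmClassify: (m: string) => Target | null, // may fail or be disabled
  ruleRoute: (m: string) => Target | null,   // first matching rule, or null
  tierDefault: Target,
): Target {
  try {
    const t = llmClassify(message);
    if (t) return t;
  } catch {
    // classification failed; continue down the chain
  }
  return ruleRoute(message) ?? tierDefault;
}

const fallback: Target = { provider: "openai", model: "gpt-4o-mini" };

// Classifier unavailable and no rule matches: the tier default wins.
const picked = pickModel(
  "summarize this file",
  () => { throw new Error("classifier unavailable"); },
  () => null,
  fallback,
);
console.log(picked.model); // "gpt-4o-mini"
```

The same shape also explains why the layers are independent toggles: disabling one simply makes its function return null.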
The planner outputs a JSON task graph with:

- Tasks — Each with a title, description, assigned agent, and priority
- Dependencies — Tasks only run after their dependencies complete
- Parallel groups — Independent tasks execute concurrently
- Model preferences — Tasks can suggest specific models (e.g., Gemini for research, Opus for deep reasoning)

From the Workflows panel, you can:

- Resume a paused workflow
- Pause a running workflow
- Cancel a workflow entirely
- Delete a workflow and its saved state
- Plan Only — Generate the task graph without executing it

Agents can spawn specialized sub-agents mid-execution using the SpawnAgent tool. This enables recursive problem-solving — when an agent hits a subtask that needs different capabilities, it creates a new agent for it.

User: "Refactor the auth module and update the docs"

Main Agent:
→ SpawnAgent(agentId: "code-writer", task: "Refactor auth module to use JWT")
→ SpawnAgent(agentId: "code-writer", task: "Update API documentation to reflect auth changes")

| Parameter | Description |
|---|---|
| task | The instruction for the sub-agent (required) |
| agentId | Which agent to use: main, code-explorer, code-writer, planner |
| model | Override the model (e.g., gpt-4o, claude-sonnet-4-20250514) |
| provider | Override the provider (e.g., openai, anthropic, gemini) |
| maxIterations | Max tool iterations (default: 10) |
| systemPromptAppend | Additional instructions for the sub-agent |
| tools | Restrict to specific tools (e.g., ["Read", "Grep"]) |

Sub-agents run in their own conversation context and return results inline to the parent agent.

King Louie auto-detects installed desktop applications on your system and makes them available to agents. When a task can be done with local software (creating a spreadsheet, editing an image), agents will use the installed app instead of trying to generate content through the LLM.
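The dependency ordering and parallel groups the planner emits can be derived mechanically from such a graph: tasks whose dependencies are all complete run together. The task shape below is an assumed simplification for illustration, not the app's actual schema.

```typescript
// Simplified planner task: id, assigned agent, and dependency ids.
type Task = { id: string; agent: string; deps: string[] };

// Level-by-level topological grouping: each returned group contains tasks
// whose dependencies are all satisfied, so the group can run in parallel.
function parallelGroups(tasks: Task[]): string[][] {
  const done = new Set<string>();
  const groups: string[][] = [];
  let remaining = [...tasks];
  while (remaining.length > 0) {
    const ready = remaining.filter(t => t.deps.every(d => done.has(d)));
    if (ready.length === 0) throw new Error("cycle in task graph");
    groups.push(ready.map(t => t.id));
    ready.forEach(t => done.add(t.id));
    remaining = remaining.filter(t => !done.has(t.id));
  }
  return groups;
}

// Hypothetical graph for "build a REST API with auth and tests":
const graph: Task[] = [
  { id: "scaffold", agent: "code-writer", deps: [] },
  { id: "auth",     agent: "code-writer", deps: ["scaffold"] },
  { id: "tests",    agent: "code-writer", deps: ["auth"] },
  { id: "docs",     agent: "main",        deps: ["scaffold"] },
];

console.log(JSON.stringify(parallelGroups(graph)));
// [["scaffold"],["auth","docs"],["tests"]]
```

Note how "auth" and "docs" end up in the same group: both depend only on "scaffold", so they are independent and can execute concurrently.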
The discovery engine checks platform-specific locations:

| Platform | Detection Method |
|---|---|
| Windows | PATH, Registry (COM), Start Menu, Program Files |
| macOS | /Applications, Spotlight (mdfind), PATH |
| Linux | .desktop files, PATH |

Categories include office (Excel, Word, LibreOffice), development (VS Code, Cursor), browsers, graphics (Photoshop, GIMP, Figma), media (OBS, VLC, FFmpeg), communication (Slack, Discord, Teams, Zoom), and more.

Go to Settings > System Apps to:

- View all discovered apps grouped by category with their launch commands
- Re-scan to refresh after installing new software
- Add custom apps for software in non-standard locations
- Remove custom app entries

Custom apps persist across restarts and are merged with auto-detected apps in the agent system prompt.

The discovered app list is injected into every agent's system prompt. When you ask "create a spreadsheet of Q1 sales data", the agent will:

- Generate the .xlsx file content using a library
- Launch Excel (or whatever spreadsheet app is installed) to open it

instead of trying to render a table in chat or generating a CSV via the LLM.
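The per-platform probing in the table above might look roughly like the sketch below, using Node's fs. The directories and filters are simplified assumptions; the Windows Registry, Start Menu, and Spotlight probes from the table are omitted here.

```typescript
import * as fs from "fs";
import * as path from "path";

// Rough, illustrative app discovery: one probe per platform.
function discoverApps(): string[] {
  const found: string[] = [];
  if (process.platform === "darwin") {
    // macOS: application bundles under /Applications.
    for (const entry of fs.readdirSync("/Applications")) {
      if (entry.endsWith(".app")) found.push(entry.replace(/\.app$/, ""));
    }
  } else if (process.platform === "linux") {
    // Linux: .desktop launcher files.
    const dir = "/usr/share/applications";
    if (fs.existsSync(dir)) {
      for (const entry of fs.readdirSync(dir)) {
        if (entry.endsWith(".desktop")) found.push(entry.replace(/\.desktop$/, ""));
      }
    }
  } else if (process.platform === "win32") {
    // Windows: executables reachable via PATH (a small subset of the
    // real detection surface).
    for (const dir of (process.env.PATH ?? "").split(path.delimiter)) {
      if (!fs.existsSync(dir)) continue;
      for (const entry of fs.readdirSync(dir)) {
        if (entry.toLowerCase().endsWith(".exe")) {
          found.push(entry.replace(/\.exe$/i, ""));
        }
      }
    }
  }
  return found;
}

console.log(discoverApps().length, "apps discovered");
```

A list like this, serialized into the system prompt, is what lets the agent decide "open this in Excel" rather than emitting a table of text.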
Agents have access to a suite of tools that can be individually approved or auto-approved:

| Tool | Description |
|---|---|
| Bash | Execute shell commands (platform-aware) |
| Read | Read file contents |
| Write | Create or overwrite files (generates diff for overwrites) |
| Edit | Exact string replacement in files (generates unified diff) |
| MultiEdit | Batch edit multiple files in a single call with cascading failure isolation |
| Grep | Regex content search across files |
| Glob | File pattern matching |
| Git | Git operations with safety guards (blocks --amend, --force, --no-verify, sensitive files) |
| WebSearch | Search the web (Brave, Tavily) |
| WebFetch | Fetch and parse web pages |
| Browser | Headless browser automation via CDP |
| ToolSearch | Search and load deferred tool schemas on demand (keyword, exact, or prefix match) |
| SpawnAgent | Dynamically spawn sub-agents with different models, tools, or specializations |
| BackgroundTask | Spawn agent tasks that run asynchronously with optional worktree isolation |
| TaskStatus | Check status, read output, list, or stop background tasks |
| Cron | Manage scheduled tasks |
| RemoteDispatch | Dispatch tasks to remote King Louie peers on the mesh network |
| AskUser | Request user input during execution |
| Skill | Invoke installed skill plugins |
| RequestTools | Legacy escape hatch for requesting additional tools mid-conversation |

By default, the browser tool launches with an isolated…
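The ToolSearch row above hints at how deferred tool loading keeps prompts stable for caching: only core tool schemas ship inline with every request, and the rest are found on demand by keyword. A minimal sketch, with an illustrative registry and field names that are assumptions, not the app's actual types:

```typescript
// Illustrative tool registry entry.
type ToolDef = { name: string; description: string; core: boolean };

const registry: ToolDef[] = [
  { name: "Bash",    description: "Execute shell commands",              core: true },
  { name: "Read",    description: "Read file contents",                  core: true },
  { name: "Browser", description: "Headless browser automation via CDP", core: false },
  { name: "Cron",    description: "Manage scheduled tasks",              core: false },
];

// Core tools are always in the prompt; because this list never changes
// between turns, the prompt prefix stays cacheable.
const inlineTools = registry.filter(t => t.core).map(t => t.name);

// Deferred tools are located on demand by case-insensitive keyword match
// over name and description (the exact/prefix modes are omitted here).
function toolSearch(query: string): string[] {
  const q = query.toLowerCase();
  return registry
    .filter(t => !t.core)
    .filter(t => t.name.toLowerCase().includes(q) ||
                 t.description.toLowerCase().includes(q))
    .map(t => t.name);
}

console.log(inlineTools);           // ["Bash","Read"]
console.log(toolSearch("browser")); // ["Browser"]
```

Only when the agent actually calls ToolSearch does the matching tool's full schema get loaded into the conversation, so rarely used tools cost no tokens until needed.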
This analysis was written by the Genesis Park editorial team with the help of AI. The original article is available via the source link.