Browser control and computer use via MCP tools: works with Claude, Codex, and Cursor

hackernews | 📦 Open source
#ai deal #claude #gemini #mcp #browser control #open source #computer use
Original source: hackernews · Summarized and analyzed by Genesis Park

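Summary

The open-source project 'Talon' gives Claude Code the ability to control a web browser and the macOS desktop: it automates page navigation and form handling through a real Chrome browser, and it can operate the mouse and keyboard. The plugin installs in about 60 seconds without separate API keys and can be used for web scraping, QA testing, desktop automation, and similar tasks. Because it is built on the MCP standard, it is also compatible with other AI coding tools such as Cursor and Codex, and it works with a Chrome extension so that users can monitor and interact with the work in real time.

Body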

Give Claude Code eyes and hands: control your browser and your computer. Open source, no API keys, works in 60 seconds.

    # macOS / Linux
    curl -fsSL https://raw.githubusercontent.com/gettalon/talon-plugins/master/scripts/setup.sh | bash

    # Windows (PowerShell)
    irm https://raw.githubusercontent.com/gettalon/talon-plugins/master/scripts/setup.ps1 | iex

Each tool gets MCP + Skills automatically:

| Tool | MCP (browser) | Skills | Config |
|---|---|---|---|
| Codex | Yes | ~/.agents/skills/ | ~/.codex/config.toml |
| Cursor | Yes | ~/.agents/skills/ | ~/.cursor/mcp.json |
| Windsurf | Yes | ~/.agents/skills/ | ~/.windsurf/mcp.json |
| Gemini CLI | Yes | ~/.gemini/commands/ | ~/.gemini/settings.json |
| Claude Code | Yes | Full plugin marketplace | ~/.claude/plugins/ |

What gets installed:

- MCP: talon-browser (Chrome DevTools Protocol, 15 browser tools)
- Skills: gitlab-scrum, gitlab-sprint, gitlab-board, gitlab-wiki, ai-dispatch, autoresearch
- Claude Code extra: computer-use, plugin marketplace with all plugins

Then try it:

    Navigate to https://example.com and tell me what's on the page
    Take a screenshot of the current tab
    Click the "Learn more" link

Stop there. You'll know if this is for you.

| Plugin | What | Install |
|---|---|---|
| browser-control | 15 MCP tools for Chrome: read, click, fill, screenshot | /plugin install browser-control@gettalon-talon-plugins |
| computer-use | macOS desktop automation: mouse, keyboard, windows | /plugin install computer-use@gettalon-talon-plugins |
| ai-dispatch | Route tasks to 7 AI backends (Doubao, DeepSeek, GLM...) | /plugin install ai-dispatch@gettalon-talon-plugins |
| gitlab-scrum | GitLab Scrum: issues, sprints, boards, wiki via glab | /plugin install gitlab-scrum@gettalon-talon-plugins |
| channels | 22 channel adapters: WebSocket, Telegram, Discord, Slack, etc. | /plugin install hub@gettalon-talon-plugins |
| autoresearch | Autonomous edit-test-measure loop | /plugin install autoresearch@gettalon-talon-plugins |

Claude sees your browser, reads pages, fills forms, clicks buttons, takes screenshots, all through real Chrome DevTools Protocol. Not headless, not simulated. Your actual Chrome.

| Tool | What it does |
|---|---|
| `browser_navigate` | Go to URLs, back/forward |
| `browser_click` | Click by selector, text, or accessibility ref |
| `browser_type` | Fill inputs, type text, keyboard shortcuts |
| `browser_read_page` | Page info, accessibility snapshot, extract content |
| `browser_screenshot` | Capture page or elements (compressed JPEG) |
| `browser_execute_js` | Run JavaScript in the page |
| `browser_tabs` | List, open, close, switch tabs |
| `browser_scroll` | Scroll, hover, drag and drop |
| `browser_network` | Monitor requests, set headers, go offline |
| `browser_console` | Read console logs and errors |
| `browser_emulate` | Device emulation, viewport, geolocation |
| `browser_performance` | Performance traces, Lighthouse audit, memory |
| `browser_form` | Fill entire forms, upload files, handle dialogs |
| `browser_inspect` | Highlight elements, box model, cookies |
| `browser_wait` | Wait for elements, network idle, page stable |

Claude controls your Mac: move mouse, click, type, press keys, take screenshots, manage windows. Native Quartz events, not accessibility hacks.

- Mouse: move, click, double-click, right-click, drag
- Keyboard: type text, press keys, keyboard shortcuts
- Screenshots: capture screen, specific windows, or regions
- Windows: list, focus, resize, move windows
- Clipboard: read and write clipboard content
- System: display info, process list, volume control

Route tasks to the best AI model. Doubao, DeepSeek, Kimi, MiniMax, GLM: all through one dispatch command.

    /plugin install ai-dispatch@gettalon-talon-plugins

    dispatch ark-code "review this code"          # Doubao Seed 2.0 Code
    dispatch ark-minimax "analyze this"           # MiniMax M2.5
    dispatch glm "translate to Chinese"           # GLM-5
    dispatch ark-deepseek "complex reasoning"     # DeepSeek V3.2
    dispatch ark-kimi "long context analysis"     # Kimi K2.5

| Backend | Model | Best for |
|---|---|---|
| `ark-code` | Doubao Seed 2.0 Code | Code generation, review |
| `ark-pro` | Doubao Seed 2.0 Pro | General reasoning |
| `ark-minimax` | MiniMax M2.5 | Analysis, research |
| `ark-kimi` | Kimi K2.5 | Long context |
| `ark-deepseek` | DeepSeek V3.2 | Complex reasoning |
| `glm` | GLM-5 | Chinese language |
| `ark-auto` | Auto routing | Smart model selection |

Full Scrum/Kanban workflow for GitLab (issues, sprints, boards, wiki), all from the terminal.

    /plugin install gitlab-scrum@gettalon-talon-plugins

    # Sprint planning
    /gitlab-sprint plan Sprint 2026-W14

    # Issue management
    /gitlab-scrum create issue "Implement feature X" --label "To Do"

    # Board management
    /gitlab-board setup

    # Wiki with Mermaid diagrams
    /gitlab-wiki create "Architecture" with sequence diagram

| Skill | What it does |
|---|---|
| `/gitlab-scrum` | Issues, labels, milestones: core CRUD |
| `/gitlab-sprint` | Sprint lifecycle: create, populate, track, close |
| `/gitlab-board` | Kanban board setup and issue movement |
| `/gitlab-wiki` | Wiki p
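
For readers wondering what an MCP client actually sends when it uses one of the browser tools listed earlier, the following is a minimal sketch of the request shape. MCP tool invocations are JSON-RPC 2.0 messages with the method `tools/call`; the tool names are copied from the browser tool table, but the helper `build_tool_call` and the `url` argument name are assumptions made for illustration. The excerpt does not document each tool's argument schema, which a real client would discover through `tools/list`.

```python
import json
from itertools import count

# Tool names copied from the browser tool table in the excerpt.
BROWSER_TOOLS = {
    "browser_navigate", "browser_click", "browser_type",
    "browser_read_page", "browser_screenshot", "browser_execute_js",
    "browser_tabs", "browser_scroll", "browser_network",
    "browser_console", "browser_emulate", "browser_performance",
    "browser_form", "browser_inspect", "browser_wait",
}

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request (hypothetical helper)."""
    if tool not in BROWSER_TOOLS:
        raise ValueError(f"unknown browser tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


if __name__ == "__main__":
    # "url" is an assumed argument name, not confirmed by the excerpt.
    request = build_tool_call("browser_navigate", {"url": "https://example.com"})
    print(json.dumps(request, indent=2))
```

In practice the AI tool (Claude Code, Cursor, Codex, and so on) builds and sends these messages itself; the sketch only shows the envelope, not the transport or the server's response.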

This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available through the source link.
