VisionClaude: open-source AI vision for iPhone and Meta Ray-Ban glasses
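Source: Hacker News · Summary and analysis by Genesis Park
Summary
The open-source project VisionClaude has been released, letting users give the AI model Claude sight through an iPhone or Meta Ray-Ban smart glasses. The tool analyzes what the camera captures in real time and supports voice conversation. Through a macOS gateway server it connects to MCP servers, so it can control external tools such as email and calendar. It also supports on-device speech recognition and high-quality JPEG transmission for a smooth hands-free experience.
Body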
VisionClaude

Let Claude see the world through your eyes

Built by @mrdulasolutions

VisionClaude turns your iPhone or Meta Ray-Ban Smart Glasses into Claude's eyes and ears. Your phone connects directly to your Claude Code session: speak naturally, and Claude sees what you see, responds with voice, and uses ALL your MCP tools and skills.

    iPhone/Glasses ──→ Channel Plugin ──→ Claude Code (Opus)
    (camera+voice)      (WebSocket)       ALL your MCP tools
                                          ALL your skills
                                          Full Cowork session

| Requirement | How to Get It |
|---|---|
| macOS 13+ | You probably have this |
| Node.js 18+ | `brew install node` |
| Bun | `curl -fsSL https://bun.sh/install \| bash` |
| Xcode 15+ | Mac App Store |
| iPhone (iOS 17+) | Physical device, USB cable |
| Claude Code CLI | `npm install -g @anthropic-ai/claude-code` |

    git clone https://github.com/mrdulasolutions/visionclaude.git
    cd visionclaude/ClaudeVision
    ./setup.sh

The interactive installer handles dependencies, API keys, and Xcode project generation.

Add VisionClaude as an MCP server in your project's `.mcp.json`:

    {
      "mcpServers": {
        "visionclaude": {
          "command": "bun",
          "args": ["run", "/path/to/visionclaude/ClaudeVision/channel/server.ts"]
        }
      }
    }

Then launch Claude Code with the channel enabled:

    claude --dangerously-load-development-channels "server:visionclaude"

Open the dashboard in your browser:

    http://localhost:18790

You'll see your Channel Token, your Mac's IP address, and copy buttons for both. The dashboard also lets you send messages to your phone, configure ElevenLabs TTS, and monitor activity.

Open the VisionClaude app on your iPhone, go to Settings, and enter:

| Setting | Value |
|---|---|
| Host | Your Mac's IP (shown on dashboard) |
| Port | 18790 |
| Channel Token | Copy from dashboard |

Tap Connect. You should see a green status indicator, which means you're now talking directly to your Claude Code session.

Point your camera at something and say "What am I looking at?" and Claude describes it. Say "Email this to my team" and Claude uses your email MCP tool. Every tool and skill in your Cowork session is available through voice.
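The `.mcp.json` step above can be scripted when a project already lists other MCP servers. The following is a minimal sketch, not part of VisionClaude: `addVisionClaude`, `McpConfig`, and `McpServerEntry` are hypothetical names, and the checkout path is a placeholder. Only the `bun run …/ClaudeVision/channel/server.ts` entry comes from the README.

```typescript
// Shape of a project-level .mcp.json, limited to the fields the README shows.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a copy of the config with a "visionclaude"
// entry added, leaving any other configured servers untouched.
function addVisionClaude(existing: McpConfig, checkoutDir: string): McpConfig {
  const dir = checkoutDir.replace(/\/+$/, ""); // drop trailing slashes
  return {
    mcpServers: {
      ...existing.mcpServers,
      visionclaude: {
        command: "bun",
        args: ["run", `${dir}/ClaudeVision/channel/server.ts`],
      },
    },
  };
}

// Placeholder path; substitute the directory where the repo was cloned.
const merged = addVisionClaude({ mcpServers: {} }, "/path/to/visionclaude/");
console.log(JSON.stringify(merged, null, 2));
```

Writing the printed JSON to `.mcp.json` is left out on purpose so the sketch has no file-system side effects.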
Camera:

- iPhone camera: 1920x1080 (1080p) @ 30fps, continuous autofocus
- Meta Ray-Ban glasses: 1280x720 (720p) @ 30fps via DAT SDK
- High-performance CADisplayLink renderer (smooth video, not snapshots)
- 85% JPEG quality for accurate text/brand/object identification

Voice:

- STT: Apple Speech Recognition (on-device, privacy-first)
- TTS: ElevenLabs Flash v2.5 with 10 selectable voices, or Apple TTS fallback
- Tap-to-interrupt: stop Claude mid-sentence
- Bluetooth mic routing for hands-free glasses operation
- Configurable from the web dashboard or iOS app settings

Dashboard:

- Retro terminal UI with live status monitoring
- Auto-detects and displays your Mac's IP
- One-click copy for token, IP, and all settings
- Send messages to your phone from your Mac
- Configure ElevenLabs API key directly
- Activity log showing all inbound/outbound messages
- Live client connection count

Security:

- Shared secret token: auto-generated, required for all connections
- Token stored at `~/.claude/channels/visionclaude/.channel-token` (owner-only permissions)
- Health endpoint is public; everything else requires auth
- Override with the `VISIONCLAUDE_TOKEN=your-custom-token` env var

By default, Claude Code prompts for approval on every action from the phone.
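The token rules in the security list can be sketched as two small functions. This is an illustration under stated assumptions, not VisionClaude's actual code: `resolveToken`, `isAllowed`, and `constantTimeEqual` are hypothetical names, the `/health` route name is assumed, and the lookup order (env var, then stored file, then a freshly generated token) is a guess at how "auto-generated" and "override" fit together. File access and random generation are passed in as callbacks so the sketch stays free of platform imports.

```typescript
type Env = Record<string, string | undefined>;

interface TokenSources {
  env: Env;
  // Assumed to read ~/.claude/channels/visionclaude/.channel-token,
  // returning null when the file does not exist.
  readTokenFile: () => string | null;
  // Assumed to return a fresh random secret.
  generate: () => string;
}

type TokenOrigin = "env" | "file" | "generated";

// Hypothetical resolution order: VISIONCLAUDE_TOKEN wins, then the stored
// token file, then a newly generated token.
function resolveToken(src: TokenSources): { token: string; origin: TokenOrigin } {
  const fromEnv = src.env["VISIONCLAUDE_TOKEN"]?.trim();
  if (fromEnv) return { token: fromEnv, origin: "env" };
  const fromFile = src.readTokenFile()?.trim();
  if (fromFile) return { token: fromFile, origin: "file" };
  return { token: src.generate(), origin: "generated" };
}

// Compares two strings without returning early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// The health route is public; every other route needs the shared token.
function isAllowed(path: string, presented: string | undefined, expected: string): boolean {
  if (path === "/health") return true; // assumed route name
  return presented !== undefined && constantTimeEqual(presented, expected);
}

const resolved = resolveToken({
  env: { VISIONCLAUDE_TOKEN: "your-custom-token" },
  readTokenFile: () => "token-from-file",
  generate: () => "freshly-generated",
});
console.log(resolved.origin); // "env"
```

A real server would also write a generated token back to disk with owner-only permissions, as the list describes; that step is omitted here.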
Choose your comfort level for auto-approving actions from the phone:

Replies only (safest). Add to `.claude/settings.local.json`:

    {
      "permissions": {
        "allow": [
          "mcp__visionclaude__reply",
          "mcp__visionclaude__edit_message",
          "Read(~/.claude/channels/visionclaude/**)"
        ]
      }
    }

All VisionClaude tools (convenient):

    claude --dangerously-load-development-channels "server:visionclaude" \
      --allowedTools "mcp__visionclaude__*"

Full hands-free (use with care: skips ALL prompts):

    claude --dangerously-load-development-channels "server:visionclaude" \
      --permission-mode bypassPermissions

If you don't use Claude Code, the standalone gateway server works with just an Anthropic API key:

    cd ClaudeVision/server
    cp .env.example .env  # Add your ANTHROPIC_API_KEY
    npm install && npm run build && npm start

Gateway Mode auto-discovers MCP servers from your Claude Desktop config and skills from your local repos. It uses the Claude API directly instead of going through Claude Code.

Edit ~/Library/Application Support/Claude/claude_de
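For the "replies only" tier, the three allow rules may need to be merged into a settings file that already has other permissions. The following is a minimal sketch under that assumption: `withRepliesOnly` and `ClaudeSettings` are hypothetical names, and only the three rule strings come from the README.

```typescript
// Loose model of .claude/settings.local.json; unknown keys are preserved.
interface ClaudeSettings {
  permissions?: { allow?: string[]; [key: string]: unknown };
  [key: string]: unknown;
}

// The three rules the README lists for the "replies only" tier.
const REPLIES_ONLY: string[] = [
  "mcp__visionclaude__reply",
  "mcp__visionclaude__edit_message",
  "Read(~/.claude/channels/visionclaude/**)",
];

// Hypothetical helper: returns a copy of the settings with the rules
// appended to permissions.allow, skipping any that are already present.
function withRepliesOnly(settings: ClaudeSettings): ClaudeSettings {
  const current = settings.permissions?.allow ?? [];
  const missing = REPLIES_ONLY.filter((rule) => !current.includes(rule));
  return {
    ...settings,
    permissions: { ...settings.permissions, allow: [...current, ...missing] },
  };
}

const updated = withRepliesOnly({
  permissions: { allow: ["mcp__visionclaude__reply"] },
});
console.log(JSON.stringify(updated, null, 2));
```

Because existing rules are kept and duplicates are skipped, running the helper twice yields the same allow list.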
This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.