Show HN: MacParakeet – Local voice dictation and transcription for Mac (GPL-3.0)
hackernews · 📦 Open source
Tags: #ai-deal #anthropic #llama #mac #neural-engine #openai #open-source #voice-transcription
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
MacParakeet, a local voice dictation and transcription tool for Mac, has been released. Its distinguishing feature is that all data is processed on-device, strengthening privacy, and it is free software under the GPL-3.0 open-source license.
Full text
Fast, local-first voice app for Mac. Free and open-source. There are many voice transcription/dictation apps, but this one is mine. MacParakeet runs NVIDIA's Parakeet TDT on Apple's Neural Engine via FluidAudio CoreML. Press a hotkey, speak, text appears. Or drag a file and get a full transcript. All speech recognition happens on your Mac.

- Dictation — Press a hotkey in any app, speak, text gets pasted. Hold for push-to-talk, double-tap for persistent recording. Works system-wide.
- File transcription — Drag audio or video files, or paste a YouTube URL. Full transcript with word-level timestamps, speaker labels, and export to 7 formats (TXT, Markdown, SRT, VTT, DOCX, PDF, JSON).
- Text cleanup — Filler word removal, custom word replacements, text snippets with triggers. Deterministic pipeline, no LLM needed.
- AI features — Optional transcript summarization and chat via your own API keys (OpenAI, Anthropic, Ollama, OpenRouter). Entirely opt-in.

Performance and limitations:

- ~155x realtime — 60 min of audio in ~23 seconds
- ~2.5% word error rate (Parakeet TDT 0.6B-v3)
- ~66 MB working memory during inference
- 25 European languages with auto-detection
- Apple Silicon only (M1/M2/M3/M4)
- Best with English — supports 25 European languages, but accuracy varies
- No CJK language support (Korean, Japanese, Chinese, etc.)

Download: Grab the notarized DMG or visit macparakeet.com. Drag to Applications, done. First launch downloads the speech model (~6 GB). After that, dictation and transcription work fully offline.

Build from source:

```shell
git clone https://github.com/moona3k/macparakeet.git
cd macparakeet
swift test              # 976 tests
scripts/dev/run_app.sh  # build, sign, launch
```

The dev script creates a signed .app bundle so macOS grants mic and accessibility permissions. Set DEVELOPMENT_TEAM=YOUR_TEAM_ID if needed.
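The throughput claim above is easy to sanity-check with simple arithmetic: 60 minutes is 3,600 seconds, and 3,600 / 23 ≈ 157, which is consistent with the advertised ~155x realtime figure.

```python
# Sanity check of the advertised throughput (numbers taken from the post).
audio_seconds = 60 * 60          # 60 minutes of audio
wall_clock_seconds = 23          # reported processing time

realtime_factor = audio_seconds / wall_clock_seconds
print(f"{realtime_factor:.0f}x realtime")  # prints "157x realtime", in line with ~155x
```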
CLI:

```shell
swift run macparakeet-cli transcribe /path/to/audio.mp3
swift run macparakeet-cli models status
swift run macparakeet-cli history
```

| Layer | Choice |
|---|---|
| STT | Parakeet TDT 0.6B-v3 via FluidAudio CoreML (Neural Engine) |
| Language | Swift 6.0 + SwiftUI |
| Database | SQLite via GRDB |
| Auto-updates | Sparkle 2 |
| YouTube | yt-dlp |
| Platform | macOS 14.2+, Apple Silicon |

All speech recognition runs on the Neural Engine. Your audio never leaves your Mac.

- No cloud STT. The model runs on-device. No audio is transmitted.
- No accounts. No login, no email, no registration.
- Anonymous telemetry. Non-identifying usage analytics, opt-out in Settings. No persistent IDs, no IP storage, no content transmitted. Source code is right here — verify it yourself.
- Temp files cleaned up. Audio is deleted after transcription unless you save it.

What does use the network: AI Summary & Chat connects to LLM providers when you configure it with your own API keys. YouTube transcription downloads video via yt-dlp. Telemetry pings our server unless you opt out. Core dictation and transcription are fully offline.

Note: Builds from source also send telemetry by default. Opt out in Settings or set MACPARAKEET_TELEMETRY_URL to override.

Contributing:

- Report bugs — Open an issue
- Submit a PR — Fork, make changes, run swift test, open a PR
- Read the specs — Architecture decisions and feature specs live in spec/

For larger changes, open an issue first.

GPL-3.0. Free software. Full license. Made for people who think faster than they type.
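To illustrate how word-level timestamps map onto one of the supported export formats, here is a minimal sketch that turns hypothetical (word, start, end) tuples into SRT subtitle cues. The grouping strategy and field layout are assumptions for illustration, not MacParakeet's actual export code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def words_to_srt(words, max_words=7):
    """Group word-level timestamps into numbered SRT cues."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(w for w, _, _ in chunk)
        cues.append(f"{len(cues) + 1}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(cues)

# Hypothetical word-level recognizer output: (word, start_sec, end_sec)
words = [("Press", 0.0, 0.3), ("a", 0.3, 0.4), ("hotkey,", 0.4, 0.9),
         ("speak,", 1.1, 1.6), ("text", 1.8, 2.1), ("appears.", 2.1, 2.7)]
print(words_to_srt(words, max_words=3))
```

The same per-word timing data would drive the VTT export as well; only the timestamp separator (a period instead of a comma) and the header differ.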
This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.