Show HN: Out Loud – open-source desktop TTS app for macOS/Windows/Linux
hackernews
📰 News
#ai deal
#llama
#openai
#tts
#open-source
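Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
'Out Loud' is an open-source desktop text-to-speech (TTS) app for macOS, Windows, and Linux, built on the high-performance Kokoro-82M model. It offers 50+ natural-sounding voices across 8 languages and runs 100% offline, so user data never leaves the machine. Menu-bar integration, web browser extensions, and a local HTTP API let it read text aloud in a variety of environments.
Body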
Free, open-source, 100% offline AI text-to-speech. A native desktop app for macOS, Windows, and Linux that reads text aloud with natural-sounding voices. Everything runs locally on your machine. No cloud, no accounts, no telemetry.

- What it does
- Screenshots
- Install
- Supported languages
- How it works
- API
- Repository layout
- Build from source
- Scripts
- Documentation
- Contributing
- License
- Credits

What it does
- Reads text aloud with 50+ natural voices across 8 languages
- Runs 100% offline. Nothing leaves your computer
- Built on Kokoro-82M, an open-weight TTS model ranked highly on the TTS Arena
- Desktop app with menu-bar / system-tray integration
- Chrome & Safari extensions to read any webpage in one click
- Local HTTP API on port 51730 for extensions and scripts

Screenshots
- Desktop app. Paste, pick a voice, hit play.
- System tray. Menu-bar access while you work.

Install
Grab the latest release for your platform from the Releases page:
- macOS: Out Loud-.dmg (Apple Silicon & Intel)
- Windows: Out Loud-.exe
- Linux: Out Loud-.AppImage

Or build it yourself. See Build from source.

Supported languages
English (US & UK), Japanese, Chinese (Mandarin), Spanish, Brazilian Portuguese, Italian, and Hindi. 50+ voices total.

How it works
The Electron app owns the ONNX model, runs TTS in a worker thread, and exposes a local HTTP API on port 51730 that the browser extensions talk to. Nothing goes to the network.

[Flowchart: the Out Loud desktop app (Electron) contains the Renderer UI (React + Vite), the Main process, and the TTS worker (ONNX + espeak-ng); the UI is linked to the Main process and the Main process to the TTS worker. The Main process is attached to HTTP :51730, which receives requests from the Chrome extension, the Safari extension, and your scripts / curl.]

API
The Electron app exposes a local HTTP API on 127.0.0.1:51730.
A few quick examples:

    # List voices
    curl http://127.0.0.1:51730/api/v1/audio/voices

    # Generate audio to a file
    curl -X POST http://127.0.0.1:51730/api/v1/audio/speech \
      -H "Content-Type: application/json" \
      -d '{"voice":"af_heart","input":"Hello, world."}' \
      --output hello.wav

Full reference: docs/app/api.md. Machine-readable OpenAPI 3.1 spec: docs/app/openapi.yaml. Also served live at http://127.0.0.1:51730/api/v1/openapi.yaml while the app is running.

Repository layout

    .
    ├── electron/            TTS engine + main process (TypeScript)
    ├── electron-ui/         Renderer UI (React + Vite + Tailwind)
    ├── chrome-extension/    Chrome / Chromium extension
    ├── safari-extension/    Safari extension (generated from chrome-extension)
    ├── tray-app/            Standalone menu-bar app variant
    ├── scripts/             Icon generation, dev server helpers
    ├── build-resources/     Icons, entitlements, DMG background for electron-builder
    └── docs/                Deeper docs (app, extensions, build)

Build from source
- Node.js >= 22 (LTS)
- npm >= 10
- macOS, Windows, or Linux

    git clone https://github.com/light-cloud-com/out-loud.git
    cd out-loud
    npm install
    npm run electron-ui:install

    # Electron app + hot-reloaded UI
    npm run electron:dev

    # Compile only the Electron main process
    npm run electron:compile

    # Run the UI in a browser (no Electron) for faster iteration
    npm run electron:browser

    npm run electron:build          # current platform
    npm run electron:build:mac      # macOS .dmg
    npm run electron:build:win      # Windows .exe
    npm run electron:build:linux    # Linux .AppImage

Output lands in releases//. See Output layout.

    npm run extension:chrome:pack       # zip the Chrome extension
    npm run extension:safari:convert    # convert to Safari (macOS, Xcode required)
    npm run extension:test              # smoke-test the local HTTP API

Output layout
All build artifacts land under releases/ (gitignored):

    releases/
    ├── macos/         *.dmg, *-mac.zip
    ├── windows/       *-Setup.exe
    ├── linux/         *.AppImage, *.deb
    └── extensions/
        ├── out-loud-chrome.zip
        └── safari/    Xcode project (generated)

For automated releases via GitHub Actions, see docs/build/releasing.md.
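The curl calls in the API section above can also be issued from a script. What follows is a minimal TypeScript sketch, not code from the repository: the two endpoints, the port, and the `voice` / `input` fields are taken from the README's curl examples, while the names `buildSpeechRequest`, `listVoices`, `synthesize`, and `HttpRequest` are hypothetical. It also assumes that the voices endpoint returns JSON and adds a client-side empty-input check that the README does not mention.

```typescript
// Minimal client sketch for the Out Loud local HTTP API.
// Endpoints and request body come from the README's curl examples;
// every function and type name below is hypothetical.
const BASE_URL = "http://127.0.0.1:51730/api/v1";

interface HttpRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the POST /audio/speech request without sending it, so the
// payload can be inspected or unit-tested while the app is not running.
function buildSpeechRequest(voice: string, input: string): HttpRequest {
  // Client-side guard only (an assumption, not documented API behavior).
  if (input.trim() === "") {
    throw new Error("input text must not be empty");
  }
  return {
    url: `${BASE_URL}/audio/speech`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ voice, input }),
    },
  };
}

// GET /audio/voices. The response schema is not shown in the README,
// so the parsed JSON (assumed) is returned as `unknown`.
async function listVoices(): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/audio/voices`);
  if (!res.ok) throw new Error(`voices request failed: HTTP ${res.status}`);
  return res.json();
}

// POST /audio/speech and return the raw response bytes.
async function synthesize(voice: string, input: string): Promise<Uint8Array> {
  const { url, init } = buildSpeechRequest(voice, input);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`speech request failed: HTTP ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}
```

In a runtime with a global `fetch` (Node.js 18 or later, for instance), the bytes returned by `synthesize("af_heart", "Hello, world.")` could be written to disk to mirror the `--output hello.wav` flag of the curl example. Both network functions need the desktop app to be running; docs/app/openapi.yaml remains the authoritative description of the request and response shapes.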
Scripts

| Script | What it does |
|---|---|
| npm run check | Lint + format check + knip + typecheck |
| npm run lint | ESLint (use lint:fix to auto-fix) |
| npm run fmt | Format with Prettier |
| npm run knip | Check for unused files, deps, and exports |
| npm test | Run Vitest unit tests |
| npm run electron:dev | Electron + UI dev server |
| npm run electron:build | Package for current platform |

Documentation
Deeper docs live in docs/:
- App: docs/app/architecture.md, docs/app/api.md, docs/app/voices.md, docs/app/openapi.yaml
- Extensions: docs/extensions/testing.md, chrome-extension/README.md, safari-extension/README.md
- Build: docs/build/releasing.md, docs/build/mac-app-store.md
- Contributing: CONTRIBUTING.md

Contributing
Issues, pull requests, translations, and new voices are all welcome. See CONTRIBUTING.md to get started.

License
MIT. See LICENSE. Bundled third-party components (Kokoro-82M, espeak-ng, onnxruntime-node, Electron, ffmpeg, fonts) retain their own licenses. See THIRD_PARTY_NOTICES.md for the full list.

Credits
- Kokoro-82M, Apache 2.0: TTS model
- espeak-ng: phonemization
- onnxruntime-node: on-device inference

Built by Light Cloud Labs and contributors.
This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available via the source link.