Show HN: Gemini can now embed video natively, so I built sub-second video search
📦 Open Source
#gemini
#video-processing
#video-search
#semantic-search
#hardware/semiconductors
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
SentrySearch uses Google's Gemini embedding model to vectorize dashcam footage in chunks, then lets you find and clip the scene you want from a text query alone. Without any separate subtitles or transcription step, it maps raw video pixels directly into the same vector space as text, delivering sub-second semantic search; indexing one hour of footage costs about $2.50. It also supports cost optimizations out of the box, such as skipping duplicate still frames, and handles any MP4 video, not just Tesla Sentry Mode files.
Full text
Semantic search over video footage. Type what you're looking for, get a trimmed clip back.

(demo video: demo.mp4)

SentrySearch splits your mp4 videos into overlapping chunks, embeds each chunk as video using either Google's Gemini Embedding API or a local Qwen3-VL model, and stores the vectors in a local ChromaDB database. When you search, your text query is embedded into the same vector space and matched against the stored video embeddings. The top match is automatically trimmed from the original file and saved as a clip.

- Install uv (if you don't have it):

  ```
  curl -LsSf https://astral.sh/uv/install.sh | sh            # macOS/Linux
  powershell -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
  ```

- Clone and install:

  ```
  git clone https://github.com/ssrajadh/sentrysearch.git
  cd sentrysearch
  uv tool install .
  ```

- Set up your API key (or use a local model instead):

  ```
  sentrysearch init
  ```

  This prompts for your Gemini API key, writes it to `.env`, and validates it with a test embedding.

- Index your footage:

  ```
  sentrysearch index /path/to/footage
  ```

- Search:

  ```
  sentrysearch search "red truck running a stop sign"
  ```

ffmpeg is required for video chunking and trimming. If you don't have it system-wide, the bundled imageio-ffmpeg is used automatically.

Manual setup: If you prefer not to use `sentrysearch init`, you can copy `.env.example` to `.env` and add your key from aistudio.google.com/apikey manually.

```
$ sentrysearch init
Enter your Gemini API key (get one at https://aistudio.google.com/apikey): ****
Validating API key...
Setup complete. You're ready to go — run `sentrysearch index` to get started.
```

If a key is already configured, you'll be asked whether to overwrite it.

Tip: Set a spending limit at aistudio.google.com/billing to prevent accidental overspending.

```
$ sentrysearch index /path/to/video/footage
Indexing file 1/3: front_2024-01-15_14-30.mp4 [chunk 1/4]
Indexing file 1/3: front_2024-01-15_14-30.mp4 [chunk 2/4]
...
Indexed 12 new chunks from 3 files. Total: 12 chunks from 3 files.
```
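The overlapping-chunk layout described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code; the function name is hypothetical, and the defaults mirror the CLI's 30-second chunks with 5 seconds of overlap:

```python
def chunk_boundaries(duration, chunk_duration=30.0, overlap=5.0):
    """Yield (start, end) windows covering `duration` seconds of video.

    Consecutive windows share `overlap` seconds, so an event that straddles
    a chunk boundary still appears whole in at least one chunk.
    """
    step = chunk_duration - overlap
    start = 0.0
    while start < duration:
        end = min(start + chunk_duration, duration)
        yield (start, end)
        if end >= duration:
            break
        start += step

# A 70-second clip with the default settings:
print(list(chunk_boundaries(70)))
# → [(0.0, 30.0), (25.0, 55.0), (50.0, 70.0)]
```

Each window would then be cut out with ffmpeg and embedded as its own vector, which is why a search result comes back as a time range like `@ 02:15-02:45` rather than a single frame.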
Options:

- `--chunk-duration 30` — seconds per chunk
- `--overlap 5` — overlap between chunks
- `--no-preprocess` — skip downscaling/frame rate reduction (send raw chunks)
- `--target-resolution 480` — target height in pixels for preprocessing
- `--target-fps 5` — target frame rate for preprocessing
- `--no-skip-still` — embed all chunks, even ones with no visual change
- `--backend local` — use a local model instead of Gemini (details below)

```
$ sentrysearch search "red truck running a stop sign"
#1 [0.87] front_2024-01-15_14-30.mp4 @ 02:15-02:45
#2 [0.74] left_2024-01-15_14-30.mp4 @ 02:10-02:40
#3 [0.61] front_2024-01-20_09-15.mp4 @ 00:30-01:00
Saved clip: ./match_front_2024-01-15_14-30_02m15s-02m45s.mp4
```

If the best result's similarity score is below the confidence threshold (default 0.41), you'll be prompted before trimming:

```
No confident match found (best score: 0.28). Show results anyway? [y/N]:
```

With `--no-trim`, low-confidence results are shown with a note instead of a prompt.

Options: `--results N`, `--output-dir DIR`, `--no-trim` to skip auto-trimming, `--threshold 0.5` to adjust the confidence cutoff, `--save-top N` to save the top N clips instead of just the best match.

Backend and model are auto-detected from the index — pass `--backend` or `--model` only to override.

Local backend: index and search using a local Qwen3-VL-Embedding model instead of the Gemini API. Free, private, and runs entirely on your machine. For the best search quality, use the Gemini backend — the local 8B model is a solid alternative when you need offline/private search, and the 2B model is a fallback when hardware can't support 8B. The model is auto-detected from your hardware — qwen8b for NVIDIA GPUs and Macs with 24 GB+ RAM, qwen2b for smaller Macs and CPU-only systems. You can override with `--model qwen2b` or `--model qwen8b`.
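The score-and-threshold behavior above can be illustrated with plain cosine similarity. This is only a sketch: the real tool stores and matches vectors via ChromaDB, and only the 0.41 default cutoff is taken from the text; the function names here are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def best_match(query_vec, chunks, threshold=0.41):
    """Return (chunk_id, score, confident) for the highest-scoring chunk.

    `chunks` maps chunk ids (file + time range) to stored video embeddings.
    `confident` is False when even the best score falls below the cutoff,
    which is when the CLI would ask "Show results anyway? [y/N]".
    """
    scored = [(cid, cosine_similarity(query_vec, v)) for cid, v in chunks.items()]
    cid, score = max(scored, key=lambda t: t[1])
    return cid, score, score >= threshold

# Toy 2-D embeddings standing in for real Gemini vectors:
query = [1.0, 0.0]
index = {
    "front.mp4 @ 02:15-02:45": [0.9, 0.1],
    "left.mp4 @ 02:10-02:40": [0.1, 0.9],
}
chunk_id, score, confident = best_match(query, index)
```

Raising `--threshold` toward 0.5 simply tightens the `score >= threshold` check, trading fewer false clips for more "no confident match" prompts.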
Pick an install based on your hardware:

| Hardware | Install command | Auto-detected model | Notes |
|---|---|---|---|
| Apple Silicon, 24 GB+ RAM | `uv tool install ".[local]"` | qwen8b | Full float16 via MPS |
| Apple Silicon, 16 GB RAM | `uv tool install ".[local]"` | qwen2b | 8B won't fit; 2B uses ~6 GB |
| Apple Silicon, 8 GB RAM | `uv tool install ".[local]"` | qwen2b | Tight — may swap under load; Gemini API recommended instead |
| NVIDIA, 18 GB+ VRAM | `uv tool install ".[local]"` | qwen8b | Full bf16 precision |
| NVIDIA, 8–16 GB VRAM | `uv tool install ".[local-quantized]"` | qwen8b | 4-bit quantization (~6–8 GB) |

Won't work well: Intel Macs and machines without a dedicated GPU. These fall back to CPU with float32 — too slow and memory-hungry for practical use. Use the Gemini API backend (the default) instead.

Not sure? On Mac, use `".[local]"`. On NVIDIA, use `".[local-quantized]"` — 4-bit quantization works on the widest range of NVIDIA hardware with minimal quality loss. (bitsandbytes requires CUDA and does not work on Mac/MPS.)

Mac prerequisite: install system FFmpeg (the local model's video processor requires it — the Gemini backend uses a bundled ffmpeg instead):

```
brew install ffmpeg
```

Index with `--backend local` and search — no extra
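The auto-detection rules in the table above amount to a small decision function. The sketch below is a hypothetical restatement of those rules for illustration; the project's real detection logic and function names may differ:

```python
def pick_local_model(platform, memory_gb, has_nvidia_gpu=False):
    """Choose a local embedding model per the hardware table.

    Returns "qwen8b", "qwen2b", or None when a local model is impractical
    (Intel Macs / CPU-only machines, where the README recommends the
    Gemini API backend instead).
    """
    if has_nvidia_gpu:
        # NVIDIA runs qwen8b either way: full bf16 with 18 GB+ VRAM,
        # or 4-bit quantized (the ".[local-quantized]" install) below that.
        return "qwen8b"
    if platform == "apple-silicon":
        # 24 GB+ unified memory fits the 8B model in float16 via MPS;
        # smaller Macs fall back to the ~6 GB 2B model.
        return "qwen8b" if memory_gb >= 24 else "qwen2b"
    # CPU-only float32 fallback is too slow and memory-hungry.
    return None
```

The `--model qwen2b` / `--model qwen8b` flags exist precisely to override this kind of guess when you know your hardware better than the heuristic does.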
This analysis was written by the Genesis Park editorial team with the help of AI. The original post is available via the source link.