Show HN: Generative Media Skills

hackernews | March 22, 2026 12:06 | 📦 Open Source

#ai deal #claude #claude code #gemini #openai #single api #media skills #generative ai #agents

Source: hackernews · Summarized and analyzed by Genesis Park

Summary

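each::labs has launched a single-API platform that brings together 431 AI models and exposes 562 generative media skills covering images, video, audio, and more. Users can install the Claude Code plugin or add individual skills to handle tasks such as logo design, background removal, face swapping, and song generation, and an OpenAI-compatible endpoint automatically selects the best model for each request. The service also ships use-case-specific skills across more than 15 domains, including marketing, gaming, and fashion, so developers can generate high-quality media content without building complex pipelines.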
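The 562 figure appears to be consistent with the counts listed in the body: 14 core skills, the per-domain skill counts, and one skill per model. The short tally below is only a sanity check; the counts are copied from the tables in the post, and the one-skill-per-model reading is an assumption based on the statement that every model has its own skill under models/.

```python
# Counts copied from the tables in the post.
CORE_SKILLS = 14  # rows in the core capabilities table

DOMAIN_SKILLS = {
    "Image": 15, "Video": 10, "Audio": 6, "Design": 14,
    "Face & Portrait": 7, "Social Media": 6, "E-commerce": 5,
    "Marketing": 5, "Gaming": 6, "Fashion": 5, "Real Estate": 5,
    "Photography": 5, "3D & AR": 4, "NFT & Art": 4, "Education": 4,
    "Architecture": 3, "Food & Beverage": 3, "Automotive": 3,
    "NSFW": 2, "Workflows": 5,
}

# Assumption: exactly one skill per model, for the 431 models mentioned.
MODEL_SKILLS = 431


def total_skills() -> int:
    """Sum core, domain, and per-model skill counts."""
    return CORE_SKILLS + sum(DOMAIN_SKILLS.values()) + MODEL_SKILLS


print(sum(DOMAIN_SKILLS.values()))  # domain-specific skills
print(total_skills())               # all skills combined
```

Under these assumptions the domain tables contribute 117 skills and the total comes to 562, matching the headline number.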

Body

562 generative media skills for AI agents, powered by each::labs. Generate images, videos, audio, 3D models, and more using 431 AI models through a single API.

Install as a Claude Code plugin and get instant access to every skill.

    # Claude Code plugin
    /plugin install awesome-genmedia

    # Or add individual skills
    npx skills add awesome-genmedia/skills@image-generation
    npx skills add awesome-genmedia/skills@flux-2-max
    npx skills add awesome-genmedia/skills@logo-design

- Sign up at eachlabs.ai
- Get your API key from Settings
- Set environment variable: export EACHLABS_API_KEY="your-api-key"

Example request:

    curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "X-API-Key: $EACHLABS_API_KEY" \
      -d '{
        "messages": [{"role": "user", "content": "Generate a professional headshot, studio lighting, neutral background"}],
        "stream": false
      }'

Core generative media capabilities at the root level:

| Skill | Description |
|---|---|
| image-generation | Generate images from text |
| image-editing | Edit images with natural language |
| image-upscaling | Enhance image resolution |
| background-removal | Remove image backgrounds |
| face-swap | Swap faces between photos |
| video-generation | Generate videos from text or images |
| video-editing | Edit videos with AI |
| song-generation | Generate songs with vocals |
| music-generation | Generate instrumental music |
| lyrics-generation | Generate song lyrics |
| voice-generation | Generate human-like voice audio |
| text-to-speech | Convert text to speech |
| speech-to-text | Transcribe audio to text |
| sound-effects | Generate custom sound effects |

Use-case specific skills organized by domain:

| Domain | Skills | Examples |
|---|---|---|
| Image | 15 | Headshots, avatars, QR codes, patterns, tattoos |
| Video | 10 | Text/image-to-video, music videos, trailers, loops |
| Audio | 6 | TTS, music, sound effects, voiceover, jingles |
| Design | 14 | Logos, thumbnails, posters, business cards, packaging |
| Face & Portrait | 7 | Face swap, aging, beauty, caricature, makeup |
| Social Media | 6 | Instagram, TikTok, Twitter, LinkedIn, Pinterest, YouTube |
| E-commerce | 5 | Product photos, mockups, lifestyle, video ads |
| Marketing | 5 | Ad creatives, brand kits, landing pages, campaigns |
| Gaming | 6 | Game assets, characters, environments, sprites, UI |
| Fashion | 5 | Fashion models, outfits, try-on, fabric patterns |
| Real Estate | 5 | Virtual staging, interior design, floor plans |
| Photography | 5 | Restoration, colorization, stock photos, HDR |
| 3D & AR | 4 | 3D models, textures, image-to-3D, AR filters |
| NFT & Art | 4 | NFT collections, pixel art, generative art |
| Education | 4 | Diagrams, flashcards, educational videos |
| Architecture | 3 | Building visualization, landscape, renders |
| Food & Beverage | 3 | Food photography, recipe visuals, menus |
| Automotive | 3 | Car configurator, vehicle wraps, auto ads |
| NSFW | 2 | Adult image and video generation |
| Workflows | 5 | Multi-model pipelines and batch processing |

Every AI model available on each::labs has its own skill under models/ :

- flux-2-max · flux-2-pro · flux-2 · flux-kontext-pro · flux-kontext-max · nano-banana-pro · nano-banana-2-text-to-image · gemini-3-pro-image-preview · imagen-4-fast · imagen4-preview · bytedance-seedream-v4-5-text-to-image · bytedance-seedream-v5-lite-text-to-image · kling-v3-text-to-image · gpt-image-v1-5-text-to-image · xai-grok-imagine-text-to-image · reve-text-to-image · ideogram-v3-turbo · stable-diffusion-3-5-large · and more...
- veo-3 · veo3-1-text-to-video · veo3-1-text-to-video-fast · kling-o3-pro-text-to-video · kling-v3-pro-text-to-video · sora-2-text-to-video-pro · pixverse-v5-6-text-to-video · wan-v2-6-text-to-video · runway-gen4-aleph · pika-v2-2-text-to-video · seedance-v1-5-pro-text-to-video · minimax-hailuo-v2-3-pro-text-to-video · and more...
- flux-2-edit · flux-2-max-edit · flux-fill-pro · eachlabs-bg-remover-v1 · topaz-upscale-image · kling-face-swap · nano-banana-pro-edit · qwen-ai-image-edit · firered-image-edit-v1-1 · and more...
- elevenlabs-text-to-speech · mureka-generate-song · mureka-generate-instrumental · mureka-generate-lyrics · stable-audio-2-5-text-to-audio · xai-grok-tts-text-to-speech · google-text-to-speech · deepgram-nova-3-speech-to-text · whisper · and more...
- topaz-upscale-video · auto-subtitle · heygen-video-translate · pixverse-lip-sync · merge-videos · ffmpeg-api-merge-audio-video · and more...
- bytedance-omnihuman-v1-5 · bytedance-dreamactor-v2 · kling-avatar-v2-pro · sync-lipsync-v2-pro · infinitalk-image-to-video · and more...

OpenAI-compatible endpoint that auto-selects the best model:

    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_EACHLABS_API_KEY",
        base_url="https://eachsense-agent.core.eachlabs.run/v1"
    )

    response = client.chat.completions.create(
        model="eachsense/beta",
        messages=[{"role": "user", "content": "Generate a logo for a coffee brand"}]
    )

Direct model access:

    curl -X P
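For readers who prefer not to shell out to curl, the chat completions request shown earlier in the post could be assembled with Python's standard library roughly as follows. This is a minimal sketch: the URL, header names, and payload fields come from the curl example, while the helper names `build_chat_request` and `send` are hypothetical, and the response is assumed to be a JSON object because the post does not show its schema.

```python
import json
import urllib.request

# Endpoint taken from the curl example in the post.
CHAT_URL = "https://eachsense-agent.core.eachlabs.run/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and parse the reply as JSON.

    Assumption: with "stream" set to false the endpoint returns a single
    JSON object; the post does not document the response fields.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_chat_request("Generate a professional headshot, studio lighting, neutral background", os.environ["EACHLABS_API_KEY"]))`, with `os` imported. Keeping request construction separate from sending makes it possible to inspect the payload without touching the network.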

View original (hackernews)

This analysis was written by the Genesis Park editorial team with the help of AI. The original post is available via the source link.
