Show HN: We built an integration for RL training of browser agents for everyone

hackernews | 📦 Open source
#browseragent #browserautomation #browserbase #gpt-4 #openai #review #rl #showhn
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

BrowserEnv is Verifiers' Browserbase integration, a browser automation tool that supports two execution modes: DOM and vision-based CUA. DOM mode suits structured websites, while CUA mode is optimized for visually complex pages where coordinate-based control is advantageous. Users must set the required API keys as process environment variables and manage secrets either locally or through the Environments Hub.

Body

BrowserEnv is Verifiers' Browserbase integration for browser automation tasks. It supports two execution modes:

- DOM mode (`mode="dom"`): natural-language actions through Stagehand (`act`, `observe`, `extract`)
- CUA mode (`mode="cua"`): vision-based coordinate actions (`click`, `type_text`, `scroll`, `screenshot`)

Use this integration when your environment needs real browser interaction during rollout.

**Installation**

From the verifiers repo (or a project using verifiers):

```bash
uv sync --extra browser
```

Or with pip / `uv pip`:

```bash
uv pip install -e ".[browser]"
```

When you publish an environment that uses BrowserEnv, list `verifiers[browser]` in that package's pyproject.toml dependencies so installs from the Environments Hub pull the extra.

**Required API keys**

Validate required variables early in `load_environment()` with `vf.ensure_keys([...])` (see Required API Keys in the Verifiers environments guide).
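As a minimal sketch of that validation step (the list-of-names call shape follows the `vf.ensure_keys([...])` usage above; which names you pass depends on your mode, so treat the exact list here as an assumption):

```python
import verifiers as vf

def load_environment() -> vf.Environment:
    # Fail fast if credentials are missing from the process environment.
    # MODEL_API_KEY is only needed for DOM mode with default Stagehand routing.
    vf.ensure_keys(["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "MODEL_API_KEY"])
    ...  # build the dataset, rubric, and BrowserEnv as in the quickstart below
```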
BrowserEnv reads credentials from process environment variables. Defaults:

| Variable | Required | Notes |
|---|---|---|
| `BROWSERBASE_API_KEY` | Yes | Browserbase API key |
| `BROWSERBASE_PROJECT_ID` | Yes | Browserbase project id |
| `MODEL_API_KEY` | DOM only | Stagehand's LLM, unless `proxy_model_to_stagehand=True` (then the rollout client supplies the key) |
| `OPENAI_API_KEY` (or your judge provider) | If you use JudgeRubric / LLM judges | Not used by BrowserEnv itself |

Override the names with `browserbase_api_key_var` and `model_api_key_var` if needed. For an LLM judge, set the provider's API key env var (often `OPENAI_API_KEY`) in the same places as the other variables, locally or on the environment; see Browser environments for judge-oriented examples.

Shell exports for local runs (e.g. `prime eval run`):

```bash
export BROWSERBASE_API_KEY="your-api-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
```

For DOM mode (default Stagehand routing), also:

```bash
export MODEL_API_KEY="your-model-key"
```

**Environments Hub secrets**

On the Environments Hub, open your environment. On the Secrets tab, add direct secrets or link global secrets from Keys & Secrets. Variables (same tab) are for non-sensitive configuration only (see Environment variables); API keys belong in Secrets.

Hub secret and variable names must start with an uppercase letter and use only uppercase letters, digits, and underscores (Secrets); the defaults above already satisfy this. If the same name appears as a variable, a linked global secret, and a direct secret, precedence is: variable (lowest), then linked global secret, then direct secret (highest).

Secrets and variables are injected automatically for Environment Actions, hosted evaluations, and hosted training (Secrets). Do not pass secret values through `load_environment` arguments (`env_args` / `-a` / TOML `env_args`); use those only for non-secret options (e.g. `mode`, `num_examples`). For hosted CLI runs, `prime eval run ... --hosted --custom-secrets '{"NAME":"value"}'` is for extra per-run secrets only; routine keys should live on the environment in the Hub (Hosted evaluations).

CLI examples:

```bash
prime env secret create owner/my-env --name BROWSERBASE_API_KEY --value "..."
prime env secret link owner/my-env
prime env var create owner/my-env --name MAX_TURNS --value 10
```

Related docs:

- Browser environments: DOM/CUA workflows, judges, and training-oriented notes
- Environments Hub getting started

**Quickstart**

```python
import verifiers as vf
from datasets import Dataset
from verifiers.envs.integrations.browser_env import BrowserEnv


def load_environment() -> vf.Environment:
    dataset = Dataset.from_list(
        [
            {
                "prompt": [
                    {
                        "role": "user",
                        "content": "Go to https://example.com and tell me the page title.",
                    }
                ]
            }
        ]
    )

    async def scored(completion) -> float:
        return 1.0 if "example domain" in completion[-1]["content"].lower() else 0.0

    rubric = vf.Rubric(funcs=[scored])

    return BrowserEnv(
        mode="dom",  # switch to "cua" for vision-based interaction
        dataset=dataset,
        rubric=rubric,
        max_turns=10,
    )
```

**DOM mode**

Use DOM mode for structured websites where semantic element access is effective. Common args:

- `mode="dom"`
- `model_api_key_var` (default: `"MODEL_API_KEY"`)
- `stagehand_model` (default: `"openai/gpt-4o-mini"`)
- `proxy_model_to_stagehand` (default: `False`)

**CUA mode**

Use CUA mode for visually complex pages where coordinate-based control works better. Common args:

- `mode="cua"`
- `use_sandbox=True` (default; auto-deploys the CUA server)
- `use_prebuilt_image=True` (default; fastest startup)
- `server_url` (used when `use_sandbox=False`)
- `viewport_width` / `viewport_height`

CUA execution options:

- Prebuilt image (default): fastest startup
- Binary upload (`use_prebuilt_image=False`): custom server workflows
- Manual local server (`use_sandbox=False`): local development/debugging

A CUA-mode variant of the quickstart is sketched after the reference list below.

**Reference implementations**

For complete reference implementations, see:

- DOM example: environments/browser_dom_example/
  - environments/browser_dom_example/browser_dom_example.py
  - environments/browser_dom_example/README.md
- CUA example: environments/browser_cua_example/
  - environments/browser_cua_example/browser_cua_example.py
  - environments/browser_cua_example/README.md

These examples show end-to-end `load_environment()` setup, evaluation commands, and recommended runtime flags.
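To make the CUA option concrete, here is a minimal sketch of the quickstart's return value switched to CUA mode. The argument names come from the CUA list above; the helper name `load_cua_environment` and the viewport values are illustrative assumptions, not documented defaults.

```python
from verifiers.envs.integrations.browser_env import BrowserEnv

def load_cua_environment(dataset, rubric):
    # Reuses a dataset and rubric built as in the quickstart above.
    return BrowserEnv(
        mode="cua",
        dataset=dataset,
        rubric=rubric,
        max_turns=10,
        use_sandbox=True,         # default: auto-deploys the CUA server
        use_prebuilt_image=True,  # default: fastest startup
        viewport_width=1280,      # assumption: example viewport size
        viewport_height=800,      # assumption: example viewport size
    )
```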

This analysis was produced by the Genesis Park editorial team using AI. The original article is available via the source link.
