Show HN: REST API for Gymnasium (fka OpenAI Gym) reinforcement learning library

hackernews | April 6, 2026 14:24 | 📰 News

#gymnasium #openai #openai gym #reinforcement learni #rest api #review #show hn #reinforcement learning #language-agnostic

Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

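A REST interface for the Gymnasium (formerly OpenAI Gym) library has been released, making it possible to develop reinforcement learning agents in any programming language. After running the Docker image or starting the Python server, users can create an environment such as 'CartPole-v1' and call REST API endpoints to reset its state and step it with actions. The project also ships a browser-based JavaScript agent for testing and a monitoring page for active environments, and it supports features such as listing active environment instances and rendering an environment.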

Body

REST interface for the Gymnasium library to allow for language-agnostic development of reinforcement learning agents. Spiritual successor to openai/gym-http-api.

The fastest way to get started is to pull and run the latest version of the associated container image from GHCR:

    docker run --rm -it -p 5000:5000 ghcr.io/cloudkj/gymnasium-http-api

Alternatively, you can clone the code and run the server directly:

    python3 -m uvicorn app.main:app --port 5000 --host 0.0.0.0

Once the server is running, create and interact with environments through the various endpoints. Here's a command-line example of creating an environment, resetting its state, then taking one action.

    % curl -X POST http://localhost:5000/v1/envs/ -H 'Content-Type: application/json' -d '{"env_id":"CartPole-v1"}'
    {"instance_id":"986a40f7-6472-4ac6-bc5f-86b7939898b1"}

    % curl -X POST http://localhost:5000/v1/envs/986a40f7-6472-4ac6-bc5f-86b7939898b1/reset/
    {"observation":[-0.001753740361891687,0.0477466844022274,0.003655971959233284,-0.030443252995610237],"info":{}}

    % curl -X POST http://localhost:5000/v1/envs/986a40f7-6472-4ac6-bc5f-86b7939898b1/step/ -H 'Content-Type: application/json' -d '{"action": 0}'
    {"observation":[-0.0037473568227142096,-0.3425928056240082,0.008314925245940685,0.557033360004425],"reward":1.0,"terminated":false,"truncated":false,"info":{}}

To see an agent in action, check out the standalone, client-side JavaScript agent available at /agent.html, which shows a typical agent loop over a single episode. You can also modify the agent code directly in your browser and try out different heuristics or policies.

To observe the state of all active environments on the server, check out the monitoring page available at /monitor.html.

To start developing an agent, simply call endpoints to create and interact with the environment of your choice.
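As one illustration of the kind of heuristic mentioned above, here is a minimal Python sketch of a pole-angle rule for CartPole-v1. The function name choose_action and the LEFT/RIGHT constants are hypothetical and not part of the project. The sketch assumes the standard Gymnasium CartPole-v1 layout: observations are [cart position, cart velocity, pole angle, pole angular velocity], action 0 pushes the cart left, and action 1 pushes it right.

```python
from typing import Sequence

# Assumed CartPole-v1 action encoding: 0 pushes the cart left, 1 pushes it right.
LEFT, RIGHT = 0, 1


def choose_action(observation: Sequence[float]) -> int:
    """Hypothetical heuristic: push the cart toward the side the pole leans.

    Assumes the observation is
    [cart position, cart velocity, pole angle, pole angular velocity].
    """
    pole_angle = observation[2]
    return RIGHT if pole_angle > 0 else LEFT


# Observations copied from the curl transcript above.
after_reset = [-0.001753740361891687, 0.0477466844022274,
               0.003655971959233284, -0.030443252995610237]
after_step = [-0.0037473568227142096, -0.3425928056240082,
              0.008314925245940685, 0.557033360004425]

print(choose_action(after_reset))
print(choose_action(after_step))
```

The value returned by choose_action could be sent as the "action" field of the step request body. In both sample observations the pole angle is slightly positive, so this rule would push right rather than left.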
Here's a simple Python example that wraps the main endpoints as a drop-in replacement for gymnasium.Env:

    import requests

    class Env:
        BASE_URL = "http://localhost:5000/v1/envs"

        def __init__(self, env_id):
            self.instance_id = requests.post(f"{self.BASE_URL}/", json={"env_id": env_id}).json()["instance_id"]

        def reset(self):
            return requests.post(f"{self.BASE_URL}/{self.instance_id}/reset/", json={}).json()

        def step(self, action):
            return requests.post(f"{self.BASE_URL}/{self.instance_id}/step/", json={"action": action}).json()

        def close(self):
            requests.delete(f"{self.BASE_URL}/{self.instance_id}/")

    # Create an environment and reset state to start a new episode
    env = Env("CartPole-v1")
    initial = env.reset()
    observation, info = initial["observation"], initial["info"]
    print(f"Starting observation: {observation}")

    episode_over = False
    total_reward = 0

    while not episode_over:
        action = 0  # To the left, to the left

        # Take the action and see what happens
        state = env.step(action)
        total_reward += state["reward"]
        episode_over = state["terminated"] or state["truncated"]

    print(f"Episode finished! Total reward: {total_reward}")
    env.close()

The endpoints largely follow the conventions established by the legacy gym-http-api project. For the latest documentation, start an instance of the server and navigate to /docs to view auto-generated documentation for all supported endpoints.
GET /v1/envs/ - List all active environment instances
POST /v1/envs/ - Create an instance of the specified environment
POST /v1/envs/{instance_id}/reset/ - Reset the environment
POST /v1/envs/{instance_id}/step/ - Step through the environment with a specified action
GET /v1/envs/{instance_id}/action_space/ - Get action space properties
GET /v1/envs/{instance_id}/observation_space/ - Get observation space properties
DELETE /v1/envs/{instance_id}/ - Close and remove the environment
GET /v1/envs/{instance_id}/monitor/render/ - Render the current state of the environment
GET /v1/envs/{instance_id}/monitor/stream/ - Continuously stream the current state of the environment
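The endpoint list above can be captured as a small lookup table, which may help when writing a client in another language. The following is only a sketch: the operation names ("list", "create", and so on), the ENDPOINTS table, and the build_request helper are hypothetical, and the default base URL assumes the server is running locally on port 5000 as in the examples. Only the HTTP methods and paths come from the list itself.

```python
from typing import Optional, Tuple

# (HTTP method, path template) for each endpoint in the list above.
# The dictionary keys are illustrative names, not part of the API.
ENDPOINTS = {
    "list": ("GET", "/v1/envs/"),
    "create": ("POST", "/v1/envs/"),
    "reset": ("POST", "/v1/envs/{instance_id}/reset/"),
    "step": ("POST", "/v1/envs/{instance_id}/step/"),
    "action_space": ("GET", "/v1/envs/{instance_id}/action_space/"),
    "observation_space": ("GET", "/v1/envs/{instance_id}/observation_space/"),
    "close": ("DELETE", "/v1/envs/{instance_id}/"),
    "render": ("GET", "/v1/envs/{instance_id}/monitor/render/"),
    "stream": ("GET", "/v1/envs/{instance_id}/monitor/stream/"),
}


def build_request(operation: str,
                  instance_id: Optional[str] = None,
                  base_url: str = "http://localhost:5000") -> Tuple[str, str]:
    """Return the (method, url) pair for a named operation."""
    method, template = ENDPOINTS[operation]
    if "{instance_id}" in template:
        if instance_id is None:
            raise ValueError(f"{operation!r} needs an instance_id")
        path = template.format(instance_id=instance_id)
    else:
        path = template
    return method, base_url.rstrip("/") + path


method, url = build_request("step", "986a40f7-6472-4ac6-bc5f-86b7939898b1")
print(method, url)
```

Keeping the methods and paths in one table means a client only needs a single generic HTTP call; request and response bodies for endpoints not shown in the examples are best checked against the server's /docs page.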

View original (hackernews)

This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available through the source link.
