Managing My Open Source Repos with Autonomous AI Agents

#ai-agents #gpt-4 #openai #open-source #automation #pro-tips #ai #claude #claude-code #tip #productivity #autonomous-agents
Source: hackernews · Summarized and analyzed by Genesis Park

Summary

To efficiently maintain around 10 open source projects single-handedly, a developer built a custom automation tool called 'prodboard' and set up a system that runs 5 autonomous AI coding agents in the background on his existing $100/month Claude subscription. The system automatically handles GitHub issues, implements code, runs a strict 5-perspective code review, and performs project housekeeping on hourly or minute-level schedules, while final merges always require the developer's explicit approval. In its first 24 hours, the agents found real security bugs such as SQL injection and an authorization bypass, and opened more than 30 pull requests (PRs), delivering tangible results like automated versioning and added tests.

Full Text

Just found a way to squeeze even more out of my Claude Code subscription, and this trick also works with other coding subscriptions. The idea: make your AI coding agent work while you sleep.

The Problem

I maintain about 10 open source projects, mostly Cloudflare Workers tools like workers-qb, R2-Explorer, and workers-research. Keeping up with each one of them takes time, and I'm just talking about basic maintenance; forget about adding new features. There are always issues piling up, dependencies to update, tests to add, and documentation to improve.

Yes, I can open Claude and guide it through the process of fixing something, then reviewing it, then committing. But honestly, that is the last thing I want to do after already spending a complete day talking to Claude at work. A coding agent shouldn't need me to be present for routine maintenance.

Here's the gap that bothered me most: Claude is great at finding review issues, but for some reason I need to explicitly tell it "hey, review this PR you just made." Only then will it catch problems. That manual loop (implement, then remind it to review, then fix, then remind it again) is what I wanted to eliminate.

The Approach

Instead of waiting for some perfect platform to solve this, I built my own lightweight automation. The key insight: you don't need a complex platform, just a task queue, a scheduler, and an AI coding agent that can use git and the GitHub CLI.

The whole thing works very simply: a kanban board and a schedule with a prompt. That's it. I built a small tool called prodboard to glue it together. It's a CLI-first issue tracker with a cron scheduler backed by SQLite. But you could do the same with GitHub Actions, a shell script, or anything that can run commands on a schedule. Every schedule spins up a sub-agent, a new Claude Code instance in a tmux terminal.

I set up the entire pipeline in a single Claude Code conversation.
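The task-queue-plus-scheduler core is small enough to sketch. Below is a minimal, hypothetical version of a prodboard-style board backed by SQLite; the table and column names are my assumptions for illustration, not prodboard's actual schema:

```python
import sqlite3

# A single SQLite table acts as the kanban board; each agent run claims
# the oldest ticket in its column and moves it forward.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT, status TEXT)")

def add_issue(title):
    """Enqueue a new ticket in the 'todo' column."""
    conn.execute("INSERT INTO issues (title, status) VALUES (?, 'todo')", (title,))

def claim_next(status, new_status):
    """Move the oldest issue in `status` to `new_status`; return it, or None."""
    row = conn.execute(
        "SELECT id, title FROM issues WHERE status = ? ORDER BY id LIMIT 1",
        (status,),
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE issues SET status = ? WHERE id = ?", (new_status, row[0]))
    return row

add_issue("Fix SQL injection in query builder")
print(claim_next("todo", "review"))  # -> (1, 'Fix SQL injection in query builder')
```

A cron entry (or systemd timer) would then invoke a script that calls `claim_next` and hands the ticket to a fresh agent instance.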
All 5 agents, the issue templates, even a fix to the scheduler's systemd service. From zero to fully autonomous in one sitting.

Cost: nothing extra. The whole thing runs on my existing $100/month Claude Max subscription; I don't pay anything above what I'm already paying. If you need Claude during the day for your actual work, you could even configure the cron jobs to run mostly at night.

The Architecture

The system runs as a systemd user service on a Linux machine. Five agents on cron schedules:

- GitHub Open Source Contributor (hourly): Picks a random repo with 20+ stars from my GitHub profile, checks for open issues or scans for TODOs and FIXMEs in the code, and implements a focused fix. It verifies no one else already has an open PR for the same work. Every PR must include tests. No tests, no PR. After pushing, it creates a review ticket on the board.
- PR Code Review Agent (every 15 min): Picks the next PR waiting for review and gates on CI; it won't review if checks are still failing or pending. The review runs 5 independent perspectives: Correctness, Security, Performance, Code Quality, and Testing. At least 4 out of 5 must approve with no major or medium issues for the PR to pass. When it requests changes, it leaves detailed line-level comments.
- Issue Worker (every 10 min): Picks the next todo issue from the queue and validates it has enough information: a repo URL, a problem statement, affected code references. If anything is missing, it flags the issue for human attention with a specific comment about what's needed. If the info is sufficient, it clones the repo, reads the codebase, implements the fix, and opens a PR.
- Daily Summary (daily at 9 AM): Generates a report covering issues solved, PRs opened, cost, and token usage. It creates this as a "done" issue so it doesn't get picked up by other agents.
- House Cleaning (hourly): Checks all issues in human-approval status: merged PRs get moved to done, conflicting PRs get sent back to todo for a rebase. Keeps the board clean automatically.

The Issue Lifecycle

todo -> agent implements -> review -> code review agent checks
  ^                                          |
  +-------- sent back (needs fixes) ---------+
human merges -> house cleaning -> done

The human is always the final gatekeeper. Nothing gets merged without me clicking the merge button. I've also made a human-approval status for sensitive changes, or for when Claude actually needs me to enter a credential somewhere. The agents propose, review, and clean up, but I make the final call.

Real Results

In the first 24 hours, the agents found real bugs across my repos:
- SQL injection in query parameter handling in workers-qb
- Authorization bypass letting non-admin users access other mailboxes in email-explorer
- Content-Disposition header injection in R2-Explorer
- Quarter date truncation returning NULL for months 5-12 in django-cf

Beyond bug fixes, the agents set up automated versioning with changesets and npm trusted publishing across about 10 repos in one batch. They opened 30+ PRs, each reviewed by the 5-perspective code review agent.
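The review agent's pass/fail rule (at least 4 of 5 perspectives approve, with no major or medium findings) reduces to a small predicate. This is an illustrative sketch under my reading of that rule, not the author's actual implementation:

```python
# The five review perspectives named in the article.
PERSPECTIVES = ("correctness", "security", "performance", "code_quality", "testing")

def passes_review(verdicts):
    """verdicts maps each perspective to (approved: bool, worst_severity: str).

    A PR passes only when at least four perspectives approve AND no
    perspective reports a major or medium issue.
    """
    approvals = sum(1 for approved, _ in verdicts.values() if approved)
    blocking = any(sev in ("major", "medium") for _, sev in verdicts.values())
    return approvals >= 4 and not blocking

verdicts = {p: (True, "none") for p in PERSPECTIVES}
verdicts["performance"] = (False, "minor")  # one dissent, but only a minor issue
print(passes_review(verdicts))  # -> True
```

A failed gate would send the PR back with the dissenting perspectives' line-level comments attached.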

This analysis was written by the Genesis Park editorial team with AI assistance. The original article is available via the source link.
