HN Pick: Building a SQL Analyst Agent from Scratch

hackernews | 📰 News
#gpt-4 #openai #open-source
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

This article walks through building an AI agent that takes a plain-language question, inspects the database schema, writes SQL, and explains the result. Built with the Vercel AI SDK and OpenAI, the agent is designed so the model can review the schema step by step and correct its own errors, using the Chinook sample database for practice. The example shows that an agent-based approach is more effective than a single prompt at producing accurate, data-grounded answers.

Article

In this article, I'm going to build a small but genuinely useful AI agent: a SQL analyst that takes a plain-English question, checks the database schema, writes SQL, runs it, and explains the result in normal language. I like this example because it clearly shows why agents beat one-shot prompts for this kind of task. Instead of guessing table names and columns from the start, the model can inspect the schema step by step, recover from mistakes, and build the answer from actual data.

I'm building this agent with Node.js. For the agent loop and tool definitions, I'm using the Vercel AI SDK together with OpenAI. For the model, I'm using gpt-4.1-mini. It's cheap, fast, and more than enough for a project like this.

Of course, we also need a database the agent can query. For that, I'm using the Chinook database, a very popular sample dataset for SQL demos and tutorials. Chinook is available for SQL Server, Oracle, MySQL, and others, and can be created by running a single SQL script. It's an alternative to the Northwind database, ideal for demos and for testing ORM tools against single and multiple database servers. The Chinook data model represents a digital media store, with tables for artists, albums, media tracks, invoices, and customers.

For this example, I'm using SQLite. It's serverless, embedded, and perfect for a small demo like this because you don't need to run a separate database server. To work with it in Node.js, I'm using better-sqlite3. It's simple, fast, and honestly one of the easiest ways to work with SQLite in a Node project. After seeding the database, you should inspect the data and make sure everything loaded correctly.

Now let's talk about the agent part. I'm using the Vercel AI SDK to define tools and let the model decide when to call them.
OpenAI handles the language side of things: understanding the user request, choosing the right tool, and turning the final query result into a human-readable answer. This is the important part: the model is not just making up an answer from memory. It can inspect the database first, understand the schema, write a query, run it, and then answer based on real output. That makes the whole thing far more reliable and much easier to debug.

For this agent, we only need three tools:

- listTables: discover which tables exist.
- describeTable: inspect columns and relationships.
- runSql: execute the final read-only query.

I split the tools this way on purpose. It pushes the model into a safer workflow: first explore, then write SQL. In practice, this goes a long way toward reducing hallucinated table names, wrong column names, and broken queries.

Once the tools are ready, the main function is actually very small. All it really does is pass the user question into generateText together with the model, the system prompt, the tools, and a stopping condition. The interesting part is not the amount of code; it's the flow it creates. The model can inspect the schema, decide which tool to call next, retry if a query fails, and stop once it has enough information to answer properly.

The last two pieces are the database connection and the system prompt. The database module keeps SQLite access in one place so all tools share the same connection. The system prompt tells the agent how to behave: inspect the schema first, only run SELECT queries, and then turn the result into a clean answer for the user. Even in a simple project like this, a good system prompt makes a huge difference.

Now everything is ready, and here is the live version of the SQL analyst agent. You can ask different questions about the Chinook dataset and see how the agent behaves.
It's a nice way to watch the full loop in action: understand the question, inspect the schema, generate SQL, run it, and summarize the result.

If you want to try the full project locally, you can download the complete example from GitHub: Download Full Example. That version includes the seed data, the tool definitions, the system prompt, and the small UI used in this post, so you can run it, tweak it, and extend it with your own queries or even a different dataset.

This project is intentionally small, but it shows the core idea behind a lot of useful AI agents. The value does not come from building a huge system or an overly complicated orchestration layer. It comes from giving the model a clear job, a small set of useful tools, and a workflow that lets it inspect real data before answering.

A SQL analyst agent is a really good starting point because the feedback loop is tight. The model checks the schema, writes a query, sees the result, and adjusts if needed. Once that pattern is working, you can take it much further: adding query safety checks, supporting more databases, streaming steps to the UI, or even generating charts and reports.

If you're learning how to build agents, this is exactly the kind of example worth building yourself. It's simple enough to understand from start to finish, yet realistic enough to teach the habits that actually matter in bigger systems.
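That full loop (understand the question, call a tool, feed the result back, stop when there's enough to answer) is exactly what generateText with a stopping condition automates. To make the control flow concrete, here is a hand-rolled mock of it; everything below is illustrative, not the AI SDK's actual API, and decide() is a scripted stub standing in for the model:

```javascript
// Illustrative only: a hand-rolled version of the agent loop that the
// Vercel AI SDK runs for you. `decide` is a stub standing in for the
// model; a real agent would call OpenAI here instead.
function runAgent(question, tools, decide, maxSteps = 10) {
  const transcript = [{ role: 'user', content: question }];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(transcript); // model picks a tool, or answers
    if (action.type === 'answer') return action.text;
    // Execute the chosen tool and feed the observation back in.
    const result = tools[action.tool](...(action.args ?? []));
    transcript.push({ role: 'tool', tool: action.tool, result });
  }
  throw new Error('Step limit reached without an answer');
}

// A scripted "model": explore first (list tables), then answer.
const tools = { listTables: () => ['Artist', 'Album', 'Track'] };
const answer = runAgent('Which tables exist?', tools, (transcript) => {
  const seen = transcript.find((m) => m.role === 'tool');
  return seen
    ? { type: 'answer', text: `Tables: ${seen.result.join(', ')}` }
    : { type: 'tool', tool: 'listTables' };
});
console.log(answer); // "Tables: Artist, Album, Track"
```

The step cap plays the role of the SDK's stopping condition: the loop ends either when the model produces a final answer or when the budget runs out.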

This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.
