Show HN: Built an LSM storage engine from scratch in Rust
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
This post documents a personal, for-fun project: building an LSM (log-structured merge) storage engine from scratch in Rust. The author explores LSM design in depth and uses a large language model (LLM) as a collaborative partner throughout the coding process.
Full text
I just left Google BigQuery after four years. The work was interesting, but I missed building things that were mine. I wanted a project that was fun, challenging, and a little ambitious — something where I could learn, play, and reconnect with the part of engineering I got into this for. I also wanted something tangible I could point to and say: I built this, I understand it end to end, come talk to me about databases.

So I'm building an LSM storage engine from scratch in Rust. I'm calling it strata. This isn't a tutorial. It's what I ran into along the way and what I'm learning from it.

Why an LSM Tree?

I have an ambition to build a full database from scratch someday. When I looked at the options, LSM trees stood out. Compared to B-trees, they're easier to reason about — the core data structures are immutable, the write path is append-only, and you don't have to fight concurrency bugs from the start. I wanted to optimize for correctness and developer experience, not raw performance on day one.

There's also something elegant about the design. Every write is just an append. On-disk files are never modified, only created and deleted. The complexity lives in how you merge and organize data across levels, and that part is genuinely fun to think about.

The name comes from geology — layers of rock built up over time. Levels, tiers, strata. It also follows the naming tradition of storage engine projects that sound like they could be indie rock bands.

Designing Before Building

Before writing code, I spent time reading about LSM design, taking notes, and identifying specific things I wanted to get right early. Some things I wanted to explore:

- Versioning everything. I'm keeping all versions of every key so I can explore MVCC approaches later. No garbage collection for now. When I eventually integrate a SQL engine on top, I can figure out the best strategy then.
  LSM trees have a cool native way of doing this with internal keys: every key the engine stores is actually a tuple of (user_key, sequence_number, op_type), with a custom comparator that sorts by user key ascending, then sequence number descending. The newest version always comes first, which makes point lookups fast.

- Key-value separation. The WiscKey paper has a cool optimization where the LSM tree only stores keys and pointers, and the actual values live in a separate log. I haven't implemented this yet, but it's on my radar — it could make compaction much cheaper, since you're only merging small key-pointer pairs instead of full values.

- Configurable tiering and leveling. The Dostoevsky paper's approach to hybrid tiering/leveling sounded fun to implement. I made levels configurable with both a max number of runs and a max size, so I can experiment with different strategies just by changing config values.

- Manifest files. I like the idea of an append-only manifest log to maintain the structure of the level tree — which SSTables exist, what level they're in, what key ranges they cover. It's a log that tracks the shape of your other logs. Very meta.

- K-way merge iterator. I found kmerge in the itertools crate and wanted to try it. It uses a min-heap to walk multiple sorted iterators simultaneously — the same primitive you need for both range scans and compaction. I'll need something more capable if I want to do filtering or predicate pushdown in the future, but it's a cool, fast start.

Building It

I used Claude as a design partner and code generator throughout. It let me think at the design level — "should the WAL use segments or truncation?" — instead of fighting syntax for an hour. I like working at this level. It feels more productive, and honestly I either spend the time I save on things I value outside of work, or I just end up spending more time coding and engaging with the design.

After building it, I wasn't fully confident I could explain every detail.
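The internal-key ordering described in the design notes above is concrete enough to sketch. A minimal illustration in Rust, where the type and field names are my own and not strata's actual API: a comparator that sorts by user key ascending, then sequence number descending, so the newest version of a key always sorts first.

```rust
use std::cmp::Ordering;

// Illustrative internal key: (user_key, sequence_number, op_type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum OpType {
    Put,
    Delete,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct InternalKey {
    user_key: Vec<u8>,
    seq: u64,
    op: OpType,
}

impl Ord for InternalKey {
    fn cmp(&self, other: &Self) -> Ordering {
        // User key ascending, then sequence number descending:
        // the newest version of a key sorts first.
        self.user_key
            .cmp(&other.user_key)
            .then_with(|| other.seq.cmp(&self.seq))
    }
}

impl PartialOrd for InternalKey {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let mut keys = vec![
        InternalKey { user_key: b"a".to_vec(), seq: 1, op: OpType::Put },
        InternalKey { user_key: b"b".to_vec(), seq: 2, op: OpType::Put },
        InternalKey { user_key: b"a".to_vec(), seq: 3, op: OpType::Delete },
    ];
    keys.sort();
    // A point lookup can stop at the first entry matching the user key:
    // it is guaranteed to be the newest version.
    assert_eq!(keys[0].seq, 3);
    assert!(matches!(keys[0].op, OpType::Delete));
    println!("{:?}", keys);
}
```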
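Likewise, the k-way merge primitive from the design notes can be sketched with a min-heap from the standard library. This is a hand-rolled stand-in for itertools' kmerge, not strata's code; the function name and signature here are illustrative.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Merge several individually sorted runs into one sorted stream, the same
// primitive used for both range scans and compaction. A min-heap (BinaryHeap
// over Reverse) always exposes the smallest pending head.
fn kmerge(runs: Vec<Vec<i32>>) -> Vec<i32> {
    let mut heap = BinaryHeap::new();
    // Seed the heap with the head of each run: (value, run index, offset).
    for (i, run) in runs.iter().enumerate() {
        if let Some(&v) = run.first() {
            heap.push(Reverse((v, i, 0usize)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((v, i, j))) = heap.pop() {
        out.push(v);
        // Advance the run we just consumed from, if it has more entries.
        if let Some(&next) = runs[i].get(j + 1) {
            heap.push(Reverse((next, i, j + 1)));
        }
    }
    out
}

fn main() {
    let merged = kmerge(vec![vec![1, 4, 7], vec![2, 5], vec![3, 6]]);
    assert_eq!(merged, vec![1, 2, 3, 4, 5, 6, 7]);
    println!("{:?}", merged);
}
```

The heap holds one entry per run, so memory stays proportional to the number of runs, not the amount of data being merged.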
The gap wasn't in understanding the design: generating code at speed creates a gap between "I understand the architecture" and "I can trace every line." So I did something that turned out to be really valuable — I had the AI interview me about my own system. Walk through a write. Walk through a read. Walk through recovery. Explain your decisions.

It surfaced real issues. My range scan implementation probably stops too early — it might return results from only the first level that has matches instead of merging across all levels. I also realized my compaction wasn't handling tombstones correctly in all cases — if a delete and a put for the same key end up in different SSTables, you have to be really careful about the order you process them, or you'll resurrect deleted data. These weren't things I would have caught from running happy-path tests. They came from having to explain my own system out loud and getting pushed on the details.

Embracing Imperfection

Perfectionism has cost me a lot. Not just in engineering — in my life. I've spent too much time not shipping.
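Circling back to the tombstone issue the interview surfaced: once runs are merged in internal-key order, the rule is that the newest entry per user key shadows everything older, so a newer delete must suppress an older put. A minimal sketch under that assumption — names are illustrative, not strata's code, and a real engine can drop the tombstones themselves only at the bottom level, so this version keeps them.

```rust
// Compaction's shadowing rule: the input stream is already merged and sorted
// by (user key ascending, sequence descending), so the first entry seen for
// each user key is the newest and shadows everything older -- including a
// put that would otherwise resurrect deleted data.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Put(&'static str),
    Delete, // tombstone
}

// Keep only the newest entry per user key. Tombstones are retained:
// dropping them is only safe once no older level can still hold the key.
fn compact(entries: &[(&'static str, u64, Op)]) -> Vec<(&'static str, u64, Op)> {
    let mut out: Vec<(&'static str, u64, Op)> = Vec::new();
    for e in entries {
        match out.last() {
            // Same user key as the entry we already kept: this one is older.
            Some(last) if last.0 == e.0 => continue,
            _ => out.push(e.clone()),
        }
    }
    out
}

fn main() {
    // "a" was written (seq 2), then deleted (seq 5); "b" was written (seq 3).
    let merged = vec![
        ("a", 5, Op::Delete),
        ("a", 2, Op::Put("old")),
        ("b", 3, Op::Put("y")),
    ];
    let out = compact(&merged);
    // The delete shadows the older put; "a" stays dead.
    assert_eq!(out, vec![("a", 5, Op::Delete), ("b", 3, Op::Put("y"))]);
    println!("{:?}", out);
}
```

Process the entries in any other order (say, by whichever SSTable was opened first) and the seq-2 put can win, which is exactly the resurrection bug described above.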
This analysis was written by the Genesis Park editorial team with the help of AI. The original post is available via the source link.