Sandlock vs. Containers: 25% Faster

hackernews | 🔬 Research
#review #sandlock #network-overhead #benchmark #performance-comparison #containers
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

In a same-host performance comparison using Redis, Sandlock avoided the complex routing of container networking by sharing the host network stack, minimizing overhead. In the measurements, Sandlock delivered 25% higher throughput than Docker, with median and 99th-percentile latency 50% and 66% better respectively, reaching 88% of bare-metal performance. This suggests it is an effective way to avoid the performance "tax" of isolation technologies for latency-sensitive workloads such as real-time processing and caching.

Body

Every message sent to a containerized service on the same machine pays a tax. It traverses iptables DNAT rules, a Linux bridge, and a virtual Ethernet device before it reaches the process inside. For large file transfers, the tax is invisible. For the workloads that define modern infrastructure (real-time stream processing, in-memory caching, sidecar communication), it is the single largest source of overhead. We measured this tax using Redis, and the results surprised us.

Benchmark Setup

We ran a Redis 8.6 server inside each isolation environment while redis-benchmark ran directly on the host, connecting to the server. This models the common deployment pattern where external clients or co-located services connect to a confined server process. The identical Redis binary (/usr/bin/redis-server) was used in all three configurations. For Docker, the host binary and its libraries were bind-mounted into the container, eliminating version differences as a variable. Persistence was disabled across all tests (--save "", --appendonly no) to isolate network and processing overhead from disk I/O.

Three configurations were tested:

- Bare metal. Redis server runs directly on the host. No isolation. The benchmark client connects over localhost. This establishes the performance ceiling.
- Sandlock. Redis server runs inside a process sandbox with real security restrictions:
  - Landlock filesystem confinement: read access to system libraries and /dev; write access limited to /tmp.
  - Landlock network restrictions: net_bind and net_connect locked to the Redis port only.
  - Seccomp-bpf: a default deny list blocking 34 dangerous syscalls (mount, ptrace, io_uring, bpf, and others).
  - Argument-level seccomp filtering on prctl, ioctl, and clone to block specific dangerous operations while allowing safe usage.
  - No root privileges. No namespaces. No container runtime. The benchmark client connects over localhost. Both server and client share the host network stack.
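The client side of the setup above can be sketched as a small command builder. The redis-benchmark flags (`-c` clients, `-n` requests, `-d` value size, `-t` test selection) are standard options; the helper name and the port value are illustrative assumptions, not from the original post.

```python
def benchmark_cmd(port: int, clients: int = 50, requests: int = 100_000,
                  value_size: int = 256) -> list[str]:
    """Build the redis-benchmark invocation described in the setup:
    50 concurrent clients, 100,000 requests, 256-byte values."""
    return ["redis-benchmark", "-p", str(port),
            "-c", str(clients), "-n", str(requests),
            "-d", str(value_size), "-t", "set,get"]

if __name__ == "__main__":
    # 16379 is the port the post maps in the Docker case.
    print(" ".join(benchmark_cmd(16379)))
    # To actually run it (requires a Redis server listening on that port):
    # import subprocess; subprocess.run(benchmark_cmd(16379), check=True)
```

The same command runs unchanged against all three configurations; only the process serving the port differs.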
- Docker. Redis server runs in a container with the default bridge network and port mapping (-p 16379:16379). The benchmark client connects through the mapped port. Traffic traverses the veth pair, the Docker bridge, and the netfilter/conntrack rules that Docker configures for port forwarding.

Each configuration was tested for three rounds with 50 concurrent clients, 100,000 requests, and 256-byte values. Results were averaged.

The Numbers

| | SET ops/sec | GET ops/sec | SET p50 | SET p99 | GET p50 | GET p99 | Combined |
|---|---|---|---|---|---|---|---|
| Bare metal | 81,229 | 78,342 | 0.316 ms | 0.631 ms | 0.327 ms | 0.540 ms | 100% |
| Sandlock | 70,777 | 69,967 | 0.327 ms | 0.911 ms | 0.327 ms | 0.850 ms | 88.2% |
| Docker | 56,210 | 56,639 | 0.498 ms | 1.471 ms | 0.498 ms | 1.447 ms | 70.7% |

Three things stand out.

Throughput. Sandlock delivers 140,744 combined ops/sec. Docker delivers 112,849. That is 25% more operations per second for the same workload on the same hardware. Sandlock retains 88% of bare metal performance; Docker retains 71%.

Median latency. Sandlock: 0.33 ms. Docker: 0.50 ms. Docker adds 0.17 ms to every request at the median. That is 50% higher than Sandlock, which is within 3% of bare metal.

Tail latency. Sandlock: 0.88 ms at p99. Docker: 1.46 ms. Docker's 99th percentile is 66% higher. For systems bound by SLAs at the 99th percentile, this is the number that determines whether you meet your contract or breach it.

Two Paths Through the Kernel

Where does the 25% gap come from? It is not a tuning issue. It is a consequence of how each technology routes packets. When a client sends a request to a Docker container on the same host, the packet takes this path:

Client --> host TCP --> netfilter DNAT --> bridge --> veth --> container TCP --> Redis

Docker uses iptables rules for port mapping.
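The derived percentages follow directly from the per-operation numbers in the table; a quick check, using only values reported above:

```python
# Combined throughput = SET ops/sec + GET ops/sec, from the table above.
bare = 81_229 + 78_342      # 159,571 ops/sec
sandlock = 70_777 + 69_967  # 140,744 ops/sec
docker = 56_210 + 56_639    # 112,849 ops/sec

print(f"Sandlock vs Docker: {sandlock / docker - 1:+.1%}")  # ~ +25%
print(f"Sandlock retention: {sandlock / bare:.1%}")         # ~ 88.2%
print(f"Docker retention:   {docker / bare:.1%}")           # ~ 70.7%

# Tail latency gap: average of the SET and GET p99 columns.
sandlock_p99 = (0.911 + 0.850) / 2  # ~0.88 ms
docker_p99 = (1.471 + 1.447) / 2    # ~1.46 ms
print(f"Docker p99 vs Sandlock: {docker_p99 / sandlock_p99 - 1:+.0%}")  # ~ +66%
```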
Every packet hits a conntrack lookup in the PREROUTING chain (the NAT decision is cached after the first packet, but the lookup itself is per-packet). The bridge performs MAC-level forwarding. The veth pair transfers the packet between network namespaces, adding a netdev traversal on each side. At 50 concurrent clients generating thousands of small requests per second, these costs compound.

When a client sends a request to a Sandlock-confined process:

Client --> loopback --> Redis

There is no virtual device. No bridge. No netfilter evaluation. Both processes share the host network stack. The kernel's loopback path delivers the packet directly.

Sandlock's security enforcement operates at the syscall boundary, not at the packet level. Landlock restricts which TCP ports a process may bind() or connect() to, checked once at connection time. The data path syscalls (sendmsg, recvmsg, read, write) pass through the seccomp-bpf filter in nanoseconds (arch check, arg filter skip, syscall number match) and proceed directly to the kernel's TCP implementation. There is no per-packet overhead beyond the BPF filter itself.
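As a minimal illustration of the shared-stack path (plain loopback TCP, not Sandlock itself), the round trip below touches no DNAT rule, bridge, or veth pair; client and server talk through the kernel's loopback device directly:

```python
import socket
import threading

def echo_once(srv: socket.socket) -> None:
    """Accept one connection and echo a single message back."""
    conn, _ = srv.accept()
    with conn:
        conn.sendall(conn.recv(1024))

# Bind an ephemeral loopback port: no port mapping, no namespaces.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

t = threading.Thread(target=echo_once, args=(srv,))
t.start()

# Both endpoints share the host network stack; the loopback path
# delivers the bytes without any virtual-device traversal.
with socket.create_connection(("127.0.0.1", port)) as cli:
    cli.sendall(b"PING")
    reply = cli.recv(1024)

t.join()
srv.close()
print(reply)  # b'PING'
```

Under a Landlock network policy, the one-time bind()/connect() here is where enforcement would happen; the sendall()/recv() data path is untouched.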

This analysis was written by the Genesis Park editorial team with the help of AI. The original article is available via the source link.
