Show HN: Threadprocs – executables sharing one address space (0-copy pointers)

Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

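Threadprocs is an experimental POSIX process model in which multiple executable programs share a single virtual address space and can directly dereference each other's pointers. This enables zero-copy IPC: pointer-based data structures can be accessed without copying the data. Each program has its own libc instance but shares memory with the others. It currently runs on Linux on aarch64 and x86_64, and has several technical constraints: memory allocated by one program's heap allocator must not be freed by another, and ptrace-based debugging is not supported.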

Body

This repository contains experimental code for thread-like processes, or multiple programs running in a shared address space. Each threadproc behaves like a process with its own executable, globals, libc instance, etc., but pointers are valid across threadprocs. This blends the POSIX process model with the POSIX multi-threading programming model, and enables things like zero-copy access to pointer-based data structures.

All Markdown files were written by hand. See tproc-actors for one possible application framework building on top of threadprocs.

The code for the demoed programs is at example/sharedstr/allocstr.cpp and example/sharedstr/printstr.cpp, and neither contains any magic (/proc/[pid]/mem, etc.), nor awareness of the server and launcher.

- allocstr reads input, copies it into a new std::string, and prints &newstring to the console.
- printstr reads a pointer as hex text, and prints whatever std::string it finds there.

(Demo video: demo.mp4)

The server utility "hosts" a virtual address space, and by using launcher to start programs, those launched programs coexist in the hosted address space. Applications can share pointers in the virtual address space through some out-of-band mechanism (the demo uses copy/paste, dummy_server/client uses sockets, libtproc provides server-global scratch space), and then directly dereference those pointers, as they're valid in the shared address space.

libtproc provides basic detection of execution as a threadproc, and allows hosted threadprocs to access a "server-global" scratch space. Applications can build tooling using this space to implement service discovery and bootstrap shared memory-backed IPC. This is implemented by adding another entry to the threadproc auxv. tproc-actors uses this space to advertise per-threadproc actor registries.
- Proof of concept in examples/ and test/
- aarch64+x86_64 Linux
- Production quality
- Secure(ish)
- Documentation
- Tooling for peer/service discovery (basic)

Use Linux on aarch64 or x86_64; other architectures are not supported. This was developed in a VM running Debian on a Macbook Air M1, and also tested in a Debian x86_64 Github Codespace using the .devcontainer/ configuration.

Dependencies:

    apt install build-essential liburing-dev  # May need to install gcc 14+
    git submodule update --init

Notably there is no dependency on ELF libraries aside from Linux system headers, though those would probably make the code nicer.

Building:

    make

Run auto integration tests:

    make test

Or run your own programs in a shared address space:

    ./buildout/server /tmp/mytest.sock &
    ./buildout/launcher /tmp/mytest.sock program1 arg1 arg2 &
    ./buildout/launcher /tmp/mytest.sock program2 arg3 arg4

Read the overview or implementation for information on the project, or read comparisons to existing work. I've also collected some lessons learned in conclusions.

- Each threadproc has its own runtime library instance (libc), and care must be taken not to call malloc() in one threadproc but try to free() that memory in another threadproc.
- Target applications must be compiled as "position independent code," as must any dynamically loaded objects.
  - This is standard for dynamically linked libraries, and default for executable binaries compiled in many modern distros in order to support flavors of ASLR.
  - Properly architected libraries can mitigate most drawbacks of this, and executable files also carry minimal overhead.
- brk() (and sbrk()) cannot be used reliably, because they are "address space global" to the kernel, and processes typically assume they won't be called from unexpected places.
  - The server sets the MALLOC_MMAP_THRESHOLD_=0 environment variable for children to avoid the default glibc behavior and avoid these calls.
- mmap with MAP_FIXED can't be used without first "reserving" a non-fixed mapping.
  - This is generally true of any program, and "unreserved" MAP_FIXED use is unsafe even in standard Linux programs.
  - See the manpage section "Using MAP_FIXED safely".
- Debugging and ptrace() are not supported.
  - It may be possible to add partial support, but I suspect GDB makes some assumptions that would be difficult to satisfy.
- The threadproc's PID is not the same as the launching process, so operations in terms of PID may lead to issues if applications rely on details of PID-targeted operations.
- Signals are forwarded from the launcher to the threadproc, but unhandle-able signals (SIGKILL) are not.
  - There are likely other edge cases if a threadproc relies on details of the POSIX signal behavior.

There are other less pertinent limitations around the edges. For example, threadprocs have /proc/[pid]/comm values which reflect their launched binary, but cmdline isn't settable. exec() syscalls also "escape" the threadproc scheme, which is probably desired but may cause subtle issues.

My initial vision was for threadprocs to pass std::unique_ptrs to each other, and support IPC with nested data. ABI aside, the major

This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available via the source link.
