Running Bare-Metal Rust Alongside ESP-IDF on the ESP32-S3's Second Core

Original source: Hacker News



I've been working with the RP2350 and no_std Rust for a while now, and I've really come to appreciate how Rust is designed: safe yet surprisingly straightforward. But my latest project needs Wi-Fi and BLE, and the RP2350 doesn't have wireless hardware built in. That meant switching to the ESP32-S3.

The ESP32-S3 is a great chip, but here's the catch: most Wi-Fi and Bluetooth functionality lives inside Espressif's ESP-IDF framework, which is a C-based SDK built on top of FreeRTOS. There are community Rust wrappers for parts of ESP-IDF, and Espressif themselves offer some Rust support, but both are a moving target: documentation is sparse compared to the mature C API, and there are always one or two critical features missing.

So I was stuck choosing between two imperfect options:

- Go all-in on Rust. I'd get the language features and crates I love, but the no_std ecosystem on ESP32-S3 is still young. In a shipping product, I didn't want to risk hitting undefined behavior in an immature HAL at 2 AM.
- Go all-in on ESP-IDF (C). I'd get battle-tested Wi-Fi and BLE stacks, but I'd be writing C for everything, including the business logic, audio processing, and data handling where Rust really shines.

Then I remembered something: the ESP32-S3 has two CPU cores. There's an option buried in ESP-IDF's Kconfig called CONFIG_FREERTOS_UNICORE. When you enable it, FreeRTOS only runs on Core 0. Core 1 just... sits there, stalled, doing nothing.

That got me thinking: what if I let ESP-IDF own Core 0 for all the Wi-Fi, BLE, and system tasks, and then wake up Core 1 to run my own bare-metal Rust code, completely outside the RTOS? Both cores share the same memory space, so passing data between them should be straightforward (though it does require some unsafe Rust). And since Core 1 wouldn't be managed by FreeRTOS, there'd be no scheduler preempting my time-critical audio processing loop.
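For reference, enabling that Kconfig option is a one-line setting. A sketch of what it would look like in a project's `sdkconfig.defaults` (the surrounding options and file layout depend on your project):

```
# Run FreeRTOS only on Core 0 (the PRO CPU).
# Core 1 (the APP CPU) is left stalled at boot, free for bare-metal use.
CONFIG_FREERTOS_UNICORE=y
```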
After convincing myself this wasn't completely insane, I got to work. Here's how it all fits together.

Before diving in, it's worth addressing the obvious question: ESP-IDF already provides xTaskCreatePinnedToCore, which can pin a task to a specific core:

```c
// FreeRTOS provides this function to create a task on a specific core.
// You could pin a Rust function to Core 1 this way, but FreeRTOS
// would still manage the scheduler on that core.
BaseType_t xTaskCreatePinnedToCore(
    TaskFunction_t pvTaskCode,           // Function that implements the task
    const char * const pcName,           // Human-readable name for debugging
    const uint32_t usStackDepth,         // Stack size in bytes (ESP-IDF differs
                                         // from vanilla FreeRTOS, which uses words)
    void * const pvParameters,           // Arbitrary pointer passed to the task
    UBaseType_t uxPriority,              // Priority (higher = more CPU time)
    TaskHandle_t * const pvCreatedTask,  // Output: handle to the created task
    const BaseType_t xCoreID             // 0 = PRO core, 1 = APP core
);
```

You could absolutely compile your Rust code as a static library, export a pub extern "C" fn, and have FreeRTOS run it on Core 1 via this API. The ESP-IDF build system would statically link your Rust .a file into the firmware.

The problem is that FreeRTOS's scheduler is still running on Core 1. Your task can be preempted at any time by higher-priority tasks or system ticks. For a high-performance audio processing loop where every microsecond of jitter matters, that's a non-starter. I needed a guarantee that nothing would interrupt my code once it started running.

By disabling FreeRTOS on Core 1 entirely (via CONFIG_FREERTOS_UNICORE=y), we get an empty CPU that we can control directly at the hardware level: no scheduler, no context switching, no surprises.

Let's start with the simpler approach: building Rust as a static library, linking it into the ESP-IDF firmware at compile time, and manually booting Core 1 to run it. This is the foundation everything else builds on.
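For completeness, the Rust side of that pinned-task route would look roughly like this. This is a sketch; the name `rust_task_entry` and the tick counter are illustrative, not from the article, and a real FreeRTOS task body would loop forever rather than return:

```rust
use core::ffi::c_void;
use core::sync::atomic::{AtomicU32, Ordering};

// Hypothetical tick counter for this sketch; not part of the article's code.
pub static TASK_TICKS: AtomicU32 = AtomicU32::new(0);

// A Rust function with C linkage that xTaskCreatePinnedToCore could run
// as a pinned task (passed as the pvTaskCode argument). #[no_mangle] keeps
// the symbol name stable so the C side can reference it; newer toolchains
// spell this #[unsafe(no_mangle)]. A real FreeRTOS task never returns;
// the endless loop is elided here so the sketch stays host-testable.
#[no_mangle]
pub extern "C" fn rust_task_entry(_params: *mut c_void) {
    // One iteration of what would be the task's forever-loop.
    TASK_TICKS.fetch_add(1, Ordering::Relaxed);
}
```

The `*mut c_void` parameter mirrors FreeRTOS's `void *pvParameters`, so the signature matches what `TaskFunction_t` expects.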
When Core 1 wakes up outside of FreeRTOS, it doesn't get a dynamically allocated stack from the OS, because there is no OS on that core. We need to manually set aside a chunk of RAM that ESP-IDF's heap allocator won't touch.

ESP-IDF provides the SOC_RESERVE_MEMORY_REGION macro for exactly this. It tells the bootloader and memory allocator to treat a specific address range as off-limits:

```c
#include "heap_memory_layout.h"

// Reserve 128KB of internal SRAM for Core 1's stack and data.
// The two hex values define the start and end addresses of the reserved region.
// 0x3FCE9710 - 0x3FCC9710 = 0x20000 = 131072 bytes = 128KB.
// "rust_app" is just a label for debugging; it shows up in boot logs.
SOC_RESERVE_MEMORY_REGION(0x3FCC9710, 0x3FCE9710, rust_app);
```

Why 128KB? It's a reasonable default for an embedded stack plus some working memory. You can adjust this range depending on how much RAM your Rust code needs; just make sure the addresses fall within the ESP32-S3's internal SRAM region and don't overlap with anything ESP-IDF is using.

Next comes the main ESP-IDF application running on Core 0. Its job is to:

- Set up the system (Wi-Fi, peripherals, etc., or in our test case, just boot).
- Wake up Core 1 and point it at our Rust code.
- Go about its normal FreeRTOS business.

Instead of using xTaskCreatePinnedToCore, we're talking directly to the ESP32-S3's hardware registers to boot Core 1. We set a boot address, enable the clock, release the stall, and pulse the reset line. Core 1 wakes up completely independent of FreeRTOS.

To verify that everything is working, Core 0 will read a shared counter variable (RUST_CORE1_COUNTER) that the Rust code on Core 1 increments in a loop.
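The address arithmetic in that comment is easy to get wrong when you move or resize the region, so it's worth double-checking. A tiny host-side helper (illustrative only, not part of the firmware) confirms the span is exactly 128 KiB:

```rust
// Host-side sanity check: verify that the reserved region passed to
// SOC_RESERVE_MEMORY_REGION spans exactly 128 KiB.
pub fn reserved_region_size_bytes() -> u32 {
    const START: u32 = 0x3FCC_9710; // first argument to the macro
    const END: u32 = 0x3FCE_9710;   // second argument to the macro
    END - START
}
```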
main.c

```c
#include <stdio.h>
#include <stdint.h>
#include "esp_log.h"
#include "esp_cpu.h"
#include "heap_memory_layout.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "soc/system_reg.h"
#include "soc/soc.h"

static const char *TAG = "rust_app_core";

// Reserve memory so ESP-IDF's heap allocator doesn't use it.
// (Same macro from Step 1; it must appear in a compiled C file.)
SOC_RESERVE_MEMORY_REGION(0x3FCC9710, 0x3FCE9710, rust_app);

// ---- External symbols ----
// These are defined in other files and resolved at link time:
//   rust_app_core_entry      - the Rust function (from our .a library)
//   app_core_trampoline      - tiny assembly stub that sets the stack pointer
//   _rust_stack_top          - address from our linker script (top of reserved 128KB)
//   ets_set_appcpu_boot_addr - ROM function that tells Core 1 where to start
extern void rust_app_core_entry(void);
extern void ets_set_appcpu_boot_addr(uint32_t);
extern uint32_t _rust_stack_top;
extern void app_core_trampoline(void);

/*
 * Boot Core 1 by directly manipulating ESP32-S3 hardware registers.
 * This bypasses FreeRTOS entirely: Core 1 will run our code with
 * no scheduler, no interrupts (unless we set them up), and no OS.
 */
static void start_rust_on_app_core(void)
{
    ESP_LOGI(TAG, "Starting Rust on Core 1...");
    ESP_LOGI(TAG, "  Stack: 0x3FCC9710 - 0x3FCE9710 (128K)");

    /* 1. Tell Core 1 where to begin executing after it resets.
     *    This ROM function writes the address into a register that the
     *    CPU reads on boot. We point it at our assembly trampoline. */
    ets_set_appcpu_boot_addr((uint32_t)app_core_trampoline);

    /* 2. Hardware-level wake-up sequence for Core 1.
     *    These register writes control the clock, stall, and reset
     *    signals for the second CPU core. */

    // Enable the clock gate: Core 1 can't run without a clock signal.
    SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG, SYSTEM_CONTROL_CORE_1_CLKGATE_EN);

    // Clear the RUNSTALL bit. While stalled, the core is frozen mid-instruction.
    CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG, SYSTEM_CONTROL_CORE_1_RUNSTALL);

    // Pulse the reset line: assert it, then immediately de-assert.
    // This causes Core 1 to reboot and jump to the address we set above.
    SET_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG, SYSTEM_CONTROL_CORE_1_RESETING);
    CLEAR_PERI_REG_MASK(SYSTEM_CORE_1_CONTROL_0_REG, SYSTEM_CONTROL_CORE_1_RESETING);

    ESP_LOGI(TAG, "Core 1 released");
}

// This counter lives in the Rust code. Because it's an AtomicU32 with
// #[no_mangle], the C linker can find it by this exact name.
extern volatile uint32_t RUST_CORE1_COUNTER;

void app_main(void)
{
    ESP_LOGI(TAG, "Core 0: Starting IDF app");

    // Wake up Core 1 and start the Rust code
    start_rust_on_app_core();

    // Core 0 continues running FreeRTOS as normal.
    // Here we just monitor the shared counter to prove both cores are alive.
    while (1) {
        ESP_LOGI(TAG, "Rust Core 1 counter: %lu", (unsigned long)RUST_CORE1_COUNTER);
        vTaskDelay(pdMS_TO_TICKS(1000));  // Print once per second
    }
}
```

When a CPU core wakes up from reset, it doesn't have a stack yet. And without a stack, it can't call any C or Rust functions: function calls need somewhere to store return addresses and local variables. The ESP32-S3 uses the Xtensa instruction set architecture, where register a1 serves as the stack pointer. Our tiny assembly stub loads the address of our reserved memory into a1, then jumps into Rust. That's all it does: just two instructions.

We place this code in the .iram1 section, which maps to internal RAM. This is important because when a core first boots, it may not have flash caching set up yet. Code in IRAM is always accessible.

app_core_trampoline.S

```asm
/*
 * app_core_trampoline.S
 *
 * Minimal startup code for Core 1. Sets the stack pointer to our
 * reserved memory region, then jumps to the Rust entry point.
 *
 * Placed in IRAM (.iram1) so it's available immediately after core
 * reset, before flash cache is configured.
 */

    .section .iram1, "ax"       /* "ax" = allocatable + executable */
    .global  app_core_trampoline
    .type    app_core_trampoline, @function
    .align   4                  /* Xtensa requires 4-byte alignment */

app_core_trampoline:
    /* Load the top of our 128KB reserved stack into register a1.
     * Stacks grow downward on Xtensa, so "top" means the highest
     * address; the stack will grow toward lower addresses from here. */
    movi    a1, _rust_stack_top

    /* Jump to the Rust entry function. call0 is a "windowless" call
     * (no register window rotation), suitable for bare-metal startup.
     * This function never returns; it contains an infinite loop. */
    call0   rust_app_core_entry

    .size app_core_trampoline, . - app_core_trampoline
```

ESP-IDF uses CMake as its build system. We need to tell it about three extra things: our assembly file, our pre-compiled Rust library, and a custom linker script that defines where _rust_stack_top lives.

CMakeLists.txt

```cmake
# Register our C source and the assembly trampoline as component sources.
# ESP-IDF builds each directory under "main/" as a "component."
idf_component_register(
    SRCS "main.c" "app_core_trampoline.S"
    INCLUDE_DIRS "."
)

# Tell the linker about our pre-compiled Rust static library.
# This .a file is produced by `cargo build` and copied into main/lib/.
add_prebuilt_library(rust_app "${CMAKE_CURRENT_SOURCE_DIR}/lib/libesp_rust_app.a")

# Link the Rust library into our component. INTERFACE means anything
# that depends on this component also gets the Rust symbols.
target_link_libraries(${COMPONENT_LIB} INTERFACE rust_app)

# Inject our custom linker script. This is how the assembly trampoline
# knows the numeric value of _rust_stack_top.
target_link_options(${COMPONENT_LIB} INTERFACE "-T${CMAKE_CURRENT_SOURCE_DIR}/rust_stack.ld")
```

rust_stack.ld

```
/*
 * Custom linker script fragment.
 *
 * Defines _rust_stack_top as the END of our reserved 128KB block.
 * Stacks grow downward, so the "top" is the highest address.
 * The assembly trampoline loads this value into register a1.
 */
_rust_stack_top = 0x3FCE9710;
```

The connection here is: the linker script provides a symbol (_rust_stack_top); the assembly trampoline references that symbol to set the stack pointer; the C code triggers the hardware boot sequence that starts Core 1 at the trampoline.

Finally, here's the code that actually runs on Core 1. It's entirely no_std: there's no operating system, no allocator, no standard library. Just raw hardware access.

The key technique here is AtomicU32. Atomics are special CPU instructions that read and write memory in a way that's safe even when two cores access the same address simultaneously. By using AtomicU32 for our shared counter, we avoid race conditions without needing a mutex (which wouldn't work easily across the OS/bare-metal boundary anyway). The spin_loop hint tells the CPU "I'm intentionally busy-waiting"; on some architectures this reduces power consumption or yields resources to other hardware threads. Here it also serves as a simple delay so the counter doesn't overflow instantly.

```rust
// no_std: we're running without the Rust standard library.
// There's no OS below us: no heap, no threads, no println!.
#![no_std]
// no_main: we don't use Rust's normal main() entry point.
// Instead, Core 1 enters via rust_app_core_entry(), called from assembly.
#![no_main]

use core::panic::PanicInfo;
use core::sync::atomic::{AtomicU32, Ordering};

// Every no_std binary needs a panic handler. When something goes wrong
// (array out of bounds, unwrap on None, etc.), this function is called.
// On a bare-metal core with no debugger attached, there's not much we
// can do, so we just loop forever. A production system might toggle
// an LED or write to a shared error flag that Core 0 can read.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// The shared counter. Both cores can see this variable because it lives
// in the same memory space.
//
// #[unsafe(no_mangle)] prevents Rust from renaming this symbol during
// compilation. Without it, Rust would generate something like
// "_ZN12esp_rust_app18RUST_CORE1_COUNTER17h...", and the C code
// wouldn't be able to find it by name.
//
// AtomicU32 ensures that reads and writes are atomic at the CPU level,
// so Core 0 will never see a "torn" (half-written) value.
#[unsafe(no_mangle)]
pub static RUST_CORE1_COUNTER: AtomicU32 = AtomicU32::new(0);

// The entry point called by the assembly trampoline after it sets
// up the stack pointer. The `-> !` return type means "this function
// never returns"; it runs an infinite loop.
//
// `extern "C"` uses the C calling convention so the assembly code
// (and the C linker) can call this function correctly.
#[unsafe(no_mangle)]
pub extern "C" fn rust_app_core_entry() -> ! {
    loop {
        // Atomically increment the counter by 1.
        // Ordering::Relaxed means we don't need any memory ordering
        // guarantees beyond the atomicity of this single operation.
        // (For a simple counter, Relaxed is sufficient.)
        RUST_CORE1_COUNTER.fetch_add(1, Ordering::Relaxed);

        // Busy-wait loop as a simple delay. spin_loop() is a CPU hint
        // that says "I'm spinning, not doing real work"; on some
        // architectures this saves power or avoids starving other
        // hardware threads.
        for _ in 0..1_000_000 {
            core::hint::spin_loop();
        }
    }
}
```

ESP-IDF's build system expects a standard C-compatible static archive (.a file). By default, cargo build produces Rust-specific .rlib files that only the Rust toolchain understands. We need to tell Cargo to output a staticlib instead. We also apply aggressive size optimizations.
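A minimal Cargo.toml for that setup might look like the following sketch. The crate name is chosen to match the `libesp_rust_app.a` filename referenced in CMakeLists.txt, and the release-profile settings are common size-focused choices rather than anything the article prescribes:

```toml
[package]
name = "esp_rust_app"
version = "0.1.0"
edition = "2021"

[lib]
# Produce a C-compatible static archive (libesp_rust_app.a)
# instead of the default Rust-only .rlib.
crate-type = ["staticlib"]

[profile.release]
opt-level = "z"    # Optimize for size
lto = true         # Link-time optimization shrinks the binary further
codegen-units = 1  # Better optimization at the cost of compile time
panic = "abort"    # No unwinding machinery on bare metal
```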
