The SPI Bus Treaty
A four-rule shared-bus arbitration protocol for concurrent SD, LoRa, and display operation on the ESP32-S3 — and the reason the Ghost Engine never stops.
On the LilyGo T-Deck Plus, T-LoRa Pager, and similar Espressif ESP32-S3 devices, a single hardware SPI peripheral block is wired to three independent consumers: the MicroSD card, the SX1262 LoRa radio, and the ST7789/ST7796U display controller. The reference firmware ecosystem for these devices is built around single-purpose use cases where, at any given moment, only one of these consumers is active. Pisces Moon OS is not that. It runs the Ghost Engine, a persistent Core 0 background process that wardrives, scans BLE, and logs GPS continuously, alongside any Core 1 foreground application, with the LoRa mesh radio, the SD card, and the display all active simultaneously.
This document specifies the SPI Bus Treaty: a formal four-rule arbitration protocol enforced by a recursive mutex (spi_mutex) that every component of the operating system and every third-party ELF module must obey. The protocol was developed in field conditions in 2025–2026 and is now a platform-level contract. v1.2.0 audit: 54 take sites, 56 give sites verified across 12 files. The Treaty is the architectural precondition for everything else Pisces Moon does.
The Physics of the Conflict
SPI is a synchronous serial protocol with four signal lines: SCK (clock), MOSI (master out), MISO (master in), and an active-low chip select per peripheral. The ESP32-S3 has two general-purpose SPI peripherals (SPI2 and SPI3); its two remaining SPI controllers (SPI0 and SPI1) are reserved for flash and PSRAM access. The LilyGo T-Deck Plus, T-LoRa Pager, and M5Stack Cardputer ADV all wire SD, LoRa, and display to the same SPI peripheral block, distinguished only by chip-select line.
When two consumers attempt to drive the bus simultaneously — even momentarily — the failure modes are not graceful. The SD card's CMD0 initialization sequence is timing-sensitive: a glitch on SCK during command issuance corrupts the response and the controller enters an error state requiring a full power cycle to recover. The LoRa radio's SX1262 SPI command set assumes the bus is quiet for the duration of a transaction; concurrent SD traffic causes the radio to mis-parse its own command queue and enter undefined states. The display controller's pixel-flush sequence relies on tight timing between command bytes and data bytes; SPI contention produces visible tearing, dropped frames, or display freeze.
None of these failures present as “SPI contention.” The Guru Meditation crash dumps point at memory addresses where execution happened to be when the corruption propagated — not at the source of the corruption. The first symptom of an SPI conflict in this OS was a non-deterministic reboot during heavy field operation. The hypothesis that the SPI bus was the shared resource being contested took several months and many field sessions to establish.
Four Field Failures
The Treaty was not designed in advance. It was extracted from four specific failure modes that recurred across field operations in 2025–2026. Each is documented here with the symptom, the root cause, and the rule it produced.
T-Deck Plus running Ghost Engine wardrive + LoRa mesh receiver in a dense RF environment (40+ APs, active LoRa mesh traffic). Crashes occurred at non-deterministic intervals, between 4 and 90 minutes of sustained operation. Crash addresses varied. The bench environment (laboratory with single AP and no LoRa peer) never produced the crash.
Root cause: SD write operations and LoRa receive interrupts both manipulated the SPI peripheral block. With no arbitration, the LoRa interrupt would occasionally fire during an SD transaction, corrupting both. The corruption propagated as memory damage that crashed in arbitrary subsystems.
Produced Rule 1: hit-and-run.
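The hit-and-run pattern can be sketched as a scoped guard. This is a minimal host-side illustration, not the Pisces Moon implementation: it assumes `std::recursive_mutex` as a stand-in for the firmware's `spi_mutex`, and `SpiGuard`, `sd_write_raw`, `sd_append_line`, `spi_bus_is_free`, and the in-memory `fake_sd_log` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Stand-in for the firmware's spi_mutex (assumption: the real one is an RTOS
// recursive mutex; std::recursive_mutex lets the sketch run on a desktop).
std::recursive_mutex spi_mutex;

// Hypothetical in-memory "SD card" so the example has something to write to.
std::vector<std::string> fake_sd_log;

// Hypothetical scoped guard: takes the bus in the constructor and gives it
// back in the destructor, so every take is paired with exactly one give.
class SpiGuard {
public:
    SpiGuard() { spi_mutex.lock(); }
    ~SpiGuard() { spi_mutex.unlock(); }
    SpiGuard(const SpiGuard&) = delete;
    SpiGuard& operator=(const SpiGuard&) = delete;
};

// Inner helper that also takes the bus. Nesting is legal because the mutex
// is recursive and the same thread already owns it.
void sd_write_raw(const std::string& line) {
    SpiGuard guard;
    fake_sd_log.push_back(line);
}

// Hit-and-run: prepare data off the bus, take, do one transaction, give.
bool sd_append_line(const std::string& ssid, int rssi) {
    const std::string line = ssid + "," + std::to_string(rssi);  // built before the take
    {
        SpiGuard guard;      // take
        sd_write_raw(line);  // nested take/give by the same thread
    }                        // give (guard destructor runs here)
    return true;
}

// Returns true if nobody holds the bus. Only meaningful when called from a
// thread that does not already own the mutex, because a recursive try_lock
// by the owner would also succeed.
bool spi_bus_is_free() {
    if (!spi_mutex.try_lock()) return false;
    spi_mutex.unlock();
    return true;
}
```

The guard makes the release unconditional on every exit path of the scope, which is the property the rule depends on.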
Ghost Engine writing wardrive CSV continuously during a long-form field session. Every 15–25 minutes, the watchdog timer fired and the device rebooted. No crash dump — clean reboot.
Root cause: an SD write extending across multiple sectors held the SPI bus for tens of milliseconds. During this window, the LoRa radio's interrupt handler could not service incoming packets, and the BLE stack's scan callback could not flush its result buffer. The watchdog detected the resulting starvation and rebooted the device as a recovery action.
Produced Rule 2: no extended holds.
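One way to keep a long write inside a hold budget is to split it into short holds and release the bus between them. The sketch below is a host-side illustration under stated assumptions: `std::recursive_mutex` stands in for `spi_mutex`, the 512-byte chunk size is an assumption, the 50 ms figure is the hold budget named in the audit section, and `sd_write_chunked`, `WriteStats`, and `fake_sd` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>

std::recursive_mutex spi_mutex;           // stand-in for the firmware's spi_mutex
constexpr std::size_t kChunkBytes = 512;  // assumption: one SD sector per hold
constexpr auto kHoldBudget = std::chrono::milliseconds(50);  // hold budget from the audit

std::string fake_sd;  // hypothetical in-memory SD card

struct WriteStats {
    std::size_t holds = 0;         // number of separate take/give pairs
    bool budget_respected = true;  // false if any single hold exceeded the budget
};

// Writes data in chunks, giving the bus back after every chunk so other
// consumers (radio, display) can run between holds.
WriteStats sd_write_chunked(const std::string& data) {
    WriteStats stats;
    for (std::size_t off = 0; off < data.size(); off += kChunkBytes) {
        {
            std::lock_guard<std::recursive_mutex> hold(spi_mutex);  // take
            const auto start = std::chrono::steady_clock::now();
            fake_sd.append(data, off, kChunkBytes);  // append clips the final short chunk
            ++stats.holds;
            if (std::chrono::steady_clock::now() - start > kHoldBudget) {
                stats.budget_respected = false;
            }
        }                           // give
        std::this_thread::yield();  // the bus is free while we yield
    }
    return stats;
}
```

Because the yield happens outside the locked scope, no consumer holds the bus during a yield.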
The Gemini AI client (Core 1 foreground app) and the Ghost Engine wardrive scanner (Core 0 background) both required WiFi radio access. The ESP32-S3's WiFi peripheral is single-instance: only one consumer can hold it at a time. Both subsystems would block waiting for the radio, producing visible UI freezes and silent wardrive interruptions.
Root cause: WiFi radio arbitration was not explicit. Each subsystem assumed it could use the radio whenever it needed to. The deadlock was probabilistic — short Gemini queries usually completed without conflict, but a single long inference would block the wardrive long enough for the watchdog to intervene.
Produced Rule 3: radio-traffic mutual exclusion via shared boolean flag.
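The claim, check, and release cycle on a shared flag can be sketched as follows. Note the substitution: the source describes a plain boolean, while this sketch uses `std::atomic<bool>` with compare-and-exchange so that the check and the set happen as one step across two cores. `wifi_try_claim`, `wifi_release`, and `with_wifi` are hypothetical names, not functions from the Pisces Moon source.

```cpp
#include <atomic>
#include <cassert>

// Shared radio flag. Assumption: modeled as an atomic so that two cores
// cannot both observe "free" and both claim the radio.
std::atomic<bool> wifi_in_use{false};

// Claims the radio only if it is currently free. Returns false when another
// consumer holds it, in which case the caller should yield and retry.
bool wifi_try_claim() {
    bool expected = false;
    return wifi_in_use.compare_exchange_strong(expected, true);
}

// Clears the flag on release.
void wifi_release() { wifi_in_use.store(false); }

// Runs `work` with the radio claimed. Returns false without running it when
// the radio is busy. Assumes `work` does not throw (typical for firmware
// built without exceptions); otherwise the flag would stay set.
template <typename Work>
bool with_wifi(Work&& work) {
    if (!wifi_try_claim()) return false;
    work();
    wifi_release();
    return true;
}
```

A caller that receives `false` is expected to yield and try again later, rather than block while waiting for the radio.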
The Ghost Partition Nuke — sector-level zero-overwrite of sensitive data files before FAT unlinking — must hold the SPI bus for the duration of the wipe, in violation of Rule 2 (no extended holds). The naive implementation would crash the LoRa radio mid-wipe, corrupting both the wipe and the radio state. The wipe could not be safely executed during normal operation.
Root cause: the operation that needs the bus most exclusively is also the most sensitive to interruption. Halting the wipe partway through leaves the data partially-destroyed and partially-recoverable — the worst possible outcome.
Produced Rule 4: destructive operations under Treaty discipline at boot, before Ghost Engine spawn.
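The ordering constraint, wipe first and spawn second, can be sketched on the host as below. This is an illustration only: `FakeFile`, `nuke_file`, and `boot_sequence` are hypothetical names, the in-memory byte vector stands in for SD sectors, and the returned step list stands in for actually spawning the Ghost Engine task.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

std::recursive_mutex spi_mutex;  // stand-in for the firmware's spi_mutex

// Hypothetical model of a sensitive file: its sectors plus a directory link.
struct FakeFile {
    std::vector<std::uint8_t> sectors;
    bool linked = true;
};

// Zero-overwrites every byte, then unlinks. The mutex is held for the whole
// wipe, which is acceptable only because no other consumer exists yet.
void nuke_file(FakeFile& file) {
    std::lock_guard<std::recursive_mutex> exclusive(spi_mutex);
    std::fill(file.sectors.begin(), file.sectors.end(), std::uint8_t{0});
    file.linked = false;
}

// Returns the order in which boot steps ran. Destructive work always
// completes before the (simulated) Ghost Engine spawn.
std::vector<std::string> boot_sequence(FakeFile& sensitive, bool nuke_requested) {
    std::vector<std::string> order;
    if (nuke_requested) {
        nuke_file(sensitive);
        order.push_back("nuke");
    }
    order.push_back("ghost_engine_spawn");  // stands in for creating the task
    return order;
}
```

Because the spawn step is reached only after `nuke_file` returns, the long hold never overlaps with another SPI consumer.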
The Four Rules
The Treaty consists of four rules. They are binding on first-party OS code, third-party ELF modules loaded at runtime from SD, and any future hardware target. Violation produces immediate crashes, watchdog reboots, or test failures. The rules are short enough to be remembered and specific enough to be checked.
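- Rule 1: hit-and-run. Every SPI consumer takes the bus mutex before its first transaction and gives it after its last. The recursive mutex (spi_mutex) permits nested operations within a single component, but the outermost component is responsible for releasing the same number of times it acquired.
- Rule 2: no extended holds. No operation holds the bus beyond the 50 ms hold budget, and no consumer holds the bus during a yield.
- Rule 3: radio-traffic mutual exclusion. WiFi radio access is arbitrated by a shared wifi_in_use boolean. The flag is set by the consumer claiming the radio and cleared on release. Consumers check the flag before claiming; if set, they yield and retry. On the Cardputer ADV (no PSRAM), this rule is extended into full WiFi mode-locking with hard teardown: CLIENT and SCANNER modes are mutually exclusive at the OS level, with full esp_wifi_stop() + esp_wifi_deinit() between transitions.
- Rule 4: destructive operations at boot. Destructive operations such as the Ghost Partition Nuke run under spi_mutex discipline at boot, before the Ghost Engine task spawns. During this boot window, no other SPI consumer is active. The operation has exclusive access to the bus for as long as it needs. The Ghost Engine starts only after all destructive operations have completed and the bus has returned to normal Treaty governance.
v1.2.0 Audit Results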
The Treaty is enforced by recursive mutex (spi_mutex), not by static analysis. Every site that touches the SPI bus must call spi_mutex_take() before its first transaction and spi_mutex_give() after its last. A manual audit was conducted for v1.2.0 to verify that every take has a matching give, no operations exceed the 50ms hold budget, and no consumer holds the bus during a yield.
| File | Takes | Gives | Notes |
|---|---|---|---|
| core/ghost_engine.cpp | 12 | 12 | Wardrive scan write loop, BLE callback flush |
| core/wardrive_task.cpp | 8 | 8 | Session file creation + per-frame write |
| radio/lora_driver.cpp | 6 | 6 | SX1262 command queue + interrupt handler |
| radio/sd_filesystem.cpp | 9 | 10 | One extra give covers a rare recovery path |
| ui/display_driver.cpp | 7 | 7 | Pixel flush + command sequencing |
| boot/late_sd_init.cpp | 4 | 4 | Cold SPI restart for T-LoRa Pager |
| security/nuke.cpp | 3 | 3 | Boot-time only, exclusive bus access |
| elf/sandbox.cpp | 2 | 2 | ELF module load + verify |
| apps/recorder.cpp | 1 | 1 | Audio capture write |
| apps/mesh_messenger.cpp | 1 | 1 | Message persistence |
| apps/audio_player.cpp | 1 | 1 | WAV streaming read |
| apps/elf_browser.cpp | 0 | 1 | Defensive give in error path |
| Total | 54 | 56 | Two extra gives are defensive recovery paths |
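The two extra gives are intentional defensive code: the SD filesystem driver and ELF browser each contain a recovery path that releases the mutex in error conditions where a previous acquire may not be tracked. If spi_mutex is a FreeRTOS recursive mutex, as is usual on ESP-IDF, a give from a task that does not hold the mutex fails with an error return and leaves the mutex unchanged, so these extra gives are harmless.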
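The tallies above can be cross-checked mechanically. The sketch below is a hypothetical helper, not the tool used for the v1.2.0 audit: it counts textual occurrences of `spi_mutex_take(` and `spi_mutex_give(` in a source string and sums the per-file figures copied from the table. The names `count_occurrences`, `audit_source`, `AuditRow`, and `total` are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts non-overlapping occurrences of `needle` in `text`.
std::size_t count_occurrences(const std::string& text, const std::string& needle) {
    if (needle.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle); pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

struct AuditRow {
    std::size_t takes;
    std::size_t gives;
};

// Textual count of take and give call sites in one source file's contents.
AuditRow audit_source(const std::string& source) {
    return {count_occurrences(source, "spi_mutex_take("),
            count_occurrences(source, "spi_mutex_give(")};
}

// Per-file figures copied from the audit table, in table order.
const std::vector<AuditRow> kV120Audit = {
    {12, 12}, {8, 8}, {6, 6}, {9, 10}, {7, 7}, {4, 4},
    {3, 3},   {2, 2}, {1, 1}, {1, 1},  {1, 1}, {0, 1}};

// Sums takes and gives across all rows.
AuditRow total(const std::vector<AuditRow>& rows) {
    AuditRow sum{0, 0};
    for (const AuditRow& row : rows) {
        sum.takes += row.takes;
        sum.gives += row.gives;
    }
    return sum;
}
```

A textual count only shows that the numbers balance per file. It cannot show that every runtime path pairs a take with a give, which is why the audit is manual.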
Prior Art & Precedent
The technique of mutex-arbitrated SPI is not new. Any operating system kernel that drives more than one device on a shared SPI bus uses some form of arbitration. The contribution of the SPI Bus Treaty is not the technique but the protocol: the named, documented, audited contract that every component must obey, published as a public reference for any future project on this hardware class.
The Treaty stands in a tradition of named arbitration protocols:
- Unix filesystem locking conventions (1970s) — established formal rules for competing processes sharing a filesystem. The conventions were not new techniques (file locks predate Unix); the contribution was naming the rules and making them platform-wide.
- Apollo Guidance Computer priority scheduling protocol (1969) — formal rules for competing tasks sharing a processor under load. The 1202 alarm during the lunar landing was the protocol working as designed: the lower-priority task was sacrificed to preserve the higher-priority one.
- Nintendo N64 RSP time budget (1996) — formal rules for competing subsystems sharing the Reality Signal Processor. The budget was not new (time-slicing predates the N64); the contribution was the documented protocol that every game developer had to follow.
In each case the solution was not a patch — it was a protocol. A named standard. A thing that could be taught, documented, and enforced across an entire platform. The SPI Bus Treaty is the application of this approach to the ESP32-S3 hardware class. To the best of our knowledge, no prior published reference for this protocol exists on this chip family. If reviewers can identify prior art, we will document and credit it — this paper is offered with that invitation.
Enforcement & Contract
The Treaty is enforced by three mechanisms that work together:
- The recursive mutex (spi_mutex). Every SPI consumer must acquire the mutex before its first transaction and release after its last. The mutex is recursive, so nested operations within a single component do not deadlock. The mutex is mandatory: there is no “take if available” path. A consumer that cannot acquire the mutex must yield and retry.
- Audit. The v1.2.0 audit (Section IV) verified that every take has a matching give. The audit is repeated for every release. Future versions will add static analysis to enforce the pattern at compile time.
- Documentation. The Treaty is documented in this paper and in the source tree. Third-party ELF module developers are required to follow the Treaty as a condition of running on the platform. The Treaty is the public contract — any module that violates it is a bug in that module, not in the OS.
The Treaty is what makes the Ghost Engine possible. The Ghost Engine is the entire reason Pisces Moon OS exists. The relationship between the two is not incidental: the OS was designed around the Engine, and the Engine could not run without the Treaty. The Treaty doesn't change. The hardware around the Treaty does. The same four rules protect the bus on the T-Deck Plus (8 MB PSRAM), the T-LoRa Pager (8 MB PSRAM, deferred boot), and the Cardputer ADV (no PSRAM, mode-locked WiFi). The protocol applies unchanged across the supported ESP32-S3 memory configurations, from 8 MB PSRAM down to none.
Two-device live-traffic validation (bench, May 20, 2026): Mesh Messenger send/receive confirmed between T-Deck Plus and Cardputer ADV with Cap LoRa-1262, both on the LongFast channel, concurrent with active wardrive logging on both devices. SD transcripts were written to /mesh_logs/messages.csv on each device while WiFi scan results were simultaneously written to the wardrive CSV. Both Rule 1 (hit-and-run) and Rule 2 (no extended holds) survived contact with real LoRa interrupt traffic interleaved with SD writes on the same bus. The Treaty held on the 8 MB PSRAM and no-PSRAM ends of the chip tier in the same session, which is evidence that the protocol is correctly specified, not merely correctly observed.
The contribution of this document is the protocol itself — named, documented, audited, and offered as a starting point for any future project on this hardware class that needs to coordinate concurrent peripheral access on a shared SPI bus. The protocol is open source under AGPL-3.0. The reference implementation is at github.com/FluidFortune/pisces-moon-os.
The SPI Bus Treaty — v1.2.0 — May 19, 2026
Eric Becker · Fluid Fortune · forge@fluidfortune.com
Companion documents: white paper · engineering record · device architecture