Architecture
DSP Protocol
The Apollo's onboard SHARC DSP processors are controlled through two mechanisms: a ring buffer command interface for real-time operations, and a shared SRAM settings window for mixer parameters.
Ring buffer architecture
Each DSP has a 128-byte register bank split into two rings:
| Ring | Offset from bank base | Direction | Purpose |
|---|---|---|---|
| Command | +0x00 | Host to DSP | Send commands, firmware, routing data |
| Response | +0x40 | DSP to host | Receive completion status, readback |
DSP bank addresses
| DSP | Bank base |
|---|---|
| DSP 0 | 0x2000 |
| DSP 1 | 0x2080 |
| DSP 2 | 0x2100 |
| DSP 3 | 0x2180 |
| DSP 4–7 | 0x5E00 + idx * 0x80 |
Ring registers (per ring)
| Offset | Name | Description |
|---|---|---|
0x00–0x1C | Page addresses | Up to 4 DMA page addresses (low/high pairs) |
0x20 | Write pointer | Host write position |
0x24 | Doorbell | Write triggers DSP processing |
0x28 | Position | Hardware current position (read-only) |
Ring entry format
Each ring entry is 16 bytes (4 x u32):
Inline command:
word0: command_code
word1: parameter
word2: 0
word3: 0
DMA reference (BIT31 set in word0):
word0: (size_in_dwords | 0x80000000)
word1: 0
word2: bus_address_low
word3: bus_address_high
DMA references point to coherent DMA buffers in host memory. The FPGA reads the referenced data via PCIe DMA.
Ring capacity
Each ring page holds 256 entries (4096 bytes / 16 bytes per entry). DSP 0 uses 4 pages (1024 entries) for its command ring to accommodate firmware loading. Other DSPs use 1 page (256 entries).
Doorbell protocol
The doorbell write order is critical:
- Write all entries to the ring pages
- Issue a write memory barrier (
wmb()) - Write
DOORBELL(offset0x24) with the new write pointer value - Write
WRITE_PTR(offset0x20) with the same value
Writing DOORBELL first, then WRITE_PTR matches the working driver behavior. Reversing the order causes the FPGA to begin DMA reads immediately, and the subsequent doorbell write during an active transfer resets the DMA engine.
Key ring buffer commands
DSP connect (0x00230002)
Sent as an inline command to initialize each DSP after firmware loading:
Phase 1: For each DSP i:
{0x00230002, 1, 0, 0} + doorbell
Phase 2: DSP 0 only (finalize):
{0x00100002, 0, 0, 0} + doorbell
Bus coefficient write (0x001D0004)
Sets mixer bus parameters (fader levels, pan coefficients, send gains):
word0: 0x001D0004 (command type 0x1D, 4 dwords)
word1: bus_id (0x00–0x1F)
word2: sub_param (0=main, 1=CUE1, 2=CUE2, 3=gainL, 4=gainR)
word3: value (IEEE 754 float as u32)
Bus coefficients go through the ring buffer, not through the mixer settings registers. This is a completely separate hardware path.
SRAM config (0x00080004)
Writes data to SHARC DSP SRAM at a specified address:
Entry 1 (inline):
word0: 0x00080004
word1: sram_destination_address
word2: 0
word3: data_size_in_dwords
Entry 2 (DMA reference):
word0: (size_in_dwords | 0x80000000)
word1: 0
word2: dma_address_low
word3: dma_address_high
Both entries are submitted as a single atomic pair with one doorbell. Used for routing table and module configuration writes.
Firmware loading
Firmware is loaded to DSP 0 via DMA:
- Allocate a single contiguous DMA buffer containing:
[cmd_word, count_word, param_word, firmware_data...] - Allocate a response DMA buffer
- Submit one DMA reference entry to the response ring, ring its doorbell
- Submit one DMA reference entry to the command ring, ring its doorbell
- Poll the response buffer for completion (upper 16 bits of response word 0 ==
0x8004)
Command buffer layout
word[0]: 0x40120000 (FW_CMD | LARGE_FLAG)
word[1]: fw_dwords + 2 (total count)
word[2]: 0x80040000 (FW_PARAM)
word[3..N]: firmware binary data
The FPGA processes the entire payload as a single DMA reference entry. Chained entries do not work for firmware loading.
Timeout
Firmware loading polls for up to 30 seconds at 10ms intervals. Typical completion time is under 5 seconds.
sendBlock protocol
For sending data blocks (routing tables, module activation, PostConnect descriptors) to any DSP:
- Response ring: Submit a 4-dword DMA reference pointing to a response buffer. Ring doorbell.
- Command ring: Submit a header DMA reference (1 dword:
(data_dwords + 1) | cmd) - Command ring: Submit data as DMA reference entries, split at 1 KB chunk boundaries
- Command ring: Single doorbell for all command entries (batch submission)
- Poll: Check response buffer for non-zero word (completion)
The response ring entry must be submitted before the command ring entries — the FPGA needs to know where to write its response before processing the command.
ACEFACE connect sequence
Apollo devices that use the AudioExtension path (including Apollo x4, x6, and all Gen 2 models) use a register-based handshake instead of ring buffer connect commands:
- Write
0xACEFACEtoNOTIF_ACEFACE(0xC02C) - Write
0x01toAX_CONNECT(0x2260) — doorbell - Poll
NOTIF_STATUS(0xC030) for bit 21 (CONNECT) to be set - Read notification status bits for IO descriptor readiness
- Write config words to
NOTIF_CONFIG(0xC000): bank shift, readback count - Read IO channel descriptors from fixed SRAM offsets
After ACEFACE completes, the DSP ring buffers and DMA engine are initialized.
Settings batch protocol
Mixer settings (monitor volume, preamp flags, mute, dim) use the shared SRAM window at BAR0 + 0x3800, not the ring buffer.
Sequence counter handshake
The protocol uses two sequence counters for synchronization:
| Register | Offset | Direction |
|---|---|---|
MIXER_SEQ_WR | 0x3808 | Host writes, DSP reads |
MIXER_SEQ_RD | 0x380C | DSP writes, host reads |
Write procedure
- Wait for DSP idle: Poll
SEQ_RDuntil it equalsSEQ_WR(DSP has processed previous batch) - Write all settings: Write paired (value, mask) words to all 38 setting registers
- Bump sequence counter: Increment
SEQ_WRand write it - Wait for acknowledgment: Poll
SEQ_RDuntil it matches the newSEQ_WRvalue
Why batch writes matter
The DSP processes all settings as an atomic batch when the sequence counter changes. Writing individual settings with per-setting sequence bumps crashes the DSP. All 38 settings must be cached and written together with a single SEQ_WR increment.
A clock write with source value 0x0C is used to activate DSP settings processing.
Setting value encoding
Each setting occupies 8 bytes (2 x u32) with paired value/mask encoding:
wordA = (changed_mask[15:0] << 16) | value[15:0]
wordB = (changed_mask[31:16] << 16) | value[31:16]
If the mask bits are zero, the DSP ignores the corresponding value bits. This allows selective field updates within a setting without disturbing other fields.
Word order
The DSP's interpretation of wordA vs wordB varies per setting. During the batch write, both words are always written for every setting. The upper 16 bits of each word are a changed-bits mask — the DSP uses the mask to decide which fields to process. Writing only one word for a range of settings can crash or freeze the DSP.
Plugin chain activation
The plugin chain is a sequence of 1,317 ring buffer commands captured from the Windows UAD driver that configures the DSP processing graph. It includes seven command types:
SRAM_CFGcommands — write routing configuration to SRAMROUTINGcommands — configure audio routing connectionsMODULE_ACTIVATEcommands — start DSP processing modulesBUS_COEFFcommands — set initial bus coefficient valuesSYNCHcommands — synchronize module statesDMA_REFcommands — configure DMA referencesWAKEcommands — wake DSP modules
These commands are sent in batches of 64 to DSP 0, with the ring reprogrammed between batches. The no_plugins module parameter controls whether this sequence is sent (default: disabled).
The plugin chain is currently disabled because replaying the Windows snapshot corrupts the DSP's capture routing state: it sets setting[24] = 0x20, which clobbers capture channel routing. Basic audio routing — playback, capture, monitor, and preamp control — works without plugin chain activation because the DSP firmware initializes default routing on cold boot.
The plugin chain is the foundation for future UAD DSP plugin (UADx) support. Rather than replaying the Windows snapshot, it will need to be reconstructed from the macOS init sequence to avoid the capture routing conflict.
Routing table loading
Audio routing (which inputs map to which outputs, DMA channel assignments) is configured via sendBlock with routing table data. The driver maintains static routing tables for each direction:
- Record routing: 22 channels — maps ADC/mixer outputs to DMA capture channels
- Playback routing: 24 channels — maps DMA playback channels to DAC/mixer inputs
Routing tables are loaded after DSP connect and before transport start.
Further reading
- Register Map — complete BAR0 register layout
- Architecture Overview — system diagram and hardware write paths