ring.h — Fixed-Size Ring Buffer
Introduction
ring.h provides xRingBuffer, a fixed-capacity circular buffer that never reallocates. It is ideal for bounded producer-consumer scenarios where a fixed memory budget is required. The capacity is rounded up to the next power of two internally, enabling bitmask indexing instead of expensive modulo operations.
Design Philosophy
-
Fixed Capacity, Zero Reallocation — Once created, the ring buffer never grows. Writes that exceed capacity return
xErrno_NoMemory. This makes memory usage predictable and avoids allocation latency spikes. -
Power-of-Two Masking — The internal capacity is always a power of two. Index computation uses
head & maskinstead ofhead % cap, which is significantly faster on most architectures. -
Monotonic Cursors —
head(write) andtail(read) grow monotonically and never wrap. The actual array index is computed via bitmask. This simplifies the full/empty distinction:head - tailgives the exact readable byte count. -
Single Allocation — Like
xBuffer, the header and data area are allocated together using a flexible array member. -
Scatter-Gather I/O — The ring buffer provides
ReadIov/WriteIovhelpers that filliovecarrays for efficientreadv()/writev()syscalls, handling the wrap-around transparently.
Architecture
graph LR
PRODUCER["Producer"] -->|"xRingBufferWrite"| RB["xRingBuffer<br/>(fixed capacity)"]
RB -->|"xRingBufferRead"| CONSUMER["Consumer"]
RB -->|"xRingBufferReadIov"| IOV1["iovec[2]"] -->|"writev()"| FD1["fd"]
FD2["fd"] -->|"readv()"| IOV2["iovec[2]"] -->|"xRingBufferWriteIov"| RB
style RB fill:#f5a623,color:#fff
Implementation Details
Memory Layout
Single malloc() allocation:
┌───────────────────────┬──────────────────────────────────────┐
│ xRingBuffer_ header │ data[cap] (flexible array member) │
│ cap, mask, head, tail│ │
└───────────────────────┴──────────────────────────────────────┘
Circular data layout (cap=8, mask=7):
tail & mask head & mask
↓ ↓
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ │ │ R │ R │ R │ W │ │ │
└───┴───┴───┴───┴───┴───┴───┴───┘
0 1 2 3 4 5 6 7
R = readable data (tail..head)
W = next write position
Internal Structure
XDEF_STRUCT(xRingBuffer_) {
size_t cap; // Capacity (power of two)
size_t mask; // cap - 1 (for bitmask indexing)
size_t head; // Write cursor (monotonic)
size_t tail; // Read cursor (monotonic)
char data[];// Flexible array member
};
Power-of-Two Rounding
static size_t next_pow2(size_t v) {
if (v < 16) v = 16;
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
// v |= v >> 32; (on 64-bit)
return v + 1;
}
This ensures cap is always a power of two, so mask = cap - 1 produces a valid bitmask. For example, cap = 8 → mask = 0b111.
Bitmask Indexing
Instead of:
size_t idx = head % cap; // Expensive division
The ring buffer uses:
size_t idx = head & mask; // Single AND instruction
This works because cap is a power of two: x % (2^n) == x & (2^n - 1).
Wrap-Around Write
flowchart TD
WRITE["xRingBufferWrite(rb, data, len)"]
CHECK{"len <= writable?"}
FAIL["Return xErrno_NoMemory"]
POS["pos = head & mask"]
FIRST["first = cap - pos"]
WRAP{"len <= first?"}
SINGLE["memcpy(data+pos, src, len)"]
SPLIT["memcpy(data+pos, src, first)<br/>memcpy(data, src+first, len-first)"]
ADVANCE["head += len"]
WRITE --> CHECK
CHECK -->|No| FAIL
CHECK -->|Yes| POS --> FIRST --> WRAP
WRAP -->|Yes| SINGLE --> ADVANCE
WRAP -->|No| SPLIT --> ADVANCE
style FAIL fill:#e74c3c,color:#fff
style ADVANCE fill:#50b86c,color:#fff
Operations and Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
xRingBufferWrite | O(n) | Up to 2 memcpy calls |
xRingBufferRead | O(n) | Up to 2 memcpy calls |
xRingBufferPeek | O(n) | Like Read but doesn't advance tail |
xRingBufferDiscard | O(1) | Just advances tail |
xRingBufferLen | O(1) | head - tail |
xRingBufferReadFd | O(1) | Single readv() syscall |
xRingBufferWriteFd | O(1) | Single writev() syscall |
API Reference
Lifecycle
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferCreate | xRingBuffer xRingBufferCreate(size_t min_cap) | Create a ring buffer. Capacity rounded up to power of 2. | Not thread-safe |
xRingBufferDestroy | void xRingBufferDestroy(xRingBuffer rb) | Free the ring buffer. NULL is a no-op. | Not thread-safe |
xRingBufferReset | void xRingBufferReset(xRingBuffer rb) | Discard all data, keep memory. | Not thread-safe |
Query
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferLen | size_t xRingBufferLen(xRingBuffer rb) | Readable bytes. | Not thread-safe |
xRingBufferCap | size_t xRingBufferCap(xRingBuffer rb) | Total capacity. | Not thread-safe |
xRingBufferWritable | size_t xRingBufferWritable(xRingBuffer rb) | Writable bytes. | Not thread-safe |
xRingBufferEmpty | bool xRingBufferEmpty(xRingBuffer rb) | True if no readable data. | Not thread-safe |
xRingBufferFull | bool xRingBufferFull(xRingBuffer rb) | True if no writable space. | Not thread-safe |
Write
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferWrite | xErrno xRingBufferWrite(xRingBuffer rb, const void *data, size_t len) | Write bytes. Returns xErrno_NoMemory if full. | Not thread-safe |
Read
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferRead | size_t xRingBufferRead(xRingBuffer rb, void *out, size_t len) | Read and consume bytes. Returns actual count. | Not thread-safe |
xRingBufferPeek | size_t xRingBufferPeek(xRingBuffer rb, void *out, size_t len) | Read without consuming. | Not thread-safe |
xRingBufferDiscard | size_t xRingBufferDiscard(xRingBuffer rb, size_t n) | Discard bytes without copying. | Not thread-safe |
I/O Helpers
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferReadIov | int xRingBufferReadIov(xRingBuffer rb, struct iovec iov[2]) | Fill iovecs with readable regions (for writev). | Not thread-safe |
xRingBufferWriteIov | int xRingBufferWriteIov(xRingBuffer rb, struct iovec iov[2]) | Fill iovecs with writable regions (for readv). | Not thread-safe |
xRingBufferReadFd | ssize_t xRingBufferReadFd(xRingBuffer rb, int fd) | Read from fd using readv(). | Not thread-safe |
xRingBufferWriteFd | ssize_t xRingBufferWriteFd(xRingBuffer rb, int fd) | Write to fd using writev(). | Not thread-safe |
Usage Examples
Basic FIFO
#include <stdio.h>
#include <xbuf/ring.h>
int main(void) {
// Request 1000 bytes; actual capacity will be 1024 (next power of 2)
xRingBuffer rb = xRingBufferCreate(1000);
printf("Capacity: %zu\n", xRingBufferCap(rb)); // 1024
// Write data
const char *msg = "Hello, Ring!";
xRingBufferWrite(rb, msg, 12);
// Read data
char out[32];
size_t n = xRingBufferRead(rb, out, sizeof(out));
printf("Read %zu bytes: %.*s\n", n, (int)n, out);
xRingBufferDestroy(rb);
return 0;
}
Network Socket Buffer
#include <xbuf/ring.h>
void event_loop_handler(int sockfd) {
xRingBuffer rb = xRingBufferCreate(65536); // 64KB ring
// Read from socket into ring buffer
ssize_t n = xRingBufferReadFd(rb, sockfd);
if (n > 0) {
// Process data...
// Write processed data back
xRingBufferWriteFd(rb, sockfd);
}
xRingBufferDestroy(rb);
}
Use Cases
-
Fixed-Budget Network Buffers — When you need predictable memory usage per connection (e.g., 64KB per socket), the ring buffer provides a hard capacity limit.
-
Logging Ring Buffer — Capture the last N bytes of log output, automatically discarding old data when the buffer wraps.
-
Inter-Thread Communication — With external synchronization, a ring buffer can serve as a bounded channel between producer and consumer threads.
Best Practices
- Choose capacity carefully. The ring buffer never grows. If you write more than the capacity, the write fails. Size it for your worst-case scenario.
- Use scatter-gather I/O.
xRingBufferReadFd/WriteFdusereadv()/writev()to handle wrap-around in a single syscall, avoiding the need to linearize data. - Be aware of power-of-two rounding. Requesting 1000 bytes gives you 1024. Requesting 1025 gives you 2048. Plan accordingly.
- Check
xRingBufferWritable()before writing if you want to handle partial writes gracefully.
Comparison with Other Libraries
| Feature | xbuf ring.h | Linux kfifo | Boost circular_buffer | DPDK rte_ring |
|---|---|---|---|---|
| Capacity | Fixed, power-of-2 | Fixed, power-of-2 | Fixed, any size | Fixed, power-of-2 |
| Indexing | Bitmask | Bitmask | Modulo | Bitmask |
| Layout | FAM (single alloc) | Separate alloc | Heap array | Huge pages |
| Thread Safety | Not thread-safe | Single-producer/single-consumer | Not thread-safe | Multi-producer/multi-consumer |
| I/O Helpers | readv/writev | kfifo_to_user/kfifo_from_user | No | No (packet-oriented) |
| Language | C99 | C (kernel) | C++ | C |
Key Differentiator: xbuf's ring buffer combines the power-of-two bitmask optimization (like kfifo) with scatter-gather I/O helpers (readv/writev) in a single-allocation design. It's purpose-built for event-driven network programming where fixed memory budgets and efficient syscalls are essential.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (
-O2). Source:xbuf/ring_bench.cpp
| Benchmark | Size | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_Ring_WriteRead | 64 | 6.05 | 6.05 | 19.7 GiB/s |
BM_Ring_WriteRead | 256 | 16.8 | 16.8 | 28.4 GiB/s |
BM_Ring_WriteRead | 1,024 | 27.4 | 27.4 | 69.6 GiB/s |
BM_Ring_WriteRead | 4,096 | 99.2 | 99.2 | 76.9 GiB/s |
BM_Ring_Throughput | 4,096 | 225 | 225 | 17.0 GiB/s |
BM_Ring_Throughput | 16,384 | 806 | 806 | 18.9 GiB/s |
BM_Ring_Throughput | 65,536 | 3,198 | 3,198 | 19.1 GiB/s |
Key Observations:
- WriteRead (single write + read cycle) achieves up to ~77 GiB/s at 4KB chunks, demonstrating the efficiency of the bitmask-based wrap-around and
memcpyfor larger transfers. - Throughput (sustained writes until full) stabilizes at ~19 GiB/s regardless of capacity, showing consistent performance as the ring scales.
- The ring buffer's zero-overhead indexing (bitmask instead of modulo) keeps per-operation cost extremely low — just 6ns for a 64-byte write+read cycle.