xKit
Welcome to the xKit documentation. xKit is a collection of low-level C building blocks for event-driven, asynchronous programming on macOS and Linux. (Windows is on the roadmap but not a near-term priority).
- Designed and reviewed by Leo X.
- Coded by Codebuddy with claude-4.6-opus
Architecture Overview
graph TD
subgraph "Application Layer"
APP["User Application"]
end
subgraph "High-Level Modules"
XHTTP["xhttp<br/>HTTP Client & Server & WebSocket"]
XLOG["xlog<br/>Async Logging"]
end
subgraph "Networking Layer"
XNET["xnet<br/>URL / DNS / TLS Config / TCP"]
end
subgraph "Buffer Layer"
XBUF["xbuf<br/>Buffer Primitives"]
end
subgraph "Core Layer"
XBASE["xbase<br/>Core Primitives"]
end
APP --> XHTTP
APP --> XLOG
APP --> XNET
APP --> XBUF
APP --> XBASE
XHTTP --> XNET
XHTTP --> XBASE
XHTTP --> XBUF
XNET --> XBASE
XLOG --> XBASE
XBUF -->|"atomic.h"| XBASE
style XBASE fill:#50b86c,color:#fff
style XBUF fill:#4a90d9,color:#fff
style XNET fill:#e74c3c,color:#fff
style XHTTP fill:#f5a623,color:#fff
style XLOG fill:#9b59b6,color:#fff
Module Index
xbase — Core Primitives
The foundation of xKit. Provides event loop, timers, tasks, async sockets, memory management, and lock-free data structures.
| Sub-Module | Description |
|---|---|
| event.h | Cross-platform event loop — kqueue (macOS) / epoll (Linux) / poll (fallback) |
| timer.h | Monotonic timer with Push (thread-pool) and Poll (lock-free MPSC) fire modes |
| task.h | N:M task model — lightweight tasks multiplexed onto a thread pool |
| socket.h | Async socket abstraction with idle-timeout support |
| memory.h | Reference-counted allocation with vtable-driven lifecycle |
| error.h | Unified error codes and human-readable messages |
| heap.h | Min-heap with index tracking (used by timer subsystem) |
| mpsc.h | Lock-free multi-producer / single-consumer queue |
| atomic.h | Compiler-portable atomic operations (GCC/Clang builtins) |
| log.h | Per-thread callback-based logging with optional backtrace |
| backtrace.h | Platform-adaptive stack trace (libunwind > execinfo > stub) |
| time.h | Time utilities: xMonoMs() (monotonic) and xWallMs() (wall-clock) |
xbuf — Buffer Primitives
Three buffer types for different I/O patterns — linear, ring, and block-chain.
| Sub-Module | Description |
|---|---|
| buf.h | Linear auto-growing byte buffer with 2× expansion |
| ring.h | Fixed-size ring buffer with power-of-2 mask indexing |
| io.h | Reference-counted block-chain I/O buffer with zero-copy split/cut |
xnet — Networking Primitives
Shared networking utilities: URL parser, async DNS resolver, and TLS configuration types used by higher-level modules.
| Sub-Module | Description |
|---|---|
| url.h | Lightweight URL parser with zero-copy component extraction |
| dns.h | Async DNS resolution via thread-pool offload |
| tls.h | Shared TLS configuration types (client & server) |
| tcp.h | Async TCP connection, connector & listener with optional TLS |
xhttp — Async HTTP Client & Server & WebSocket
Full-featured async HTTP framework: libcurl-powered client with SSE streaming, event-driven server with HTTP/1.1 & HTTP/2 (h2c), TLS support (OpenSSL / mbedTLS), and RFC 6455 WebSocket (server & client).
| Sub-Module | Description |
|---|---|
| client.h | Async HTTP client (GET / POST / PUT / DELETE / PATCH / HEAD) |
| sse.c | SSE streaming client with W3C-compliant event parsing |
| server.h | Event-driven HTTP server with HTTP/1.1 and HTTP/2 (h2c) |
| ws.h | RFC 6455 WebSocket server with handler-initiated upgrade |
| ws.h | RFC 6455 WebSocket client with async connect |
| transport.h | Pluggable TLS transport layer (OpenSSL / mbedTLS / plain) |
xlog — Async Logging
High-performance async logger with MPSC queue, three flush modes, and file rotation.
| Sub-Module | Description |
|---|---|
| logger.h | Async logger with Timer / Notify / Mixed modes and XLOG_* macros |
bench — End-to-End Benchmarks
End-to-end benchmark results comparing xKit against other frameworks in real-world scenarios.
| Benchmark | Description |
|---|---|
| HTTP/1.1 Server | xKit single-threaded HTTP/1.1 server vs Go net/http — GET/POST throughput and latency |
| HTTP/2 Server | xKit single-threaded HTTP/2 (h2c) server vs Go net/http h2c — GET/POST throughput and latency |
| HTTPS Server | xKit single-threaded HTTPS (TLS 1.3) server vs Go net/http — GET/POST throughput and latency |
Quick Navigation Guide
By Use Case
| I want to... | Start here |
|---|---|
| Build an event-driven server | xbase/event.h → xbase/socket.h |
| Schedule timers | xbase/timer.h |
| Run tasks on a thread pool | xbase/task.h |
| Make async HTTP requests | xhttp/client.h |
| Stream LLM API responses (SSE) | xhttp/sse.c |
| Build an HTTP server | xhttp/server.h |
| Add WebSocket server | xhttp/ws.h |
| Connect as WebSocket client | xhttp/ws.h |
| Parse a URL | xnet/url.h |
| Resolve DNS asynchronously | xnet/dns.h |
| Make async TCP connections | xnet/tcp.h |
| Build a TCP server | xnet/tcp.h |
| Configure TLS | xnet/tls.h |
| Enable TLS (HTTPS) | xhttp/transport.h |
| Add async logging | xlog/logger.h |
| Manage object lifecycles | xbase/memory.h |
| Choose the right buffer type | xbuf overview |
| Build a lock-free producer/consumer pipeline | xbase/mpsc.h |
| See micro-benchmark results | Each module doc has a Benchmark section (e.g. mpsc.h) |
| See HTTP server benchmarks | HTTP/1.1 · HTTP/2 · HTTPS |
By Dependency Level
Level 0 (no deps) : atomic.h, error.h, time.h
Level 1 (atomic only) : heap.h, mpsc.h
Level 2 (Level 0-1) : memory.h, log.h, backtrace.h, buf.h, ring.h
Level 3 (Level 0-2) : event.h, io.h, url.h, tls.h
Level 4 (event loop) : timer.h, task.h, socket.h, dns.h, tcp.h, logger.h, client.h, server.h, ws.h
Module Dependency Graph
graph BT
subgraph "Level 0"
ATOMIC["atomic.h"]
ERROR["error.h"]
TIME["time.h"]
end
subgraph "Level 1"
HEAP["heap.h"]
MPSC["mpsc.h"]
end
subgraph "Level 2"
MEMORY["memory.h"]
LOG["log.h"]
BT_["backtrace.h"]
BUF["buf.h"]
RING["ring.h"]
end
subgraph "Level 3"
EVENT["event.h"]
IO["io.h"]
URL["url.h"]
TLS_CONF["tls.h"]
end
subgraph "Level 4"
TIMER["timer.h"]
TASK["task.h"]
SOCKET["socket.h"]
DNS["dns.h"]
TCP["tcp.h"]
LOGGER["logger.h"]
CLIENT["client.h"]
SERVER["server.h"]
WS["ws.h"]
end
HEAP --> ATOMIC
MPSC --> ATOMIC
MEMORY --> ERROR
LOG --> BT_
IO --> ATOMIC
IO --> BUF
EVENT --> HEAP
EVENT --> MPSC
EVENT --> TIME
TIMER --> EVENT
TASK --> EVENT
SOCKET --> EVENT
DNS --> EVENT
TCP --> EVENT
TCP --> DNS
TCP --> SOCKET
TCP --> TLS_CONF
LOGGER --> EVENT
LOGGER --> MPSC
LOGGER --> LOG
CLIENT --> EVENT
CLIENT --> BUF
CLIENT --> URL
CLIENT --> DNS
CLIENT --> TLS_CONF
SERVER --> SOCKET
SERVER --> BUF
SERVER --> TLS_CONF
WS --> SERVER
WS --> URL
style EVENT fill:#50b86c,color:#fff
style URL fill:#e74c3c,color:#fff
style DNS fill:#e74c3c,color:#fff
style TCP fill:#e74c3c,color:#fff
style TLS_CONF fill:#e74c3c,color:#fff
style CLIENT fill:#f5a623,color:#fff
style SERVER fill:#f5a623,color:#fff
style WS fill:#f5a623,color:#fff
style LOGGER fill:#9b59b6,color:#fff
Build & Test
# Build
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build --parallel
# Test
ctest --test-dir build --output-on-failure --parallel 4
See the project README for full build instructions, prerequisites, and container-based Linux testing.
Benchmark
Micro-benchmark results are included in each module's documentation page (see the Benchmark section at the bottom of each page, e.g. mpsc.h, buf.h).
End-to-end benchmarks:
| Benchmark | Description |
|---|---|
| HTTP/1.1 Server | xKit vs Go net/http — 152K req/s single-threaded, 15–60% faster across all scenarios |
| HTTP/2 Server | xKit vs Go h2c — single-threaded HTTP/2 (h2c) throughput comparison |
| HTTPS Server | xKit vs Go HTTPS — single-threaded TLS 1.3 throughput comparison |
License
MIT © 2025-present Leo X. and xKit contributors
Modules
xKit is organized into five modules, layered from low-level core primitives up to high-level async networking.
┌─────────────────────────────────────────────┐
│ Application Layer │
├──────────────────────┬──────────────────────┤
│ xhttp │ xlog │
│ HTTP Client/Server │ Async Logging │
│ WebSocket │ │
├──────────────────────┴──────────────────────┤
│ xnet — URL / DNS / TCP / TLS Config │
├─────────────────────────────────────────────┤
│ xbuf — Linear / Ring / Block-Chain Buffer │
├─────────────────────────────────────────────┤
│ xbase — Event Loop / Timer / Task / │
│ Memory / Atomic / MPSC Queue │
└─────────────────────────────────────────────┘
Overview
| Module | Description |
|---|---|
| xbase | Core primitives — event loop, timers, tasks, async sockets, memory, lock-free data structures |
| xbuf | Buffer primitives — linear, ring, and block-chain I/O buffers |
| xnet | Networking primitives — URL parser, async DNS resolver, TCP, shared TLS configuration types |
| xhttp | Async HTTP client & server — libcurl multi-socket client with SSE streaming, HTTP/1.1 & HTTP/2 async server with TLS, WebSocket server & client |
| xlog | Async logging — MPSC queue, timer/pipe flush, log rotation |
Dependency Order
Level 0 (no deps) : atomic.h, error.h, time.h
Level 1 (atomic only) : heap.h, mpsc.h
Level 2 (Level 0-1) : memory.h, log.h, backtrace.h, buf.h, ring.h
Level 3 (Level 0-2) : event.h, io.h, url.h, tls.h
Level 4 (event loop) : timer.h, task.h, socket.h, dns.h, tcp.h, logger.h, client.h, server.h, ws.h
xbase — Event-Driven Async Foundation
Introduction
xbase is the foundational module of xKit, providing the core primitives for building event-driven, asynchronous C applications on macOS and Linux. It delivers a cross-platform event loop, monotonic timers, an N:M task model (thread pool), async sockets, reference-counted memory management, lock-free data structures, and essential utilities — all in a minimal, zero-dependency C99 package.
xbase is designed to be the "kernel" that higher-level xKit modules (xbuf, xhttp, xlog) build upon. Every I/O-bound or timer-driven feature in xKit ultimately relies on xbase's event loop and concurrency primitives.
Design Philosophy
- Edge-Triggered by Default — The event loop operates in edge-triggered mode across all backends (kqueue, epoll, poll), encouraging callers to drain file descriptors completely. This yields higher throughput and fewer spurious wakeups than level-triggered designs.
- Layered Abstraction — Low-level primitives (atomic, mpsc, heap) are composed into mid-level services (timer, task), which are in turn integrated into the high-level event loop. Each layer is independently usable.
- Zero Allocation in the Hot Path — Data structures like the MPSC queue and min-heap are designed to avoid dynamic allocation during normal operation. Memory is pre-allocated or embedded in user structs.
- Thread-Safety Where It Matters — APIs expected to be called cross-thread (e.g., xEventWake, xTimerSubmitAfter, xMpscPush) are explicitly designed to be thread-safe. Single-threaded APIs are documented as such.
- vtable-Driven Lifecycle — The memory module uses a virtual table pattern (ctor/dtor/retain/release) to provide reference-counted object management in pure C, inspired by Objective-C's retain/release model.
- Platform Adaptation at Build Time — Platform-specific code (kqueue vs. epoll, libunwind vs. execinfo) is selected via compile-time macros, keeping runtime overhead at zero.
Architecture
graph TD
subgraph "High-Level Services"
EVENT["event.h<br/>Event Loop"]
TIMER["timer.h<br/>Monotonic Timer"]
TASK["task.h<br/>N:M Task Model"]
SOCKET["socket.h<br/>Async Socket"]
end
subgraph "Infrastructure"
MEMORY["memory.h<br/>Ref-Counted Memory"]
LOG["log.h<br/>Thread-Local Log"]
BACKTRACE["backtrace.h<br/>Stack Backtrace"]
ERROR["error.h<br/>Error Codes"]
TIME["time.h<br/>Time Utilities"]
end
subgraph "Data Structures & Concurrency"
HEAP["heap.h<br/>Min-Heap"]
MPSC["mpsc.h<br/>Lock-Free MPSC Queue"]
ATOMIC["atomic.h<br/>Atomic Operations"]
end
EVENT -->|"registers timers"| TIMER
EVENT -->|"offloads work"| TASK
EVENT -->|"wraps fd"| SOCKET
SOCKET -->|"monitors I/O"| EVENT
SOCKET -->|"idle timeout"| EVENT
TIMER -->|"schedules entries"| HEAP
TIMER -->|"poll-mode queue"| MPSC
TIMER -->|"push-mode dispatch"| TASK
TIMER -->|"reads clock"| TIME
MPSC -->|"CAS operations"| ATOMIC
MEMORY -->|"atomic refcount"| ATOMIC
LOG -->|"fatal backtrace"| BACKTRACE
LOG -->|"error formatting"| ERROR
EVENT -->|"reads clock"| TIME
style EVENT fill:#4a90d9,color:#fff
style TIMER fill:#4a90d9,color:#fff
style TASK fill:#4a90d9,color:#fff
style SOCKET fill:#4a90d9,color:#fff
style MEMORY fill:#50b86c,color:#fff
style LOG fill:#50b86c,color:#fff
style BACKTRACE fill:#50b86c,color:#fff
style ERROR fill:#50b86c,color:#fff
style TIME fill:#50b86c,color:#fff
style HEAP fill:#f5a623,color:#fff
style MPSC fill:#f5a623,color:#fff
style ATOMIC fill:#f5a623,color:#fff
Sub-Module Overview
| Header | Document | Description |
|---|---|---|
| event.h | event.md | Cross-platform event loop (edge-triggered) — kqueue / epoll / poll backends with built-in timer and thread-pool integration |
| timer.h | timer.md | Monotonic timer with push (thread-pool) and poll (lock-free MPSC) fire modes |
| task.h | task.md | N:M task model — lightweight tasks multiplexed onto a configurable thread pool |
| socket.h | socket.md | Async socket abstraction with idle-timeout support over xEventLoop |
| memory.h | memory.md | Reference-counted allocation with vtable-driven lifecycle (ctor/dtor/retain/release) |
| log.h | log.md | Per-thread callback-based logging with optional backtrace on fatal |
| backtrace.h | backtrace.md | Platform-adaptive stack trace capture (libunwind > execinfo > stub) |
| error.h | error.md | Unified error codes (xErrno) and human-readable messages |
| heap.h | heap.md | Generic min-heap with O(log n) insert/remove, used internally by the timer subsystem |
| mpsc.h | mpsc.md | Lock-free multi-producer / single-consumer intrusive queue |
| atomic.h | atomic.md | Compiler-portable atomic operations (GCC/Clang __atomic builtins) |
| io.h | io.md | Abstract I/O interfaces (Reader, Writer, Seeker, Closer) with convenience helpers (xReadFull, xReadAll, xWritev, etc.) |
| time.h | — | Time utilities: xMonoMs() (monotonic) and xWallMs() (wall-clock) in milliseconds |
How to Choose
| I need to… | Use |
|---|---|
| React to I/O readiness on file descriptors | event.h — register fds and get edge-triggered callbacks |
| Schedule delayed or periodic work | timer.h — standalone timer, or use xEventLoopTimerAfter() for event-loop-integrated timers |
| Run CPU-bound work off the main thread | task.h — submit to a thread pool, optionally collect results |
| Manage non-blocking TCP/UDP connections | socket.h — wraps socket + event loop + idle timeout |
| Allocate objects with automatic cleanup | memory.h — XMALLOC(T) + xRetain/xRelease |
| Report errors from library internals | log.h — thread-local callback, or stderr fallback |
| Capture a stack trace for debugging | backtrace.h — xBacktrace() fills a buffer |
| Handle error codes uniformly | error.h — xErrno enum + xstrerror() |
| Build a priority queue | heap.h — generic min-heap with index tracking |
| Pass messages between threads lock-free | mpsc.h — intrusive MPSC queue |
| Perform atomic read-modify-write | atomic.h — macro wrappers over compiler builtins |
| Get current time in milliseconds | time.h — xMonoMs() for elapsed time, xWallMs() for wall-clock |
| Read/write through abstract I/O interfaces | io.h — xReader / xWriter + helpers like xReadFull, xReadAll |
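The time.h helpers map onto clock_gettime(2); a sketch of what such millisecond helpers typically look like (illustrative, not the actual xbase implementation — mono_ms and wall_ms are hypothetical names):

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Monotonic millisecond clock, similar in spirit to xMonoMs().
 * CLOCK_MONOTONIC is unaffected by wall-clock adjustments (NTP, manual set),
 * which makes it the right choice for timeouts and elapsed-time measurement. */
static uint64_t mono_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)(ts.tv_nsec / 1000000L);
}

/* Wall-clock variant, analogous to xWallMs(); use for timestamps, not timeouts. */
static uint64_t wall_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)(ts.tv_nsec / 1000000L);
}
```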
Quick Start
A minimal example that creates an event loop, schedules a one-shot timer, and runs until the timer fires:
#include <stdio.h>
#include <xbase/event.h>
static void on_timer(void *arg) {
    printf("Timer fired!\n");
    xEventLoopStop((xEventLoop)arg);
}

int main(void) {
    // Create an event loop
    xEventLoop loop = xEventLoopCreate();
    if (!loop) return 1;

    // Schedule a timer to fire after 1 second
    xEventLoopTimerAfter(loop, on_timer, loop, 1000);

    // Run the event loop (blocks until xEventLoopStop is called)
    xEventLoopRun(loop);

    // Clean up
    xEventLoopDestroy(loop);
    return 0;
}
Compile with:
gcc -o example example.c -I/path/to/xkit -lxbase -lpthread
Relationship with Other Modules
graph LR
XBASE["xbase"]
XBUF["xbuf"]
XHTTP["xhttp"]
XLOG["xlog"]
XHTTP -->|"event loop + timer"| XBASE
XHTTP -->|"I/O buffers"| XBUF
XLOG -->|"event loop + MPSC queue"| XBASE
XBUF -->|"atomic.h"| XBASE
XNET["xnet"]
XNET -->|"event loop + thread pool + atomic"| XBASE
XHTTP -->|"URL + DNS + TLS config"| XNET
style XBASE fill:#4a90d9,color:#fff
style XBUF fill:#50b86c,color:#fff
style XHTTP fill:#f5a623,color:#fff
style XLOG fill:#e74c3c,color:#fff
style XNET fill:#e74c3c,color:#fff
- xbuf — Buffer module. xIOBuffer uses xbase's atomic.h for lock-free block pool management. xhttp uses both xbase and xbuf together.
- xhttp — The async HTTP client is built on top of xbase's event loop (xEventLoop) and timer infrastructure, and uses xbuf for response buffering.
- xnet — The networking primitives module. The async DNS resolver uses xbase's event loop for thread-pool offload (xEventLoopSubmit) and atomic.h for the cancellation flag.
- xlog — The async logger uses xbase's event loop for timer-based flushing and the MPSC queue for lock-free log message passing from application threads to the logger thread.
event.h — Cross-Platform Event Loop
Introduction
event.h provides a cross-platform, edge-triggered event loop abstraction for I/O multiplexing. It unifies three OS-specific backends — kqueue (macOS/BSD), epoll (Linux), and poll (POSIX fallback) — behind a single API. The event loop is the central coordination point in xbase: it monitors file descriptors for readiness, dispatches timer callbacks, offloads CPU-bound work to thread pools, and watches for POSIX signals — all from a single thread.
Design Philosophy
- Edge-Triggered Everywhere — All three backends operate in edge-triggered mode. kqueue uses EV_CLEAR, epoll uses EPOLLET, and poll emulates edge-triggered behavior by clearing the event mask after each notification (requiring the caller to re-arm via xEventMod()). This design encourages callers to drain fds completely, reducing spurious wakeups.
- Backend Selection at Compile Time — The backend is chosen via preprocessor macros (XK_HAS_KQUEUE, XK_HAS_EPOLL), with poll as the universal fallback. This means zero runtime dispatch overhead.
- Integrated Timer Heap — Rather than requiring a separate timer facility, the event loop embeds a min-heap of timer entries. xEventWait() automatically adjusts its timeout to fire the earliest timer, providing sub-millisecond timer resolution without a dedicated timer thread.
- Thread-Pool Offload — xEventLoopSubmit() bridges the event loop and the task system: CPU-bound work runs on a worker thread, and the completion callback is dispatched on the event loop thread via a lock-free MPSC queue + wake pipe, ensuring single-threaded callback semantics.
- Self-Pipe Trick for Signals — On the epoll and poll backends, signal delivery uses the self-pipe trick (a sigaction handler writes to a pipe) rather than signalfd, avoiding the fragile requirement of blocking signals in every thread. On kqueue, EVFILT_SIGNAL is used natively.
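Backend selection of this kind reduces to a few preprocessor lines. A simplified sketch using the macro names from the text (the real project selects whole backend .c files at build time rather than branching in one translation unit; xk_backend is a hypothetical helper):

```c
/* Sketch of compile-time backend selection — zero runtime dispatch cost. */
#if defined(XK_HAS_KQUEUE)
#  define XK_BACKEND_NAME "kqueue"   /* macOS / BSD: native edge via EV_CLEAR */
#elif defined(XK_HAS_EPOLL)
#  define XK_BACKEND_NAME "epoll"    /* Linux: native edge via EPOLLET */
#else
#  define XK_BACKEND_NAME "poll"     /* universal POSIX fallback, emulated edge */
#endif

static const char *xk_backend(void) { return XK_BACKEND_NAME; }
```

Because the choice happens in the preprocessor, the resulting binary contains exactly one backend and no branch is taken at runtime.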
Architecture
graph TD
subgraph "Event Loop (single thread)"
WAIT["xEventWait()"]
DISPATCH["Dispatch I/O callbacks"]
TIMERS["Fire expired timers"]
DONE["Drain done-queue"]
SWEEP["Sweep deleted sources"]
end
subgraph "Backend (compile-time)"
KQ["kqueue"]
EP["epoll"]
PO["poll"]
end
subgraph "Cross-Thread"
WAKE["Wake Pipe"]
MPSC_Q["MPSC Done Queue"]
WORKER["Worker Thread Pool"]
end
WAIT --> KQ
WAIT --> EP
WAIT --> PO
KQ --> DISPATCH
EP --> DISPATCH
PO --> DISPATCH
DISPATCH --> TIMERS
TIMERS --> DONE
DONE --> SWEEP
WORKER -->|"push result"| MPSC_Q
MPSC_Q -->|"wake"| WAKE
WAKE -->|"drain"| DONE
style WAIT fill:#4a90d9,color:#fff
style DISPATCH fill:#4a90d9,color:#fff
style TIMERS fill:#f5a623,color:#fff
style DONE fill:#50b86c,color:#fff
Event Loop Lifecycle
sequenceDiagram
participant App
participant EL as xEventLoop
participant Backend as kqueue / epoll / poll
participant Timer as Timer Heap
App->>EL: xEventLoopCreate()
App->>EL: xEventAdd(fd, mask, callback)
App->>EL: xEventLoopTimerAfter(fn, 1000ms)
App->>EL: xEventLoopRun()
loop Main Loop
EL->>Timer: Check earliest deadline
Timer-->>EL: timeout = min(user_timeout, timer_deadline)
EL->>Backend: wait(timeout)
Backend-->>EL: ready events
EL->>App: callback(fd, mask)
EL->>Timer: Pop & fire expired timers
EL->>EL: Sweep deleted sources
end
App->>EL: xEventLoopStop()
App->>EL: xEventLoopDestroy()
Implementation Details
Backend Architecture
Each backend is implemented in a separate .c file that provides the full public API:
| File | Backend | Trigger Mode | Selection |
|---|---|---|---|
| event_kqueue.c | kqueue | EV_CLEAR (native edge) | #ifdef XK_HAS_KQUEUE |
| event_epoll.c | epoll | EPOLLET (native edge) | #ifdef XK_HAS_EPOLL |
| event_poll.c | poll(2) | Emulated edge (mask cleared after dispatch) | Fallback |
All backends share a common base structure (struct xEventLoop_) defined in event_private.h, which contains:
- A dynamic source array with deferred deletion (sweep after dispatch)
- A wake pipe (non-blocking) for cross-thread wakeup
- A min-heap for builtin timers (protected by the timer_mu mutex)
- A lock-free MPSC done-queue for offload completion callbacks
- Signal watch slots (up to XK_SIGNAL_MAX = 64)
Deferred Source Deletion
When xEventDel() is called during a callback dispatch, the source is marked deleted = 1 rather than freed immediately. After the dispatch batch completes, source_array_sweep() frees all deleted sources. This prevents use-after-free when multiple events reference the same source in a single xEventWait() call.
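The mark-then-sweep scheme can be sketched in plain C. The struct layouts below are illustrative stand-ins for the private structures, though source_array_sweep matches the function name used in the text:

```c
#include <stdlib.h>

/* Illustrative sketch of deferred source deletion — not event_private.h. */
typedef struct {
    int fd;
    int deleted;   /* marked during dispatch, freed only in the sweep */
} source;

typedef struct {
    source **items;
    size_t count;
} source_array;

/* What xEventDel()-like code does during a dispatch batch: mark only.
 * The source stays valid for any remaining events in the same batch. */
static void source_mark_deleted(source *s) { s->deleted = 1; }

/* After the batch completes: compact the array and free marked sources. */
static void source_array_sweep(source_array *a) {
    size_t kept = 0;
    for (size_t i = 0; i < a->count; i++) {
        if (a->items[i]->deleted) {
            free(a->items[i]);            /* safe: no callback can still see it */
        } else {
            a->items[kept++] = a->items[i];
        }
    }
    a->count = kept;
}
```

Deferring the free to a single sweep point is what makes it safe for one xEventWait() batch to deliver several events referencing the same source.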
Wake Pipe
A non-blocking pipe (wake_rfd / wake_wfd) is registered with the backend. xEventWake() writes a single byte to the write end; the event loop drains the read end and processes the done-queue. Multiple wakes before the next xEventWait() are coalesced (EAGAIN on a full pipe is treated as success).
Timer Integration
Builtin timers are stored in a min-heap inside the event loop. Before each xEventWait() call, the effective timeout is clamped to the earliest timer deadline. After I/O dispatch, expired timers are popped and fired. Timer operations (xEventLoopTimerAfter, xEventLoopTimerAt, xEventLoopTimerCancel) are thread-safe, protected by timer_mu.
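The timeout-clamping step described above can be sketched as a pure function (clamp_timeout is a hypothetical name for illustration, not part of event.h):

```c
#include <stdint.h>

/* Clamp a user-supplied wait timeout to the earliest timer deadline.
 * user_timeout_ms < 0 means "wait indefinitely" (as with poll(2)).
 * Returns the effective timeout in ms for the backend wait call. */
static int clamp_timeout(int user_timeout_ms, uint64_t now_ms,
                         uint64_t earliest_deadline_ms, int have_timer) {
    if (!have_timer) return user_timeout_ms;          /* nothing to clamp to */
    uint64_t until = earliest_deadline_ms > now_ms
                   ? earliest_deadline_ms - now_ms
                   : 0;                               /* already expired: poll */
    if (until > INT32_MAX) until = INT32_MAX;
    if (user_timeout_ms < 0 || (uint64_t)user_timeout_ms > until)
        return (int)until;                            /* wake in time to fire */
    return user_timeout_ms;
}
```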
Signal Handling
| Backend | Mechanism |
|---|---|
| kqueue | EVFILT_SIGNAL with EV_CLEAR — native kernel support |
| epoll | Self-pipe trick: sigaction handler writes to a per-signal pipe |
| poll | Self-pipe trick: same as epoll |
The self-pipe approach avoids signalfd's requirement to block signals in all threads, which is fragile in the presence of third-party libraries and test frameworks.
API Reference
Types
| Type | Description |
|---|---|
| xEventMask | Bitmask enum: xEvent_Read (1), xEvent_Write (2), xEvent_Timeout (4) |
| xEventFunc | void (*)(int fd, xEventMask mask, void *arg) — I/O callback |
| xEventTimerFunc | void (*)(void *arg) — Timer callback |
| xEventSignalFunc | void (*)(int signo, void *arg) — Signal callback |
| xEventDoneFunc | void (*)(void *arg, void *result) — Offload completion callback |
| xEventLoop | Opaque handle to an event loop |
| xEventSource | Opaque handle to a registered event source |
| xEventTimer | Opaque handle to a builtin timer |
Functions
Lifecycle
| Function | Signature | Thread Safety |
|---|---|---|
| xEventLoopCreate | xEventLoop xEventLoopCreate(void) | Not thread-safe |
| xEventLoopCreateWithGroup | xEventLoop xEventLoopCreateWithGroup(xTaskGroup group) | Not thread-safe |
| xEventLoopDestroy | void xEventLoopDestroy(xEventLoop loop) | Not thread-safe |
| xEventLoopRun | void xEventLoopRun(xEventLoop loop) | Not thread-safe (call from one thread) |
| xEventLoopStop | void xEventLoopStop(xEventLoop loop) | Thread-safe |
I/O Sources
| Function | Signature | Thread Safety |
|---|---|---|
| xEventAdd | xEventSource xEventAdd(xEventLoop loop, int fd, xEventMask mask, xEventFunc fn, void *arg) | Not thread-safe |
| xEventMod | xErrno xEventMod(xEventLoop loop, xEventSource src, xEventMask mask) | Not thread-safe |
| xEventDel | xErrno xEventDel(xEventLoop loop, xEventSource src) | Not thread-safe |
| xEventWait | int xEventWait(xEventLoop loop, int timeout_ms) | Not thread-safe |
Timers
| Function | Signature | Thread Safety |
|---|---|---|
| xEventLoopTimerAfter | xEventTimer xEventLoopTimerAfter(xEventLoop loop, xEventTimerFunc fn, void *arg, uint64_t delay_ms) | Thread-safe |
| xEventLoopTimerAt | xEventTimer xEventLoopTimerAt(xEventLoop loop, xEventTimerFunc fn, void *arg, uint64_t abs_ms) | Thread-safe |
| xEventLoopTimerCancel | xErrno xEventLoopTimerCancel(xEventLoop loop, xEventTimer timer) | Thread-safe |
Cross-Thread
| Function | Signature | Thread Safety |
|---|---|---|
| xEventWake | xErrno xEventWake(xEventLoop loop) | Thread-safe (signal-handler-safe) |
| xEventLoopSubmit | xErrno xEventLoopSubmit(xEventLoop loop, xTaskGroup group, xTaskFunc work_fn, xEventDoneFunc done_fn, void *arg) | Thread-safe |
Signal
| Function | Signature | Thread Safety |
|---|---|---|
| xEventLoopSignalWatch | xErrno xEventLoopSignalWatch(xEventLoop loop, int signo, xEventSignalFunc fn, void *arg) | Not thread-safe |
Deprecated
| Function | Signature | Replacement |
|---|---|---|
| xEventLoopNowMs | uint64_t xEventLoopNowMs(void) | xMonoMs() from <xbase/time.h> |
Usage Examples
Basic Event Loop with Timer
#include <stdio.h>
#include <xbase/event.h>
static void on_timer(void *arg) {
    printf("Timer fired!\n");
    xEventLoopStop((xEventLoop)arg);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    if (!loop) return 1;

    // Fire after 500ms
    xEventLoopTimerAfter(loop, on_timer, loop, 500);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}
Monitoring a File Descriptor
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <xbase/event.h>

static void on_readable(int fd, xEventMask mask, void *arg) {
    char buf[1024];
    ssize_t n;
    // Edge-triggered: drain until read() reports EAGAIN (or EOF)
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, stdout);
    }
    if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
        perror("read");
    }
    (void)mask;
    (void)arg;
}
int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Monitor stdin for readability
    xEventAdd(loop, STDIN_FILENO, xEvent_Read, on_readable, NULL);

    // Run for up to 10 seconds
    xEventLoopTimerAfter(loop, (xEventTimerFunc)xEventLoopStop, loop, 10000);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}
Offloading Work to a Thread Pool
#include <stdio.h>
#include <xbase/event.h>
static void *heavy_work(void *arg) {
    // Runs on a worker thread
    int *val = (int *)arg;
    *val *= 2;
    return val;
}

static void on_done(void *arg, void *result) {
    // Runs on the event loop thread
    int *val = (int *)result;
    printf("Result: %d\n", *val);
    (void)arg;
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    int value = 21;

    xEventLoopSubmit(loop, NULL, heavy_work, on_done, &value);

    // Run briefly to process the completion
    xEventLoopTimerAfter(loop, (xEventTimerFunc)xEventLoopStop, loop, 1000);
    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}
Use Cases
-
Network Servers — Register listening sockets and accepted connections with the event loop. Use edge-triggered callbacks to read/write data without blocking. Combine with
xSocketfor idle-timeout support. -
Timer-Driven State Machines — Use
xEventLoopTimerAfter()to schedule state transitions, retries, or heartbeat checks. The timer is integrated into the event loop, so no separate timer thread is needed. -
Hybrid I/O + CPU Workloads — Use
xEventLoopSubmit()to offload CPU-intensive parsing or compression to a thread pool, then process results on the event loop thread where I/O state is safely accessible.
Best Practices
- Always drain fds in edge-triggered mode. Read/write until EAGAIN in every callback. If you stop early and leave data buffered, you won't be notified again until new data arrives.
- Never block in callbacks. The event loop is single-threaded; a blocking call stalls all I/O and timer processing. Offload heavy work via xEventLoopSubmit().
- Use xEventLoopRun() for the main loop. It handles timer dispatch and stop-flag checking automatically. Only use xEventWait() directly if you need custom loop logic.
- Cancel timers you no longer need. Uncancelled timers hold memory until they fire. Use xEventLoopTimerCancel() to free them early.
- Be aware of the poll backend's edge emulation. On systems without kqueue or epoll, the poll backend clears the event mask after dispatch. You must call xEventMod() to re-arm.
Comparison with Other Libraries
| Feature | xbase event.h | libevent | libev | libuv |
|---|---|---|---|---|
| Trigger Mode | Edge-triggered only | Level (default), edge optional | Level + edge | Level-triggered |
| Backends | kqueue, epoll, poll | kqueue, epoll, poll, select, devpoll, IOCP | kqueue, epoll, poll, select, port | kqueue, epoll, poll, IOCP |
| Timer Integration | Built-in min-heap | Separate timer API | Built-in | Built-in |
| Thread Pool | Built-in (xEventLoopSubmit) | None (external) | None (external) | Built-in (uv_queue_work) |
| Signal Handling | Self-pipe / EVFILT_SIGNAL | evsignal | ev_signal | uv_signal |
| API Style | Opaque handles, C99 | Struct-based, C89 | Struct-based, C89 | Handle-based, C99 |
| Binary Size | ~15 KB | ~200 KB | ~50 KB | ~500 KB |
| Dependencies | None | None | None | None |
| Windows Support | Not yet | Yes (IOCP) | Yes (select) | Yes (IOCP) |
| Design Goal | Minimal building block | Full-featured framework | Minimal + performant | Cross-platform framework |
Key Differentiator: xbase's event loop is intentionally minimal — it provides the essential primitives (I/O, timers, signals, thread-pool offload) without buffered I/O, DNS resolution, or HTTP parsing. This makes it ideal as a foundation layer for higher-level libraries (like xhttp) rather than a standalone application framework.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2), kqueue backend. Source: xbase/event_bench.cpp
| Benchmark | Time (ns) | CPU (ns) | Iterations |
|---|---|---|---|
| BM_EventLoop_CreateDestroy | 2,663 | 2,663 | 264,113 |
| BM_EventLoop_WakeLatency | 854 | 854 | 814,901 |
| BM_EventLoop_PipeAddDel | 1,107 | 1,107 | 627,088 |
Key Observations:
- Create/Destroy takes ~2.7µs, reflecting the cost of kqueue fd creation and internal structure allocation. Acceptable for long-lived event loops.
- Wake latency is ~854ns per wake+wait cycle, demonstrating efficient cross-thread notification via the internal wake mechanism.
- Add/Del cycle (register + unregister a pipe fd) takes ~1.1µs, showing low overhead for dynamic fd management — important for short-lived connections.
timer.h — Monotonic Timer
Introduction
timer.h provides a standalone monotonic timer that schedules callbacks to fire after a delay or at an absolute time. It supports two fire modes — Push mode (dispatch to a thread pool) and Poll mode (enqueue to a lock-free MPSC queue for caller-driven execution) — making it suitable for both multi-threaded and single-threaded architectures.
Note: For timers integrated directly into an event loop, see xEventLoopTimerAfter() / xEventLoopTimerAt() in event.h. The standalone timer.h is useful when you need timers without an event loop, or when you want explicit control over which thread executes the callbacks.
Design Philosophy
- Dual Fire Modes — Push mode hands expired callbacks to a thread pool for concurrent execution; Poll mode queues them for the caller to drain synchronously. This lets latency-sensitive code (e.g., an event loop) avoid thread-switch overhead by polling, while background services can use push mode for simplicity.
- Dedicated Timer Thread — Each xTimer instance spawns one background thread that sleeps on a condition variable, waking only when the earliest deadline arrives or a new entry is submitted. This avoids busy-waiting and keeps CPU usage near zero when idle.
- Min-Heap for O(log n) Scheduling — Timer entries are stored in a min-heap ordered by deadline. Insert, cancel, and fire-next are all O(log n). The heap is provided by heap.h.
- Lock-Free Poll Queue — In poll mode, expired entries are pushed onto an intrusive MPSC queue (mpsc.h) without holding the mutex, minimizing contention between the timer thread and the polling thread.
Architecture
sequenceDiagram
participant App
participant Timer as xTimer
participant Thread as Timer Thread
participant Heap as Min-Heap
participant Queue as MPSC Queue
App->>Timer: xTimerCreate(group)
Timer->>Thread: spawn
App->>Timer: xTimerSubmitAfter(fn, 1000ms)
Timer->>Heap: push(entry)
Timer->>Thread: signal(cond)
Thread->>Heap: peek → deadline
Note over Thread: sleep until deadline
Thread->>Heap: pop(entry)
alt Push Mode
Thread->>App: xTaskSubmit(fn)
else Poll Mode
Thread->>Queue: xMpscPush(entry)
App->>Queue: xTimerPoll()
Queue-->>App: callback(arg)
end
Implementation Details
Internal Structure
struct xTimerTask_ {
xMpsc node; // Intrusive MPSC node (poll mode)
uint64_t deadline; // Absolute expiry time (CLOCK_MONOTONIC, ms)
xTimerFunc fn; // User callback
void *arg; // User argument
size_t heap_idx; // Position in min-heap (TIMER_INVALID_IDX when not in heap)
int cancelled; // Set to 1 under mutex before removal
};
struct xTimer_ {
xHeap heap; // Min-heap ordered by deadline
xTaskGroup group; // Non-NULL → push mode; NULL → poll mode
xMpsc *mq_head; // Poll-mode MPSC queue head
xMpsc *mq_tail; // Poll-mode MPSC queue tail
pthread_t thread; // Background timer thread
pthread_mutex_t mu; // Protects heap and stopped flag
pthread_cond_t cond; // Wakes timer thread on new entry or stop
int stopped; // Shutdown flag
};
Timer Thread Loop
The background thread follows this algorithm:
- Wait — If the heap is empty, block on pthread_cond_wait().
- Check top — Peek at the minimum-deadline entry.
- Fire or sleep — If deadline ≤ now, pop and fire. Otherwise, pthread_cond_timedwait() until the deadline or a new signal.
- Repeat — Loop until stopped is set.
When a new entry is submitted, pthread_cond_signal() wakes the thread so it can re-evaluate whether the new entry has an earlier deadline.
Push vs. Poll Mode
graph LR
subgraph "Push Mode (group != NULL)"
HEAP_P["Min-Heap"] -->|"pop expired"| FIRE_P["fire()"]
FIRE_P -->|"xTaskSubmit"| POOL["Thread Pool"]
POOL -->|"execute"| CB_P["callback(arg)"]
end
subgraph "Poll Mode (group == NULL)"
HEAP_Q["Min-Heap"] -->|"pop expired"| FIRE_Q["fire()"]
FIRE_Q -->|"xMpscPush"| MPSC["MPSC Queue"]
MPSC -->|"xTimerPoll()"| CB_Q["callback(arg)"]
end
style POOL fill:#4a90d9,color:#fff
style MPSC fill:#f5a623,color:#fff
Cancellation
xTimerCancel() acquires the mutex, checks if the entry is still in the heap (not already fired or cancelled), removes it via xHeapRemove(), marks it cancelled, and frees the memory. If the entry has already fired, xErrno_Cancelled is returned.
Memory Ownership
- Push mode: The timer thread transfers ownership of the xTimerTask_ to the worker thread via xTaskSubmit(). The worker frees it after executing the callback.
- Poll mode: The timer thread pushes the entry to the MPSC queue. xTimerPoll() pops and frees each entry after executing its callback.
- Cancellation: The caller frees the entry immediately.
- Destroy: Remaining heap entries and poll-queue entries are freed without firing.
API Reference
Types
| Type | Description |
|---|---|
xTimerFunc | void (*)(void *arg) — Timer callback signature |
xTimer | Opaque handle to a timer instance |
xTimerTask | Opaque handle to a submitted timer entry |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xTimerCreate | xTimer xTimerCreate(xTaskGroup g) | Create a timer. g != NULL → push mode, g == NULL → poll mode. | Not thread-safe |
xTimerDestroy | void xTimerDestroy(xTimer t) | Stop the timer thread and free all resources. Pending entries are discarded. | Not thread-safe |
xTimerSubmitAfter | xTimerTask xTimerSubmitAfter(xTimer t, xTimerFunc fn, void *arg, uint64_t delay_ms) | Schedule a callback after a relative delay. | Thread-safe |
xTimerSubmitAt | xTimerTask xTimerSubmitAt(xTimer t, xTimerFunc fn, void *arg, uint64_t abs_ms) | Schedule a callback at an absolute monotonic time. | Thread-safe |
xTimerCancel | xErrno xTimerCancel(xTimer t, xTimerTask task) | Cancel a pending entry. Returns xErrno_Ok if cancelled, xErrno_Cancelled if already fired. | Thread-safe |
xTimerPoll | int xTimerPoll(xTimer t) | Execute all due callbacks (poll mode only). Returns count. No-op in push mode. | Not thread-safe |
xTimerNowMs | uint64_t xTimerNowMs(void) | Deprecated. Use xMonoMs() from <xbase/time.h>. | Thread-safe |
Usage Examples
Push Mode (Thread Pool Dispatch)
#include <stdio.h>
#include <xbase/timer.h>
#include <xbase/task.h>
#include <unistd.h>
static void on_timeout(void *arg) {
printf("Timer fired on worker thread! arg=%p\n", arg);
}
int main(void) {
xTaskGroup group = xTaskGroupCreate(NULL);
xTimer timer = xTimerCreate(group);
// Fire after 500ms on a worker thread
xTimerSubmitAfter(timer, on_timeout, NULL, 500);
sleep(1); // Wait for timer to fire
xTimerDestroy(timer);
xTaskGroupDestroy(group);
return 0;
}
Poll Mode (Event Loop Integration)
#include <stdio.h>
#include <xbase/timer.h>
#include <xbase/time.h>
static void on_timeout(void *arg) {
int *count = (int *)arg;
printf("Timer #%d fired on caller thread\n", ++(*count));
}
int main(void) {
xTimer timer = xTimerCreate(NULL); // Poll mode
int count = 0;
// Schedule 3 timers
xTimerSubmitAfter(timer, on_timeout, &count, 100);
xTimerSubmitAfter(timer, on_timeout, &count, 200);
xTimerSubmitAfter(timer, on_timeout, &count, 300);
// Poll loop
uint64_t start = xMonoMs();
while (xMonoMs() - start < 500) {
int n = xTimerPoll(timer);
if (n > 0) printf(" Polled %d timer(s)\n", n);
usleep(10000); // 10ms
}
xTimerDestroy(timer);
return 0;
}
Use Cases
- Event Loop Timer Backend — The event loop's built-in timers (xEventLoopTimerAfter) use the same min-heap approach internally. Use a standalone xTimer when you need timers independent of an event loop.
- Retry / Backoff Logic — Schedule retries with exponential backoff using xTimerSubmitAfter(). Cancel pending retries with xTimerCancel() when a response arrives.
- Periodic Health Checks — In poll mode, integrate xTimerPoll() into your main loop to execute periodic health checks without spawning additional threads.
Best Practices
- Choose the right mode. Use push mode when callbacks are independent and can run concurrently. Use poll mode when callbacks must run on a specific thread (e.g., the event loop thread) or when you want to avoid thread-switch latency.
- Don't use the handle after fire or cancel. Once a timer entry fires or is cancelled, the memory is freed. Accessing the handle is undefined behavior.
- Destroy before the task group. If using push mode, destroy the timer before destroying the task group to ensure all in-flight callbacks complete.
- Prefer xEventLoopTimerAfter() when using an event loop. It avoids the overhead of a separate timer thread and integrates seamlessly with I/O dispatch.
Comparison with Other Libraries
| Feature | xbase timer.h | timerfd (Linux) | POSIX timer (timer_create) | libuv uv_timer |
|---|---|---|---|---|
| Platform | macOS + Linux | Linux only | POSIX (varies) | Cross-platform |
| Fire Mode | Push (thread pool) or Poll (MPSC) | fd-based (integrates with epoll) | Signal or thread | Event loop callback |
| Resolution | Millisecond (CLOCK_MONOTONIC) | Nanosecond | Nanosecond | Millisecond |
| Data Structure | Min-heap (O(log n)) | Kernel-managed | Kernel-managed | Min-heap |
| Thread Safety | Submit/Cancel are thread-safe | fd operations are thread-safe | Varies | Not thread-safe |
| Cancellation | O(log n) via heap index | timerfd_settime(0) | timer_delete() | uv_timer_stop() |
| Overhead | 1 background thread per xTimer | 1 fd per timer | 1 kernel timer per instance | Shared with event loop |
| Dependencies | heap.h, mpsc.h, task.h | Linux kernel | POSIX RT library | libuv |
Key Differentiator: xbase's timer provides a unique dual-mode design (push/poll) that lets you choose between concurrent execution and single-threaded polling without changing your callback code. The poll mode's lock-free MPSC queue makes it ideal for integration with custom event loops.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/timer_bench.cpp
| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_Timer_SubmitCancel | — | 149 | 121 | — |
BM_Timer_SubmitBatch | 10 | 1,811 | 1,687 | 5.9 M items/s |
BM_Timer_SubmitBatch | 100 | 11,474 | 9,406 | 10.6 M items/s |
BM_Timer_SubmitBatch | 1,000 | 110,112 | 86,699 | 11.5 M items/s |
BM_Timer_FirePoll | 10 | 3,395 | 3,394 | 2.9 M items/s |
BM_Timer_FirePoll | 100 | 16,897 | 15,534 | 6.4 M items/s |
BM_Timer_FirePoll | 1,000 | 120,411 | 101,190 | 9.9 M items/s |
Key Observations:
- Submit+Cancel cycle takes ~121ns CPU time, reflecting the cost of one heap push + one heap remove. Fast enough for high-frequency timer management.
- Batch submit throughput improves with batch size (5.9M → 11.5M items/s), showing good amortization of per-operation overhead.
- Fire+Poll is slower than submit alone because it includes the MPSC queue transfer and callback invocation. At N=1000, it still achieves ~10M timer fires/s.
task.h — N:M Task Model
Introduction
task.h provides a lightweight N:M concurrent task model where N user tasks are multiplexed onto M OS threads managed by a task group (thread pool). It supports lazy thread creation, configurable queue capacity, per-task result retrieval, and a global shared task group for convenience.
Design Philosophy
- Lazy Thread Spawning — Worker threads are created on demand when tasks are submitted and no idle thread is available, up to the configured maximum. This avoids pre-allocating threads that may never be used, reducing resource consumption for bursty workloads.
- Simple Submit/Wait Model — Tasks are submitted with xTaskSubmit() and optionally awaited with xTaskWait(). This mirrors the future/promise pattern found in higher-level languages, but in pure C with minimal overhead.
- Configurable Capacity — The task group can be configured with a maximum thread count and queue capacity. When the queue is full, xTaskSubmit() returns NULL, giving the caller explicit backpressure.
- Global Shared Group — xTaskGroupGlobal() provides a lazily initialized, process-wide task group with default settings (unlimited threads, no queue cap). It is automatically destroyed at atexit(), making it convenient for fire-and-forget usage.
Architecture
graph TD
subgraph "Task Group"
QUEUE["Task Queue (FIFO)"]
W1["Worker Thread 1"]
W2["Worker Thread 2"]
WN["Worker Thread N"]
end
APP["Application"] -->|"xTaskSubmit()"| QUEUE
QUEUE -->|"dequeue"| W1
QUEUE -->|"dequeue"| W2
QUEUE -->|"dequeue"| WN
W1 -->|"done"| RESULT["xTaskWait() → result"]
W2 -->|"done"| RESULT
WN -->|"done"| RESULT
style APP fill:#4a90d9,color:#fff
style QUEUE fill:#f5a623,color:#fff
style RESULT fill:#50b86c,color:#fff
Implementation Details
Internal Structure
struct xTask_ {
xTaskFunc fn; // User function
void *arg; // User argument
pthread_mutex_t lock; // Protects done/result
pthread_cond_t cond; // Signals completion
bool done; // Completion flag
void *result; // Return value of fn
struct xTask_ *next; // Intrusive queue linkage
};
struct xTaskGroup_ {
pthread_t *workers; // Dynamic array of worker threads
size_t max_threads; // Upper bound (SIZE_MAX if unlimited)
size_t nthreads; // Currently spawned threads
pthread_mutex_t qlock; // Protects the task queue
pthread_cond_t qcond; // Wakes idle workers
struct xTask_ *qhead, *qtail; // FIFO task queue
size_t qsize, qcap; // Current size and capacity
size_t idle; // Number of idle workers
atomic_size_t pending; // Submitted - finished
atomic_size_t done_count; // Tasks completed
pthread_cond_t wcond; // Dedicated cond for xTaskGroupWait()
bool shutdown; // Shutdown flag
};
Worker Loop
Each worker thread runs worker_loop():
- Acquire the lock and increment the idle count.
- Wait on qcond while the queue is empty and not shutting down.
- Dequeue one task and decrement idle.
- Execute task->fn(task->arg).
- Signal completion via pthread_cond_broadcast(&task->cond).
- Update counters — decrement pending and signal wcond if all tasks are done.
Task Submission Flow
flowchart TD
SUBMIT["xTaskSubmit(group, fn, arg)"]
CHECK_CAP{"Queue full?"}
ENQUEUE["Enqueue task"]
CHECK_IDLE{"Idle workers > 0?"}
SIGNAL["Signal qcond"]
CHECK_MAX{"nthreads < max?"}
SPAWN["Spawn new worker"]
DONE["Return task handle"]
FAIL["Return NULL"]
SUBMIT --> CHECK_CAP
CHECK_CAP -->|Yes| FAIL
CHECK_CAP -->|No| ENQUEUE
ENQUEUE --> CHECK_IDLE
CHECK_IDLE -->|Yes| SIGNAL
CHECK_IDLE -->|No| CHECK_MAX
CHECK_MAX -->|Yes| SPAWN
CHECK_MAX -->|No| DONE
SPAWN --> SIGNAL
SIGNAL --> DONE
style SUBMIT fill:#4a90d9,color:#fff
style FAIL fill:#e74c3c,color:#fff
style DONE fill:#50b86c,color:#fff
Separate Wait Conditions
The implementation uses two separate condition variables:
- qcond — Wakes idle workers when a new task arrives.
- wcond — Wakes xTaskGroupWait() callers when all tasks complete.
Using a single condition variable caused lost wakeups: pthread_cond_signal() could wake an idle worker instead of the GroupWait caller, leaving it blocked forever.
Global Task Group
xTaskGroupGlobal() uses pthread_once for thread-safe lazy initialization. The group is registered with atexit() for automatic cleanup. It uses default configuration (unlimited threads, no queue cap).
API Reference
Types
| Type | Description |
|---|---|
xTaskFunc | void *(*)(void *arg) — Task function signature. Returns a result pointer. |
xTask | Opaque handle to a submitted task |
xTaskGroup | Opaque handle to a task group (thread pool) |
xTaskGroupConf | Configuration struct: nthreads (0 = auto), queue_cap (0 = unbounded) |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xTaskGroupCreate | xTaskGroup xTaskGroupCreate(const xTaskGroupConf *conf) | Create a task group. NULL conf = defaults. | Not thread-safe |
xTaskGroupDestroy | void xTaskGroupDestroy(xTaskGroup g) | Wait for pending tasks, then destroy. | Not thread-safe |
xTaskSubmit | xTask xTaskSubmit(xTaskGroup g, xTaskFunc fn, void *arg) | Submit a task. Returns NULL if queue is full. | Thread-safe |
xTaskWait | xErrno xTaskWait(xTask t, void **result) | Block until task completes. Frees the task handle. | Thread-safe |
xTaskGroupWait | xErrno xTaskGroupWait(xTaskGroup g) | Block until all pending tasks complete. | Thread-safe |
xTaskGroupThreads | size_t xTaskGroupThreads(xTaskGroup g) | Return number of spawned worker threads. | Thread-safe (atomic read) |
xTaskGroupPending | size_t xTaskGroupPending(xTaskGroup g) | Return number of pending tasks. | Thread-safe (atomic read) |
xTaskGroupGlobal | xTaskGroup xTaskGroupGlobal(void) | Get the global shared task group (lazy init). | Thread-safe |
Usage Examples
Basic Task Submission
#include <stdio.h>
#include <xbase/task.h>
static void *compute(void *arg) {
int *val = (int *)arg;
*val *= 2;
return val;
}
int main(void) {
xTaskGroup group = xTaskGroupCreate(NULL);
int value = 21;
xTask task = xTaskSubmit(group, compute, &value);
void *result;
xTaskWait(task, &result);
printf("Result: %d\n", *(int *)result); // 42
xTaskGroupDestroy(group);
return 0;
}
Parallel Map
#include <stdio.h>
#include <xbase/task.h>
#define N 8
static void *square(void *arg) {
int *val = (int *)arg;
*val = (*val) * (*val);
return val;
}
int main(void) {
xTaskGroupConf conf = { .nthreads = 4, .queue_cap = 0 };
xTaskGroup group = xTaskGroupCreate(&conf);
int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};
xTask tasks[N];
for (int i = 0; i < N; i++)
tasks[i] = xTaskSubmit(group, square, &data[i]);
// Wait for all
xTaskGroupWait(group);
for (int i = 0; i < N; i++)
printf("data[%d] = %d\n", i, data[i]);
// Clean up task handles
for (int i = 0; i < N; i++)
xTaskWait(tasks[i], NULL);
xTaskGroupDestroy(group);
return 0;
}
Using the Global Task Group
#include <stdio.h>
#include <xbase/task.h>
static void *work(void *arg) {
printf("Running on global pool: %s\n", (char *)arg);
return NULL;
}
int main(void) {
xTask t = xTaskSubmit(xTaskGroupGlobal(), work, "hello");
xTaskWait(t, NULL);
// No need to destroy the global group
return 0;
}
Use Cases
- CPU-Bound Parallel Processing — Distribute computation across multiple cores. Use xTaskGroupWait() to synchronize at barriers.
- Event Loop Offload — The event loop's xEventLoopSubmit() uses xTaskGroup internally to run work functions on worker threads, then delivers results back to the loop thread.
- Background I/O — Offload blocking file I/O (e.g., fsync, large reads) to a thread pool to keep the main thread responsive.
Best Practices
- Always call xTaskWait() or let xTaskGroupDestroy() clean up. Each xTaskSubmit() allocates a task struct with a mutex and condvar; xTaskWait() frees them. Leaking task handles leaks resources.
- Set queue_cap for backpressure. Without a cap, unbounded submission can exhaust memory. A bounded queue lets you detect overload via NULL returns from xTaskSubmit().
- Don't destroy the global group. xTaskGroupGlobal() is managed internally and destroyed at atexit(). Passing it to xTaskGroupDestroy() is undefined behavior.
- Use xTaskGroupWait() for barriers, not busy-polling. It uses a dedicated condition variable and blocks efficiently.
Comparison with Other Libraries
| Feature | xbase task.h | pthread | C11 threads | GCD (libdispatch) |
|---|---|---|---|---|
| Abstraction | Task (submit/wait) | Thread (create/join) | Thread (create/join) | Block (dispatch_async) |
| Thread Management | Automatic (lazy spawn) | Manual | Manual | Automatic |
| Queue | Built-in FIFO with cap | N/A | N/A | Built-in (serial/concurrent) |
| Result Retrieval | xTaskWait(t, &result) | pthread_join(t, &result) | thrd_join(t, &result) | Completion handler |
| Group Wait | xTaskGroupWait() | Manual barrier | Manual barrier | dispatch_group_wait() |
| Backpressure | queue_cap → NULL on full | N/A | N/A | N/A (unbounded) |
| Global Pool | xTaskGroupGlobal() | N/A | N/A | dispatch_get_global_queue() |
| Platform | macOS + Linux | POSIX | C11 | macOS + Linux (via libdispatch) |
| Dependencies | pthread | OS | OS | OS / libdispatch |
Key Differentiator: xbase's task model provides a simple, portable thread pool with lazy spawning and explicit backpressure — features that require significant boilerplate with raw pthreads. Unlike GCD, it gives you direct control over thread count and queue capacity.
memory.h — Reference-Counted Memory Management
Introduction
memory.h provides a vtable-driven, reference-counted memory management system for C. It enables object lifecycle management (construction, destruction, retain, release, copy, move) through a virtual table pattern, bringing RAII-like semantics to pure C. The XMALLOC(T) macro allocates an object with an embedded header that tracks the reference count and vtable pointer.
Design Philosophy
- vtable-Driven Lifecycle — Each object type defines a static xVTable with optional function pointers for ctor, dtor, retain, release, copy, and move. This decouples lifecycle logic from the allocation mechanism, similar to C++ virtual destructors or Objective-C's class methods.
- Hidden Header Pattern — A Header struct is prepended to every allocation, storing the type name (for debugging), size, reference count, and vtable pointer. The user receives a pointer past the header, so the header is invisible to normal usage.
- Atomic Reference Counting — xRetain() and xRelease() use atomic operations (__ATOMIC_SEQ_CST) to safely manage reference counts across threads. When the count reaches zero, the destructor is called and the memory is freed.
- Macro Convenience — The XMALLOC(T) and XMALLOCEX(T, sz) macros generate the correct xAlloc() call with the type name string, size, and vtable pointer, reducing boilerplate.
Architecture
graph TD
MACRO["XMALLOC(T) / XMALLOCEX(T, sz)"]
ALLOC["xAlloc(name, size, count, vtab)"]
HEADER["Header + Object"]
RETAIN["xRetain(ptr)<br/>atomic refs++"]
RELEASE["xRelease(ptr)<br/>atomic refs--"]
FREE["xFree(ptr)<br/>dtor + free"]
COPY["xCopy(ptr, other)"]
MOVE["xMove(ptr, other)"]
MACRO --> ALLOC
ALLOC --> HEADER
HEADER --> RETAIN
HEADER --> RELEASE
RELEASE -->|"refs == 0"| FREE
HEADER --> COPY
HEADER --> MOVE
style MACRO fill:#4a90d9,color:#fff
style RELEASE fill:#e74c3c,color:#fff
style FREE fill:#e74c3c,color:#fff
Implementation Details
Memory Layout
graph LR
subgraph "malloc'd block"
HDR["Header<br/>name | size | refs | vtab"]
OBJ["User Object<br/>(sizeof(T) bytes)"]
EXTRA["Extra bytes<br/>(XMALLOCEX only)"]
end
PTR["xAlloc() returns →"] --> OBJ
style HDR fill:#f5a623,color:#fff
style OBJ fill:#4a90d9,color:#fff
style EXTRA fill:#50b86c,color:#fff
The actual memory layout:
┌──────────────────────────────────────────────────────┐
│ Header (hidden) │
│ const char *name — type name string (e.g. "Foo") │
│ size_t size — sizeof(T) │
│ size_t refs — reference count (starts at 1) │
│ xVTable *vtab — pointer to static vtable │
├──────────────────────────────────────────────────────┤
│ User Object (returned pointer) │
│ T fields... │
│ [optional extra bytes from XMALLOCEX] │
└──────────────────────────────────────────────────────┘
XMALLOC / XMALLOCEX Macro Expansion
// Given:
typedef struct Foo Foo;
struct Foo { int x; char buf[]; };
XDEF_VTABLE(Foo) { .ctor = FooCtor, .dtor = FooDtor };
XDEF_CTOR(Foo) { self->x = 0; }
XDEF_DTOR(Foo) { /* cleanup */ }
// XMALLOC(Foo) expands to:
(Foo *)xAlloc("Foo", sizeof(Foo), 1, &FooVTable)
// XMALLOCEX(Foo, 128) expands to:
(Foo *)xAlloc("Foo", sizeof(Foo) + 128, 1, &FooVTable)
Reference Count Lifecycle
sequenceDiagram
participant App
participant Alloc as xAlloc
participant Header
participant VTable
App->>Alloc: XMALLOC(Foo)
Alloc->>Header: malloc(sizeof(Header) + sizeof(Foo))
Alloc->>Header: refs = 1
Alloc->>VTable: vtab->ctor(ptr)
Alloc-->>App: Foo *ptr
App->>Header: xRetain(ptr) → refs = 2
App->>Header: xRelease(ptr) → refs = 1
App->>Header: xRelease(ptr) → refs = 0
Header->>VTable: vtab->release(ptr)
Header->>VTable: vtab->dtor(ptr)
Header->>Header: free(hdr)
Thread Safety
- xRetain() and xRelease() are thread-safe — they use xAtomicAdd/xAtomicSub with sequential-consistency ordering.
- xAlloc(), xFree(), xCopy(), and xMove() are not thread-safe — call them from a single owner or with external synchronization.
API Reference
Macros
| Macro | Expansion | Description |
|---|---|---|
XDEF_VTABLE(T) | static xVTable TVTable = | Define a static vtable for type T |
XDEF_CTOR(T) | static void TCtor(T *self) | Define a constructor for type T |
XDEF_DTOR(T) | static void TDtor(T *self) | Define a destructor for type T |
XMALLOC(T) | (T *)xAlloc("T", sizeof(T), 1, &TVTable) | Allocate one T with vtable |
XMALLOCEX(T, sz) | (T *)xAlloc("T", sizeof(T) + sz, 1, &TVTable) | Allocate T + extra bytes |
Types
| Type | Description |
|---|---|
xVTable | Struct with function pointers: ctor, dtor, retain, release, copy, move |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xAlloc | void *xAlloc(const char *name, size_t size, size_t count, xVTable *vtab) | Allocate object(s) with header and call ctor. | Not thread-safe |
xFree | void xFree(void *ptr) | Call dtor and free. Ignores NULL. | Not thread-safe |
xRetain | void xRetain(void *ptr) | Increment reference count atomically. Calls vtab->retain if set. | Thread-safe |
xRelease | void xRelease(void *ptr) | Decrement reference count atomically. Calls vtab->release then xFree when refs reach 0. | Thread-safe |
xCopy | void xCopy(void *ptr, void *other) | Call vtab->copy if set. | Not thread-safe |
xMove | void xMove(void *ptr, void *other) | Call vtab->move if set. | Not thread-safe |
Usage Examples
Basic Object with Constructor/Destructor
#include <stdio.h>
#include <string.h>
#include <xbase/memory.h>
typedef struct Connection Connection;
struct Connection {
int fd;
char host[256];
};
XDEF_CTOR(Connection) {
self->fd = -1;
memset(self->host, 0, sizeof(self->host));
printf("Connection created\n");
}
XDEF_DTOR(Connection) {
if (self->fd >= 0) {
// close(self->fd);
printf("Connection closed (fd=%d)\n", self->fd);
}
}
XDEF_VTABLE(Connection) {
.ctor = ConnectionCtor,
.dtor = ConnectionDtor,
};
int main(void) {
Connection *conn = XMALLOC(Connection);
conn->fd = 42;
strcpy(conn->host, "example.com");
xRetain(conn); // refs = 2
xRelease(conn); // refs = 1
xRelease(conn); // refs = 0 → dtor called → freed
return 0;
}
Flexible Array Member with XMALLOCEX
#include <stdio.h>
#include <string.h>
#include <xbase/memory.h>
typedef struct Buffer Buffer;
struct Buffer {
size_t len;
char data[]; // flexible array member
};
XDEF_CTOR(Buffer) { self->len = 0; }
XDEF_DTOR(Buffer) { /* nothing to clean up */ }
XDEF_VTABLE(Buffer) { .ctor = BufferCtor, .dtor = BufferDtor };
int main(void) {
// Allocate Buffer + 1024 extra bytes for data[]
Buffer *buf = XMALLOCEX(Buffer, 1024);
memcpy(buf->data, "Hello, xKit!", 12);
buf->len = 12;
printf("Buffer: %.*s\n", (int)buf->len, buf->data);
xRelease(buf); // refs 1 → 0 → freed
return 0;
}
Use Cases
- Shared Ownership — Multiple components hold references to the same object (e.g., a connection shared between a reader and a writer). xRetain/xRelease ensures the object is freed only when the last reference is dropped.
- Plugin/Extension Objects — Define vtables for different object types that share a common interface. The vtable pattern enables polymorphic behavior in C.
- Debug-Friendly Allocation — The name field in the header enables allocation tracking and leak detection by type name.
Best Practices
- Always pair xRetain with xRelease. Every retain must have a corresponding release, or you'll leak memory.
- Use XMALLOC instead of raw xAlloc. The macro handles the type name, size, and vtable automatically.
- Set unused vtable fields to NULL. The implementation checks for NULL before calling each vtable function.
- Don't mix with free(). Objects allocated with xAlloc have a hidden header; calling free() directly on the user pointer corrupts the heap.
- Use XMALLOCEX for flexible array members. It adds extra bytes after the struct for variable-length data.
Comparison with Other Libraries
| Feature | xbase memory.h | C++ RAII | Objective-C ARC | GLib GObject |
|---|---|---|---|---|
| Mechanism | vtable + atomic refcount | Destructor + smart pointers | Compiler-inserted retain/release | GType + refcount |
| Automation | Manual retain/release | Automatic (scope-based) | Automatic (compiler) | Manual ref/unref |
| Thread Safety | Atomic refcount | shared_ptr is atomic | Atomic | Atomic |
| Polymorphism | vtable function pointers | Virtual functions | Method dispatch | Signal/slot + vtable |
| Overhead | 1 header per object (~32 bytes) | 0 (stack) or control block | 1 isa pointer + refcount | Large (GTypeInstance) |
| Flexible Arrays | XMALLOCEX(T, sz) | std::vector | NSMutableData | GArray |
| Debug Info | Type name in header | RTTI | Class name | GType name |
| Language | C99 | C++ | Objective-C | C (with macros) |
Key Differentiator: xbase's memory system brings reference-counted lifecycle management to C with minimal overhead — just a 32-byte header per object. The vtable pattern provides extensibility (custom ctor/dtor/copy/move) without requiring a complex type system like GObject.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/memory_bench.cpp
| Benchmark | Size (bytes) | Time (ns) | CPU (ns) | Iterations |
|---|---|---|---|---|
BM_Memory_XAlloc | 16 | 23.3 | 23.3 | 29,809,940 |
BM_Memory_XAlloc | 64 | 21.1 | 21.1 | 32,551,024 |
BM_Memory_XAlloc | 256 | 22.4 | 22.4 | 31,207,508 |
BM_Memory_XAlloc | 1,024 | 20.1 | 20.1 | 34,024,352 |
BM_Memory_XAlloc | 4,096 | 24.2 | 24.2 | 29,002,681 |
BM_Memory_Malloc | 16 | 17.5 | 17.5 | 39,883,995 |
BM_Memory_Malloc | 64 | 18.7 | 18.7 | 37,576,831 |
BM_Memory_Malloc | 256 | 19.0 | 19.0 | 34,505,536 |
BM_Memory_Malloc | 1,024 | 23.0 | 23.0 | 30,557,144 |
BM_Memory_Malloc | 4,096 | 17.7 | 17.7 | 39,849,483 |
BM_Memory_RetainRelease | — | 3.90 | 3.90 | 183,068,277 |
Key Observations:
- xAlloc vs malloc overhead is only ~3–5ns across all sizes. The extra cost covers header initialization, vtable setup, and constructor invocation — negligible for most workloads.
- Retain/Release cycle takes ~3.9ns, dominated by the atomic increment/decrement. This is fast enough for hot-path reference counting.
- Allocation time is nearly constant across sizes (16B–4KB), confirming that the overhead is in the header management, not the underlying malloc.
error.h — Unified Error Codes
Introduction
error.h defines a unified set of error codes (xErrno) used throughout xKit. Every function that can fail returns an xErrno value, providing a consistent error handling pattern across all modules. The companion function xstrerror() converts error codes to human-readable strings for logging and debugging.
Design Philosophy
- Single Error Enum — All xKit modules share one error-code enum, avoiding the confusion of module-specific error types. This makes error handling uniform: check for xErrno_Ok everywhere.
- Descriptive Codes — Each error code maps to a specific failure category (invalid argument, out of memory, wrong state, etc.), giving callers enough information to decide how to handle the error without inspecting errno or platform-specific codes.
- Human-Readable Messages — xstrerror() returns a static string for each code, suitable for direct inclusion in log messages. It never returns NULL.
Architecture
graph LR
MODULES["All xKit Modules"] -->|"return"| ERRNO["xErrno"]
ERRNO -->|"xstrerror()"| MSG["Human-readable string"]
MSG -->|"xLog()"| LOG["Log output"]
style ERRNO fill:#4a90d9,color:#fff
style MSG fill:#50b86c,color:#fff
Implementation Details
Error Code Values
The error codes are defined as an int-based enum (via XDEF_ENUM), starting from 0:
| Code | Value | Meaning |
|---|---|---|
xErrno_Ok | 0 | Success |
xErrno_Unknown | 1 | Unspecified error (legacy / catch-all) |
xErrno_InvalidArg | 2 | NULL or invalid argument |
xErrno_NoMemory | 3 | Memory allocation failed |
xErrno_InvalidState | 4 | Object is in the wrong state for this call |
xErrno_SysError | 5 | Underlying syscall / OS error |
xErrno_NotFound | 6 | Requested item does not exist |
xErrno_AlreadyExists | 7 | Item already registered / bound |
xErrno_Cancelled | 8 | Operation was cancelled |
Usage Pattern
The idiomatic xKit error handling pattern:
xErrno err = xSomeFunction(args);
if (err != xErrno_Ok) {
xLog(false, "operation failed: %s", xstrerror(err));
return err; // propagate
}
Internal Usage
xErrno is used by:
- event.h — xEventMod(), xEventDel(), xEventWake(), xEventLoopTimerCancel(), xEventLoopSubmit(), xEventLoopSignalWatch()
- timer.h — xTimerCancel()
- task.h — xTaskWait(), xTaskGroupWait()
- socket.h — xSocketSetMask(), xSocketSetTimeout()
- heap.h — xHeapPush(), xHeapUpdate()
API Reference
Types
| Type | Description |
|---|---|
xErrno | int-based enum of error codes |
Enum Values
| Value | Description |
|---|---|
xErrno_Ok | Success |
xErrno_Unknown | Unspecified error (legacy / catch-all) |
xErrno_InvalidArg | NULL or invalid argument |
xErrno_NoMemory | Memory allocation failed |
xErrno_InvalidState | Object is in the wrong state for this call |
xErrno_SysError | Underlying syscall / OS error |
xErrno_NotFound | Requested item does not exist |
xErrno_AlreadyExists | Item already registered / bound |
xErrno_Cancelled | Operation was cancelled |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xstrerror | const char *xstrerror(xErrno err) | Return a human-readable error message. Never returns NULL. | Thread-safe (returns static strings) |
Usage Examples
Error Handling Pattern
#include <stdio.h>
#include <xbase/error.h>
#include <xbase/event.h>
int main(void) {
xEventLoop loop = xEventLoopCreate();
if (!loop) {
fprintf(stderr, "Failed to create event loop\n");
return 1;
}
xErrno err = xEventMod(loop, NULL, xEvent_Read);
if (err != xErrno_Ok) {
fprintf(stderr, "xEventMod failed: %s\n", xstrerror(err));
// Output: "xEventMod failed: NULL or invalid argument"
}
xEventLoopDestroy(loop);
return 0;
}
Propagating Errors
#include <xbase/error.h>
#include <xbase/socket.h>
xErrno setup_socket(xEventLoop loop, xSocket *out) {
    xSocket sock = xSocketCreate(loop, AF_INET, SOCK_STREAM, 0,
                                 xEvent_Read, my_callback, NULL);
    if (!sock) return xErrno_SysError;

    xErrno err = xSocketSetTimeout(sock, 5000, 0);
    if (err != xErrno_Ok) {
        xSocketDestroy(loop, sock);
        return err;
    }
    *out = sock;
    return xErrno_Ok;
}
Use Cases
- Uniform Error Propagation — Functions return xErrno and callers check against xErrno_Ok. This eliminates the need for module-specific error types.
- Logging and Diagnostics — xstrerror() provides instant human-readable messages for log output without maintaining separate message tables.
- Error Classification — Callers can switch on specific error codes to implement different recovery strategies (e.g., retry on xErrno_SysError, abort on xErrno_NoMemory).
Best Practices
- Always check return values. Functions that return xErrno should be checked. Functions that return handles (pointers) should be checked for NULL.
- Use xstrerror() in log messages. It's more informative than printing the raw integer.
- Don't compare against raw integers. Always use the enum constants (xErrno_Ok, xErrno_InvalidArg, etc.) for readability and forward compatibility.
- Prefer specific codes over xErrno_Unknown. When adding new error paths, choose the most specific applicable code.
Comparison with Other Libraries
| Feature | xbase error.h | POSIX errno | Windows HRESULT | GLib GError |
|---|---|---|---|---|
| Type | int enum | int (thread-local) | LONG | Struct (domain + code + message) |
| Scope | Library-wide | System-wide | System-wide | Per-domain |
| String Conversion | xstrerror() | strerror() | FormatMessage() | g_error->message |
| Thread Safety | Return value (inherently safe) | Thread-local global | Return value | Heap-allocated |
| Extensibility | Add to enum | Platform-defined | Facility codes | Custom domains |
| Overhead | Zero (int return) | Zero (thread-local) | Zero (int return) | Heap allocation per error |
Key Differentiator: xbase's error system is intentionally simple — a single enum with descriptive codes and a string conversion function. It avoids the complexity of domain-based systems (GError) and the thread-local pitfalls of POSIX errno, while providing enough granularity for library-level error handling.
heap.h — Min-Heap
Introduction
heap.h provides a generic binary min-heap that stores opaque pointers and orders them via a user-supplied comparison function. Each element carries its heap index (maintained via a callback), enabling O(log n) removal and priority updates by index. It is the core data structure behind xbase's timer subsystem.
Design Philosophy
- Generic via Function Pointers — The heap stores void * elements and uses an xHeapCmpFunc for ordering. This makes it reusable for any element type without code generation or macros.
- Index Tracking — An xHeapSetIdxFunc callback notifies elements of their current position in the heap array. This enables O(1) lookup for xHeapRemove() and xHeapUpdate(), which would otherwise require an O(n) search.
- Dynamic Array Backend — The heap uses a dynamically growing array (2x expansion) starting from a default capacity of 16. This provides cache-friendly access patterns and amortized O(1) growth.
- No Element Ownership — The heap does not own the elements it stores. xHeapDestroy() frees the heap structure but NOT the elements. This gives the caller full control over element lifecycle.
Architecture
graph TD
PUSH["xHeapPush(elem)"] --> APPEND["Append to data[size]"]
APPEND --> SIFTUP["Sift Up"]
SIFTUP --> NOTIFY["setidx(elem, new_idx)"]
POP["xHeapPop()"] --> SWAP["Swap data[0] with data[size-1]"]
SWAP --> SIFTDOWN["Sift Down from 0"]
SIFTDOWN --> NOTIFY
REMOVE["xHeapRemove(idx)"] --> SWAP2["Swap data[idx] with data[size-1]"]
SWAP2 --> BOTH["Sift Up + Sift Down"]
BOTH --> NOTIFY
style PUSH fill:#4a90d9,color:#fff
style POP fill:#f5a623,color:#fff
style REMOVE fill:#e74c3c,color:#fff
Implementation Details
Data Structure
struct xHeap_ {
    void **data;             // Dynamic array of element pointers
    size_t size;             // Current number of elements
    size_t cap;              // Allocated capacity
    xHeapCmpFunc cmp;        // Comparison function
    xHeapSetIdxFunc setidx;  // Index notification callback
};
Array Layout
Index: 0 1 2 3 4 5 6
[min] [ ] [ ] [ ] [ ] [ ] [ ]
│ │ │
│ ├────┤
│ children of 0
├─────┤
parent of 1,2
Parent of i: (i - 1) / 2
Left child of i: 2 * i + 1
Right child of i: 2 * i + 2
Operations and Complexity
| Operation | Function | Time Complexity | Description |
|---|---|---|---|
| Insert | xHeapPush | O(log n) | Append to end, sift up |
| Peek min | xHeapPeek | O(1) | Return data[0] |
| Extract min | xHeapPop | O(log n) | Swap with last, sift down |
| Remove by index | xHeapRemove | O(log n) | Swap with last, sift up + down |
| Update priority | xHeapUpdate | O(log n) | Sift up + down at index |
| Size | xHeapSize | O(1) | Return size field |
| Grow | ensure_cap | Amortized O(1) | 2x realloc |
Sift Operations
- Sift Up — Compare element with parent; swap if smaller. Repeat until heap property is restored or root is reached.
- Sift Down — Compare element with children; swap with the smallest child if it's smaller. Repeat until heap property is restored or a leaf is reached.
Remove by Index
xHeapRemove(h, idx) replaces the element at idx with the last element, then applies both sift-up and sift-down. This handles both cases: the replacement may be smaller (needs to go up) or larger (needs to go down) than its new neighbors.
API Reference
Types
| Type | Description |
|---|---|
xHeapCmpFunc | int (*)(const void *a, const void *b) — Returns negative if a < b, 0 if equal, positive if a > b |
xHeapSetIdxFunc | void (*)(void *elem, size_t idx) — Called when an element's index changes |
xHeap | Opaque handle to a min-heap |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xHeapCreate | xHeap xHeapCreate(xHeapCmpFunc cmp, xHeapSetIdxFunc setidx, size_t cap) | Create a heap. cap = 0 uses default (16). | Not thread-safe |
xHeapDestroy | void xHeapDestroy(xHeap h) | Free the heap. Does NOT free elements. | Not thread-safe |
xHeapPush | xErrno xHeapPush(xHeap h, void *elem) | Insert an element. O(log n). | Not thread-safe |
xHeapPeek | void *xHeapPeek(xHeap h) | Return the minimum element without removing. O(1). | Not thread-safe |
xHeapPop | void *xHeapPop(xHeap h) | Remove and return the minimum element. O(log n). | Not thread-safe |
xHeapRemove | void *xHeapRemove(xHeap h, size_t idx) | Remove element at index. O(log n). | Not thread-safe |
xHeapUpdate | xErrno xHeapUpdate(xHeap h, size_t idx) | Re-heapify after priority change. O(log n). | Not thread-safe |
xHeapSize | size_t xHeapSize(xHeap h) | Return element count. O(1). | Not thread-safe |
Usage Examples
Timer-Style Priority Queue
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <xbase/heap.h>

typedef struct {
    uint64_t deadline;
    size_t heap_idx;
    char name[32];
} TimerEntry;

static int cmp_entry(const void *a, const void *b) {
    const TimerEntry *ea = (const TimerEntry *)a;
    const TimerEntry *eb = (const TimerEntry *)b;
    if (ea->deadline < eb->deadline) return -1;
    if (ea->deadline > eb->deadline) return 1;
    return 0;
}

static void set_idx(void *elem, size_t idx) {
    ((TimerEntry *)elem)->heap_idx = idx;
}

int main(void) {
    xHeap heap = xHeapCreate(cmp_entry, set_idx, 0);
    TimerEntry entries[] = {
        { .deadline = 300, .name = "C" },
        { .deadline = 100, .name = "A" },
        { .deadline = 200, .name = "B" },
    };
    for (int i = 0; i < 3; i++)
        xHeapPush(heap, &entries[i]);

    // Pop in order: A (100), B (200), C (300)
    while (xHeapSize(heap) > 0) {
        TimerEntry *e = (TimerEntry *)xHeapPop(heap);
        printf("%s (deadline=%llu)\n", e->name, (unsigned long long)e->deadline);
    }
    xHeapDestroy(heap);
    return 0;
}
Use Cases
- Timer Subsystem — timer.h uses the min-heap to order timer entries by deadline. The timer thread peeks at the minimum to determine how long to sleep, then pops expired entries.
- Event Loop Timers — The event loop's built-in timer heap (event.h) uses the same pattern to integrate timer dispatch with I/O polling.
- Custom Priority Queues — Any scenario requiring efficient insert/extract-min with O(log n) removal by index.
Best Practices
- Always implement xHeapSetIdxFunc. Without index tracking, xHeapRemove() and xHeapUpdate() cannot locate elements efficiently.
- Store the index in your element struct. The setidx callback should write the index into a field of your element (e.g., elem->heap_idx = idx).
- Don't free elements while they're in the heap. Remove them first with xHeapRemove() or xHeapPop().
- Use xHeapUpdate() after changing an element's priority. The heap doesn't detect priority changes automatically.
Comparison with Other Libraries
| Feature | xbase heap.h | C++ std::priority_queue | Linux kernel prio_heap | Go container/heap |
|---|---|---|---|---|
| Element Type | void * (generic) | Template | Fixed struct | interface{} |
| Index Tracking | Built-in (setidx callback) | Not available | Not available | Manual (Fix method) |
| Remove by Index | O(log n) | Not supported | Not supported | O(log n) via Remove |
| Update Priority | O(log n) via xHeapUpdate | Not supported | Not supported | O(log n) via Fix |
| Ownership | No (caller owns elements) | Yes (copies/moves) | No | No |
| Thread Safety | Not thread-safe | Not thread-safe | Not thread-safe | Not thread-safe |
Key Differentiator: xbase's heap provides built-in index tracking via the setidx callback, enabling O(log n) removal and priority updates — features that std::priority_queue lacks entirely. This makes it ideal for timer implementations where cancellation is a common operation.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/heap_bench.cpp
| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_Heap_Push | 8 | 983 | 987 | 8.1 M items/s |
BM_Heap_Push | 64 | 1,694 | 1,699 | 37.7 M items/s |
BM_Heap_Push | 512 | 8,722 | 8,725 | 58.7 M items/s |
BM_Heap_Push | 4,096 | 56,854 | 56,853 | 72.0 M items/s |
BM_Heap_Pop | 8 | 1,020 | 1,024 | 7.8 M items/s |
BM_Heap_Pop | 64 | 2,807 | 2,809 | 22.8 M items/s |
BM_Heap_Pop | 512 | 26,334 | 26,337 | 19.4 M items/s |
BM_Heap_Pop | 4,096 | 297,382 | 297,325 | 13.8 M items/s |
BM_Heap_Remove | 8 | 1,015 | 1,020 | 7.8 M items/s |
BM_Heap_Remove | 64 | 1,808 | 1,811 | 35.3 M items/s |
BM_Heap_Remove | 512 | 8,914 | 8,903 | 57.5 M items/s |
BM_Heap_Remove | 4,096 | 68,017 | 68,016 | 60.2 M items/s |
Key Observations:
- Push throughput scales well with heap size — amortized cost per element decreases as batch size grows, reaching 72M items/s at N=4096.
- Pop is more expensive than push at large N due to the sift-down operation traversing more levels. At N=4096, pop throughput drops to ~14M items/s.
- Remove (random index removal) performs comparably to push, thanks to the O(log n) index-tracked removal. This validates the setidx callback design for timer cancellation workloads.
mpsc.h — Lock-Free MPSC Queue
Introduction
mpsc.h provides a lock-free, intrusive multi-producer single-consumer (MPSC) queue. Multiple threads can push nodes concurrently without locks, while a single consumer thread pops nodes. It is the backbone of xbase's poll-mode timer dispatch and the event loop's offload completion queue.
Design Philosophy
- Intrusive Design — Nodes embed an xMpsc struct directly, avoiding heap allocation per enqueue. This is critical for hot paths like timer expiry and offload completion, where allocation overhead would be unacceptable.
- Lock-Free Push — xMpscPush() uses a single atomic exchange (xAtomicXchg) on the tail pointer, making it wait-free for producers. No mutex, no CAS retry loop.
- Single-Consumer Pop — xMpscPop() is designed for exactly one consumer thread. It uses atomic loads and a single CAS for the edge case of popping the last element. This simplification avoids the ABA problem that plagues multi-consumer designs.
- Minimal Memory Ordering — The implementation uses xAtomicAcqRel for the exchange and xAtomicAcquire/xAtomicRelease for loads/stores, providing the minimum ordering needed for correctness without the overhead of sequential consistency.
Architecture
graph LR
P1["Producer 1"] -->|"xMpscPush"| TAIL["tail"]
P2["Producer 2"] -->|"xMpscPush"| TAIL
P3["Producer 3"] -->|"xMpscPush"| TAIL
HEAD["head"] -->|"xMpscPop"| C["Consumer"]
subgraph "Queue"
HEAD --> N1["Node 1"] --> N2["Node 2"] --> N3["Node 3"]
N3 --- TAIL
end
style P1 fill:#4a90d9,color:#fff
style P2 fill:#4a90d9,color:#fff
style P3 fill:#4a90d9,color:#fff
style C fill:#50b86c,color:#fff
Implementation Details
Data Structure
XDEF_STRUCT(xMpsc) {
    xMpsc *volatile next; // Pointer to next node
};
The queue is represented by two external pointers:
- head — Points to the oldest node (consumer reads from here)
- tail — Points to the newest node (producers append here)
Push Algorithm
void xMpscPush(xMpsc **head, xMpsc **tail, xMpsc *node) {
    node->next = NULL;
    xMpsc *prev_tail = xAtomicXchg(tail, node, xAtomicAcqRel);
    if (prev_tail)
        prev_tail->next = node;                    // Link to previous tail
    else
        xAtomicStore(head, node, xAtomicRelease);  // First node
}
The key insight: xAtomicXchg atomically replaces the tail and returns the old value. If the old tail was non-NULL, we link it to the new node. If it was NULL (empty queue), we also update the head.
Pop Algorithm
The pop operation handles three cases:
- Empty queue — head is NULL; return NULL.
- Multiple nodes — Advance head to head->next, return the old head.
- Single node — CAS tail to NULL. If the CAS succeeds, also CAS head to NULL. If the CAS fails (a concurrent push is in progress), spin until head->next becomes non-NULL.
flowchart TD
START["xMpscPop()"]
CHECK_HEAD{"head == NULL?"}
EMPTY["Return NULL"]
CHECK_NEXT{"head->next == NULL?"}
MULTI["Advance head<br/>Return old head"]
CAS_TAIL{"CAS tail → NULL?"}
CAS_HEAD["CAS head → NULL<br/>Return old head"]
SPIN["Spin until head->next != NULL"]
ADVANCE["Advance head<br/>Return old head"]
START --> CHECK_HEAD
CHECK_HEAD -->|Yes| EMPTY
CHECK_HEAD -->|No| CHECK_NEXT
CHECK_NEXT -->|No| MULTI
CHECK_NEXT -->|Yes| CAS_TAIL
CAS_TAIL -->|Success| CAS_HEAD
CAS_TAIL -->|Fail: concurrent push| SPIN
SPIN --> ADVANCE
style EMPTY fill:#e74c3c,color:#fff
style MULTI fill:#50b86c,color:#fff
style CAS_HEAD fill:#50b86c,color:#fff
style ADVANCE fill:#50b86c,color:#fff
Memory Ordering Analysis
| Operation | Ordering | Reason |
|---|---|---|
xAtomicXchg(tail, node) | AcqRel | Acquire: see previous tail's next field. Release: make node visible to consumer. |
xAtomicStore(head, node) | Release | Make the new head visible to the consumer. |
xAtomicLoad(head) | Acquire | See the node written by the producer. |
xAtomicLoad(&head->next) | Acquire | See the next pointer written by the producer. |
xAtomicCasStrong(tail, ...) | Release | Publish the NULL tail to concurrent pushers. |
Thread Safety
- xMpscPush() — Thread-safe (multiple producers).
- xMpscPop() — Single-consumer only. Must not be called concurrently.
- xMpscEmpty() — Thread-safe (atomic load).
API Reference
Types
| Type | Description |
|---|---|
xMpsc | Intrusive queue node. Embed in your struct and use xContainerOf() to recover the enclosing struct. |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xMpscPush | void xMpscPush(xMpsc **head, xMpsc **tail, xMpsc *node) | Push a node. Wait-free for producers. | Thread-safe (multi-producer) |
xMpscPop | xMpsc *xMpscPop(xMpsc **head, xMpsc **tail) | Pop the oldest node. Returns NULL if empty. | Single-consumer only |
xMpscEmpty | bool xMpscEmpty(xMpsc **head) | Check if the queue is empty. | Thread-safe |
Usage Examples
Basic Producer-Consumer
#include <stdio.h>
#include <pthread.h>
#include <xbase/mpsc.h>
#include <xbase/base.h>
typedef struct {
    xMpsc node; // Must embed xMpsc
    int value;
} Message;

static xMpsc *g_head = NULL;
static xMpsc *g_tail = NULL;

static void *producer(void *arg) {
    Message *msg = (Message *)arg;
    xMpscPush(&g_head, &g_tail, &msg->node);
    return NULL;
}

int main(void) {
    Message msgs[] = {
        { .value = 1 },
        { .value = 2 },
        { .value = 3 },
    };

    // Push from multiple threads
    pthread_t threads[3];
    for (int i = 0; i < 3; i++)
        pthread_create(&threads[i], NULL, producer, &msgs[i]);
    for (int i = 0; i < 3; i++)
        pthread_join(threads[i], NULL);

    // Pop from single consumer
    xMpsc *node;
    while ((node = xMpscPop(&g_head, &g_tail)) != NULL) {
        Message *msg = xContainerOf(node, Message, node);
        printf("Received: %d\n", msg->value);
    }
    return 0;
}
Use Cases
- Timer Poll Mode — timer.h uses the MPSC queue in poll mode to pass expired timer entries from the timer thread to the polling thread without locks.
- Event Loop Offload — The event loop's offload mechanism (event.h) uses an MPSC queue to deliver completed work items from worker threads to the event loop thread.
- xlog Async Logger — logger.h uses the MPSC queue to pass log messages from application threads to the logger's flush thread.
Best Practices
- Embed xMpsc in your struct. Don't allocate xMpsc nodes separately. Use xContainerOf() to recover the enclosing struct after popping.
- Initialize head and tail to NULL. An empty queue has both pointers set to NULL.
- Only one thread may call xMpscPop(). The single-consumer constraint is fundamental to the algorithm's correctness. Violating it causes data races.
- Don't access a node after pushing it. Once pushed, the node is owned by the queue until popped.
Comparison with Other Libraries
| Feature | xbase mpsc.h | Dmitry Vyukov MPSC | concurrentqueue (C++) | Linux llist |
|---|---|---|---|---|
| Design | Intrusive, lock-free | Intrusive, lock-free | Non-intrusive, lock-free | Intrusive, lock-free |
| Push | Wait-free (1 atomic xchg) | Wait-free (1 atomic xchg) | Lock-free (CAS loop) | Wait-free (1 atomic xchg) |
| Pop | Lock-free (single consumer) | Lock-free (single consumer) | Lock-free (multi-consumer) | Batch pop (splice) |
| Memory Ordering | AcqRel / Acquire / Release | SeqCst | Relaxed + fences | Varies |
| Allocation | None (intrusive) | None (intrusive) | Per-element (internal) | None (intrusive) |
| Multi-Consumer | No | No | Yes | No (batch only) |
| Language | C99 | C/C++ | C++11 | C (kernel) |
Key Differentiator: xbase's MPSC queue is minimal and intrusive — zero allocation overhead, wait-free push, and carefully chosen memory orderings. It's designed specifically for the single-consumer patterns found in event loops and timer systems.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/mpsc_bench.cpp
| Benchmark | Time (ns) | CPU (ns) | Iterations | Throughput |
|---|---|---|---|---|
BM_Mpsc_SingleProducer | 3,712 | 3,712 | 187,897 | 275.9 M items/s |
BM_Mpsc_MultiProducer/2 | 609,432 | 87,797 | 8,075 | 227.8 M items/s |
BM_Mpsc_MultiProducer/4 | 1,327,965 | 148,356 | 4,768 | 269.6 M items/s |
BM_Mpsc_MultiProducer/8 | 4,466,805 | 292,260 | 1,000 | 273.7 M items/s |
Key Observations:
- Single-producer push/pop achieves ~276M items/s, demonstrating the minimal overhead of the lock-free algorithm.
- Multi-producer scaling maintains ~270M items/s aggregate throughput even with 8 concurrent producers, showing excellent scalability. The wall-clock time increases due to thread synchronization overhead, but per-CPU throughput remains stable.
- The gap between wall-clock time and CPU time in multi-producer benchmarks reflects the cost of thread creation and barrier synchronization, not the queue operations themselves.
atomic.h — Atomic Operations
Introduction
atomic.h provides a set of macro wrappers over GCC/Clang __atomic builtins, offering portable atomic operations with explicit memory ordering. These macros are used throughout xbase for reference counting (memory.h), lock-free queues (mpsc.h), and event loop internals (event.h).
Design Philosophy
- Thin Macro Wrappers — Each macro maps directly to a compiler builtin with zero overhead. No abstraction layers, no runtime dispatch.
- Explicit Memory Ordering — Every atomic operation requires an explicit memory-order parameter (xAtomicAcquire, xAtomicRelease, etc.), forcing the programmer to think about ordering requirements rather than defaulting to the expensive SeqCst.
- GCC/Clang Builtins — The __atomic builtins are supported by GCC ≥ 4.7 and all versions of Clang. They generate optimal instructions for each target architecture (x86: lock prefix; ARM: ldrex/strex or LSE atomics).
Architecture
graph TD
subgraph "xbase Atomic Users"
MEMORY["memory.h<br/>xRetain / xRelease<br/>(SeqCst refcount)"]
MPSC["mpsc.h<br/>xMpscPush / xMpscPop<br/>(AcqRel / Acquire / Release)"]
EVENT["event_private.h<br/>inflight counter<br/>(Relaxed)"]
TASK["task.c<br/>pending / done_count<br/>(stdatomic)"]
end
subgraph "atomic.h Macros"
LOAD["xAtomicLoad"]
STORE["xAtomicStore"]
XCHG["xAtomicXchg"]
CAS["xAtomicCas*"]
ADD["xAtomicAdd/Sub"]
FETCH["xAtomicFetch*"]
end
MEMORY --> ADD
MPSC --> XCHG
MPSC --> LOAD
MPSC --> STORE
MPSC --> CAS
EVENT --> FETCH
style MEMORY fill:#4a90d9,color:#fff
style MPSC fill:#f5a623,color:#fff
style EVENT fill:#50b86c,color:#fff
Implementation Details
Memory Order Constants
| Macro | Value | Meaning |
|---|---|---|
xAtomicRelaxed | __ATOMIC_RELAXED | No ordering constraints. Only guarantees atomicity. |
xAtomicConsume | __ATOMIC_CONSUME | Data-dependent ordering (rarely used in practice). |
xAtomicAcquire | __ATOMIC_ACQUIRE | Prevents reads/writes from being reordered before this operation. |
xAtomicRelease | __ATOMIC_RELEASE | Prevents reads/writes from being reordered after this operation. |
xAtomicAcqRel | __ATOMIC_ACQ_REL | Combines Acquire and Release. |
xAtomicSeqCst | __ATOMIC_SEQ_CST | Full sequential consistency. Most expensive. |
Operation Macros
Load / Store
| Macro | Expansion | Description |
|---|---|---|
xAtomicLoad(p, o) | __atomic_load_n(p, o) | Atomically read *p |
xAtomicStore(p, v, o) | __atomic_store_n(p, v, o) | Atomically write v to *p |
Exchange / CAS
| Macro | Expansion | Description |
|---|---|---|
xAtomicXchg(p, v, o) | __atomic_exchange_n(p, v, o) | Atomically swap *p with v, return old value |
xAtomicCasWeak(p, e, d, o) | __atomic_compare_exchange_n(p, e, d, true, o, Relaxed) | Weak CAS (may spuriously fail) |
xAtomicCasStrong(p, e, d, o) | __atomic_compare_exchange_n(p, e, d, false, o, Relaxed) | Strong CAS (no spurious failure) |
Note: Both CAS macros use xAtomicRelaxed as the failure ordering. The success ordering is specified by the o parameter.
Arithmetic
| Macro | Expansion | Returns |
|---|---|---|
xAtomicAdd(p, v, o) | __atomic_add_fetch(p, v, o) | New value (*p + v) |
xAtomicSub(p, v, o) | __atomic_sub_fetch(p, v, o) | New value (*p - v) |
xAtomicFetchAdd(p, v, o) | __atomic_fetch_add(p, v, o) | Old value (before add) |
xAtomicFetchSub(p, v, o) | __atomic_fetch_sub(p, v, o) | Old value (before sub) |
Bitwise
| Macro | Expansion | Returns |
|---|---|---|
xAtomicAnd(p, v, o) | __atomic_and_fetch(p, v, o) | New value |
xAtomicOr(p, v, o) | __atomic_or_fetch(p, v, o) | New value |
xAtomicXor(p, v, o) | __atomic_xor_fetch(p, v, o) | New value |
xAtomicNand(p, v, o) | __atomic_nand_fetch(p, v, o) | New value |
xAtomicFetchAnd(p, v, o) | __atomic_fetch_and(p, v, o) | Old value |
xAtomicFetchOr(p, v, o) | __atomic_fetch_or(p, v, o) | Old value |
xAtomicFetchXor(p, v, o) | __atomic_fetch_xor(p, v, o) | Old value |
API Reference
See the Operation Macros section above for the complete list. All macros are defined in <xbase/atomic.h> and require no function calls — they expand directly to compiler builtins.
Usage Examples
Atomic Counter
#include <stdio.h>
#include <pthread.h>
#include <xbase/atomic.h>
static int g_counter = 0;

static void *increment(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        xAtomicAdd(&g_counter, 1, xAtomicRelaxed);
    }
    return NULL;
}

int main(void) {
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, increment, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    printf("Counter: %d\n", xAtomicLoad(&g_counter, xAtomicRelaxed));
    // Output: Counter: 400000
    return 0;
}
Spinlock (Educational)
#include <xbase/atomic.h>
typedef struct { int locked; } Spinlock;

static inline void spin_lock(Spinlock *s) {
    while (xAtomicXchg(&s->locked, 1, xAtomicAcquire) != 0) {
        // Spin
    }
}

static inline void spin_unlock(Spinlock *s) {
    xAtomicStore(&s->locked, 0, xAtomicRelease);
}
Use Cases
- Reference Counting — memory.h uses xAtomicAdd/xAtomicSub with SeqCst ordering for thread-safe reference count management.
- Lock-Free Data Structures — mpsc.h uses xAtomicXchg for wait-free push and xAtomicCasStrong for the single-element pop edge case.
- Event Loop Internals — The event loop uses xAtomicFetchAdd/xAtomicFetchSub with Relaxed ordering to track in-flight offload workers.
Best Practices
- Use the weakest sufficient ordering. Relaxed for simple counters, Acquire/Release for producer-consumer patterns, SeqCst only when you need a total order visible to all threads.
- Prefer xAtomicCasStrong over xAtomicCasWeak unless you're in a retry loop where spurious failures are acceptable (e.g., lock-free stack push).
- Note the CAS failure ordering. Both CAS macros hardcode xAtomicRelaxed as the failure ordering. If you need stronger failure ordering, use the raw xAtomicCas macro directly.
- Don't mix with C11 <stdatomic.h>. While both use the same underlying compiler builtins, mixing the two styles in the same translation unit can be confusing. xbase uses <stdatomic.h> in task.c for atomic_size_t but atomic.h macros everywhere else.
Comparison with Other Libraries
| Feature | xbase atomic.h | C11 <stdatomic.h> | C++ <atomic> | Linux kernel atomics |
|---|---|---|---|---|
| Style | Macros over __atomic builtins | Language-level types | Template class | Inline functions + asm |
| Memory Order | Explicit parameter | Explicit parameter | Explicit parameter | Implicit (varies) |
| Types | Any scalar (via pointer) | _Atomic qualified types | std::atomic<T> | atomic_t, atomic64_t |
| CAS | xAtomicCasWeak/Strong | atomic_compare_exchange_* | compare_exchange_* | cmpxchg |
| Compiler | GCC ≥ 4.7, Clang | C11 | C++11 | GCC (kernel) |
| Portability | GCC/Clang only | Standard C11 | Standard C++11 | Linux kernel only |
Key Differentiator: xbase's atomic macros are the thinnest possible wrapper — they add naming consistency (xAtomic* prefix) and explicit ordering parameters without any abstraction overhead. They work with any scalar type via pointer, unlike C11's _Atomic qualifier which requires type annotations.
log.h — Thread-Local Log Callback
Introduction
log.h provides a per-thread, callback-based logging mechanism for xKit's internal error reporting. Each thread can register its own log callback via xLogSetCallback(); when xLog() is called, the formatted message is dispatched to that callback. If no callback is registered, messages fall back to stderr. On fatal errors, a stack backtrace is captured and abort() is called.
Design Philosophy
- Thread-Local Callbacks — Each thread has its own log callback and userdata, stored in __thread (thread-local storage). This avoids global locks and allows different threads to route log messages to different destinations (e.g., the xlog async logger, a test harness, or a custom handler).
- Minimal and Non-Allocating — xLog() formats into a fixed-size thread-local buffer (XLOG_BUF_SIZE, default 512 bytes). No heap allocation occurs during logging, making it safe to call from low-level code paths.
- Fatal with Backtrace — When fatal = true, xLog() captures a stack trace via xBacktrace() before calling abort(). This provides immediate diagnostic information for unrecoverable errors.
- Bridge to xlog — The callback mechanism is designed to integrate with the higher-level xlog module. The xlog logger registers itself as the thread's log callback, so internal xKit errors are automatically routed through the async logging pipeline.
Architecture
graph TD
subgraph "Thread 1"
LOG1["xLog()"] --> CB1["Custom Callback"]
end
subgraph "Thread 2"
LOG2["xLog()"] --> CB2["xlog Logger"]
end
subgraph "Thread 3 (no callback)"
LOG3["xLog()"] --> STDERR["stderr"]
end
CB1 --> FILE["Log File"]
CB2 --> XLOG["Async Logger Pipeline"]
style LOG1 fill:#4a90d9,color:#fff
style LOG2 fill:#4a90d9,color:#fff
style LOG3 fill:#4a90d9,color:#fff
Implementation Details
Thread-Local State
XDEF_STRUCT(xLogCtx) {
    xLogCallback cb;          // User callback (NULL = stderr fallback)
    void *userdata;           // Forwarded to callback
    char buf[XLOG_BUF_SIZE];  // Format buffer (512 bytes)
    char bt[XLOG_BT_SIZE];    // Backtrace buffer (2048 bytes)
};

static __thread xLogCtx tl_ctx;
Each thread gets ~2.5 KB of thread-local storage for logging. The buffers are reused across calls, so there's no allocation overhead.
xLog() Flow
flowchart TD
CALL["xLog(fatal, fmt, ...)"]
FMT["vsnprintf → tl_ctx.buf"]
CHECK_FATAL{"fatal?"}
BT["xBacktraceSkip(2, bt, size)"]
CHECK_CB{"callback set?"}
CB["cb(msg, backtrace, userdata)"]
STDERR["fprintf(stderr, msg)"]
ABORT["abort()"]
CALL --> FMT
FMT --> CHECK_FATAL
CHECK_FATAL -->|Yes| BT
CHECK_FATAL -->|No| CHECK_CB
BT --> CHECK_CB
CHECK_CB -->|Yes| CB
CHECK_CB -->|No| STDERR
CB --> CHECK_FATAL2{"fatal?"}
STDERR --> CHECK_FATAL2
CHECK_FATAL2 -->|Yes| ABORT
CHECK_FATAL2 -->|No| DONE["Return"]
style ABORT fill:#e74c3c,color:#fff
style DONE fill:#50b86c,color:#fff
Buffer Size Configuration
The format buffer size can be overridden at compile time:
#define XLOG_BUF_SIZE 1024 // Must be defined before #include <xbase/log.h>
#include <xbase/log.h>
API Reference
Macros
| Macro | Default | Description |
|---|---|---|
XLOG_BUF_SIZE | 512 | Format buffer size in bytes. Override before including the header. |
Types
| Type | Description |
|---|---|
xLogCallback | void (*)(const char *msg, const char *backtrace, void *userdata) — Log callback. backtrace is non-NULL only on fatal. |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xLogSetCallback | void xLogSetCallback(xLogCallback cb, void *userdata) | Register (or clear with NULL) the current thread's log callback. | Thread-local (each thread sets its own) |
xLog | void xLog(bool fatal, const char *fmt, ...) | Format and dispatch a log message. If fatal, captures backtrace and calls abort(). | Thread-local (uses calling thread's callback) |
Usage Examples
Basic Logging with Custom Callback
#include <stdio.h>
#include <xbase/log.h>
static void my_log_handler(const char *msg, const char *backtrace,
                           void *userdata) {
    FILE *f = (FILE *)userdata;
    fprintf(f, "[MyApp] %s\n", msg);
    if (backtrace) {
        fprintf(f, "Stack trace:\n%s", backtrace);
    }
}

int main(void) {
    // Route this thread's logs to a file
    FILE *logfile = fopen("app.log", "w");
    xLogSetCallback(my_log_handler, logfile);

    xLog(false, "Application started, version %d.%d", 1, 0);
    xLog(false, "Processing %d items", 42);

    // Clear callback (revert to stderr)
    xLogSetCallback(NULL, NULL);
    xLog(false, "This goes to stderr");

    fclose(logfile);
    return 0;
}
Fatal Error with Backtrace
#include <xbase/log.h>
void dangerous_operation(void) {
    // This will print the message, capture a backtrace, and abort()
    xLog(true, "Unrecoverable error: corrupted state detected");
    // Never reaches here
}
Use Cases
- xKit Internal Error Reporting — All xKit modules use xLog() to report internal errors (e.g., allocation failures, invalid states). By registering a callback, applications can capture these messages in their logging pipeline.
- xlog Integration — The xlog module registers its logger as the thread's callback via xLogSetCallback(), routing all internal xKit messages through the async logging system.
- Test Frameworks — Test harnesses can register a callback that captures log messages for assertion, rather than letting them go to stderr.
Best Practices
- Register callbacks early. Set up xLogSetCallback() before calling any xKit functions to ensure all messages are captured.
- Don't block in callbacks. The callback runs synchronously on the calling thread. Blocking delays the caller. For async logging, use the xlog module.
- Handle NULL backtrace. The backtrace parameter is NULL for non-fatal messages. Always check before using it.
- Be aware of buffer truncation. Messages longer than XLOG_BUF_SIZE are truncated. Increase the size at compile time if needed.
Comparison with Other Libraries
| Feature | xbase log.h | syslog | fprintf(stderr) | GLib g_log |
|---|---|---|---|---|
| Callback | Per-thread | Global handler | N/A | Global handler |
| Thread Safety | Thread-local (no locks) | Thread-safe (kernel) | Thread-safe (stdio lock) | Thread-safe (global lock) |
| Backtrace | Built-in on fatal | No | No | Optional (G_DEBUG) |
| Allocation | None (stack buffer) | None (kernel) | None (stdio buffer) | Heap (GString) |
| Fatal Handling | abort() with backtrace | N/A | N/A | abort() (G_LOG_FLAG_FATAL) |
| Customization | Per-thread callback | openlog() | Redirect fd | g_log_set_handler() |
Key Differentiator: xbase's log is designed as a lightweight internal error channel, not a full logging framework. Its per-thread callback design avoids global locks and integrates naturally with the xlog async logger for production use.
backtrace.h — Platform-Adaptive Stack Backtrace
Introduction
backtrace.h captures the current call stack and formats it into a human-readable multi-line string. The unwinding backend is selected at build time with the following priority: libunwind > execinfo (macOS/glibc) > stub (unsupported platforms). It is used internally by xLog() to provide stack traces on fatal errors.
Design Philosophy
- Build-Time Backend Selection — The backend is chosen via CMake-detected macros (XK_HAS_LIBUNWIND, XK_HAS_EXECINFO). This avoids runtime overhead and ensures the best available unwinder is used on each platform.
- Graceful Degradation — On platforms without libunwind or execinfo, a stub backend returns a "not supported" message rather than crashing. This ensures xBacktrace() is always safe to call.
- Automatic Frame Skipping — Internal frames (xBacktrace → xBacktraceSkip → bt_capture) are automatically skipped so the output starts from the caller's perspective. The skip parameter allows additional frames to be skipped (useful when called through wrapper functions like xLog).
- Buffer-Based Output — The caller provides a buffer; no heap allocation occurs. This makes it safe to call from signal handlers, fatal error paths, and low-memory situations.
Architecture
graph TD
API["xBacktrace() / xBacktraceSkip()"]
SELECT{"Build-time selection"}
LIBUNWIND["libunwind<br/>unw_step() loop"]
EXECINFO["execinfo<br/>backtrace() + backtrace_symbols()"]
STUB["stub<br/>'not supported' message"]
BUF["User buffer<br/>(formatted output)"]
API --> SELECT
SELECT -->|XK_HAS_LIBUNWIND| LIBUNWIND
SELECT -->|XK_HAS_EXECINFO| EXECINFO
SELECT -->|fallback| STUB
LIBUNWIND --> BUF
EXECINFO --> BUF
STUB --> BUF
style LIBUNWIND fill:#50b86c,color:#fff
style EXECINFO fill:#4a90d9,color:#fff
style STUB fill:#f5a623,color:#fff
Implementation Details
Backend Selection
| Backend | Macro | Platform | Quality |
|---|---|---|---|
| libunwind | XK_HAS_LIBUNWIND | Linux (with libunwind installed) | Best — accurate unwinding, symbol + offset |
| execinfo | XK_HAS_EXECINFO | macOS, Linux (glibc) | Good — requires -rdynamic on Linux for symbols |
| stub | (fallback) | Any | Minimal — returns "not supported" message |
Output Format
Each frame is formatted as:
#0 0x7fff8a1b2c3d symbol_name+0x1a
#1 0x7fff8a1b2c3d another_function+0x42
#2 0x7fff8a1b2c3d <unknown>
- #N — Frame number (0 = most recent)
- 0xADDR — Instruction pointer address
- symbol+offset — Function name and offset (if available)
- <unknown> — When symbol resolution fails
Frame Skipping
Call stack:
bt_capture() ← INTERNAL_SKIP (2 frames)
xBacktraceSkip() ← INTERNAL_SKIP
xLog() ← user skip = 2 (from xLog)
user_function() ← first visible frame
main()
xBacktrace() calls xBacktraceSkip(0, ...), which adds INTERNAL_SKIP = 2 to skip its own frames. xLog() calls xBacktraceSkip(2, ...) to also skip its own internal frames.
libunwind Backend
Uses unw_getcontext() → unw_init_local() → unw_step() loop. For each frame:
- unw_get_reg(UNW_REG_IP) — Get instruction pointer
- unw_get_proc_name() — Get symbol name and offset
execinfo Backend
Uses backtrace() to capture frame addresses, then backtrace_symbols() to resolve names. On Linux, link with -rdynamic to export symbols for resolution.
API Reference
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xBacktrace | int xBacktrace(char *buf, size_t size) | Capture the call stack into buf. Equivalent to xBacktraceSkip(0, buf, size). | Thread-safe (uses only local/stack state) |
xBacktraceSkip | int xBacktraceSkip(int skip, char *buf, size_t size) | Capture the call stack, skipping skip additional frames beyond internal frames. | Thread-safe |
Parameters
| Parameter | Description |
|---|---|
skip | Number of additional frames to skip (0 = no extra skipping) |
buf | Destination buffer. May be NULL (returns 0). |
size | Size of buf in bytes. |
Return Value
Number of bytes written (excluding trailing \0), or 0 if buf is NULL or size is 0.
Usage Examples
Capture and Print Stack Trace
#include <stdio.h>
#include <xbase/backtrace.h>
void foo(void) {
char buf[4096];
int n = xBacktrace(buf, sizeof(buf));
if (n > 0) {
printf("Stack trace:\n%s", buf);
}
}
void bar(void) { foo(); }
int main(void) {
bar();
return 0;
}
Output (with execinfo on macOS):
Stack trace:
#0 0x100003f20 foo+0x20
#1 0x100003f80 bar+0x10
#2 0x100003fa0 main+0x10
Skip Wrapper Frames
#include <stdio.h>
#include <xbase/backtrace.h>
// Custom error reporter that skips its own frame
void report_error(const char *msg) {
char bt[2048];
xBacktraceSkip(1, bt, sizeof(bt)); // Skip report_error itself
fprintf(stderr, "Error: %s\nBacktrace:\n%s", msg, bt);
}
Use Cases
- Fatal Error Diagnostics — xLog() captures a backtrace on fatal errors, providing immediate context for debugging crashes.
- Debug Assertions — Custom assertion macros can include xBacktrace() to show where the assertion failed.
- Memory Leak Detection — Record allocation backtraces to identify where leaked objects were created.
Best Practices
- Provide a large enough buffer. 4096 bytes is usually sufficient for 20-30 frames. The output is truncated (not corrupted) if the buffer is too small.
- Link with -rdynamic on Linux. Without it, the execinfo backend shows only addresses, not symbol names.
- Install libunwind for best results on Linux. It provides more accurate unwinding than execinfo, especially through optimized code and signal handlers.
- Don't call from signal handlers with execinfo. backtrace_symbols() calls malloc(), which is not async-signal-safe. libunwind is safer in this context.
Comparison with Other Libraries
| Feature | xbase backtrace.h | glibc backtrace() | libunwind | Boost.Stacktrace | Windows CaptureStackBackTrace |
|---|---|---|---|---|---|
| Platform | macOS + Linux + stub | Linux (glibc) | Linux + macOS | Cross-platform | Windows |
| Accuracy | Backend-dependent | Good (glibc) | Excellent | Backend-dependent | Good |
| Symbol Resolution | Built-in | backtrace_symbols() | unw_get_proc_name() | Backend-dependent | SymFromAddr() |
| Allocation | None (user buffer) | malloc() for symbols | None | Heap | None |
| Signal Safety | libunwind: yes, execinfo: no | No (malloc) | Yes | No | Yes |
| Frame Skipping | Built-in (skip param) | Manual | Manual | Manual | FramesToSkip param |
Key Differentiator: xbase's backtrace provides a simple, buffer-based API with automatic frame skipping and graceful degradation across platforms. It's designed for integration into error reporting paths where heap allocation is undesirable.
socket.h — Async Socket
Introduction
socket.h provides an async socket abstraction built on top of xEventLoop. It wraps the POSIX socket API with automatic non-blocking setup, event loop registration, and idle-timeout support. When a socket becomes readable, writable, or times out, a single unified callback is invoked with the appropriate event mask.
Design Philosophy
- Thin Wrapper, Not a Framework — xSocket adds just enough abstraction to eliminate boilerplate (non-blocking setup, FD_CLOEXEC, event registration) without hiding the underlying fd. You can always retrieve the raw fd via xSocketFd() for direct system calls.
- Idle-Timeout Semantics — Read and write timeouts are reset on every corresponding I/O event, implementing idle-timeout behavior. This is ideal for detecting dead connections: if no data arrives within the timeout period, the callback fires with xEvent_Timeout.
- Unified Callback — A single xSocketFunc callback handles all events (read, write, timeout). The mask parameter tells you what happened, and the xEvent_Timeout flag is OR'd with xEvent_Read or xEvent_Write to indicate which direction timed out.
- Lifecycle Tied to Event Loop — A socket is created and destroyed in the context of an event loop. xSocketDestroy() cancels timers, removes the event source, closes the fd, and frees the handle in one call.
Architecture
graph TD
APP["Application"] -->|"xSocketCreate()"| SOCKET["xSocket"]
SOCKET -->|"xEventAdd()"| LOOP["xEventLoop"]
LOOP -->|"I/O ready"| TRAMP["trampoline()"]
TRAMP -->|"reset timers"| TIMER["Timer Heap"]
TRAMP -->|"forward"| CB["callback(sock, mask, userp)"]
TIMER -->|"timeout"| TIMEOUT_CB["timeout_cb()"]
TIMEOUT_CB -->|"xEvent_Timeout"| CB
style SOCKET fill:#4a90d9,color:#fff
style LOOP fill:#f5a623,color:#fff
style CB fill:#50b86c,color:#fff
Implementation Details
Internal Structure
struct xSocket_ {
int fd; // Underlying file descriptor
xEventLoop loop; // Bound event loop
xEventSource source; // Registered event source
xEventMask mask; // Current event mask
xSocketFunc callback; // User callback
void *userp; // User data
xEventTimer read_timer; // Read idle timeout timer
xEventTimer write_timer; // Write idle timeout timer
int read_timeout_ms; // Read timeout setting (0 = disabled)
int write_timeout_ms; // Write timeout setting (0 = disabled)
};
Trampoline Pattern
The socket registers an internal trampoline() function as the event callback with the event loop. This trampoline:
- Resets idle timers — On xEvent_Read, cancels and re-arms the read timer. On xEvent_Write, cancels and re-arms the write timer.
- Forwards to user callback — Calls callback(sock, mask, userp) with the original event mask.
This ensures idle timers are always reset transparently, without requiring the user to manage them manually.
Socket Creation
xSocketCreate() performs these steps in a single call:
1. socket(family, type, protocol) — On Linux/BSD, SOCK_CLOEXEC | SOCK_NONBLOCK set both flags in one syscall. On other platforms, fcntl() is used as a fallback.
2. xEventAdd(loop, fd, mask, trampoline, socket) — Registers with the event loop.
3. Returns the opaque xSocket handle.
Timeout Mechanism
sequenceDiagram
participant App
participant Socket as xSocket
participant L as xEventLoop
participant Timer as Timer Heap
App->>Socket: xSocketSetTimeout(sock, 5000, 3000)
Socket->>Timer: arm read timer (5s)
Socket->>Timer: arm write timer (3s)
Note over L: Data arrives on fd
L->>Socket: trampoline(fd, xEvent_Read)
Socket->>Timer: cancel + re-arm read timer (5s)
Socket->>App: callback(sock, xEvent_Read)
Note over Timer: 5 seconds of silence...
Timer->>Socket: read_timeout_cb()
Socket->>App: callback(sock, xEvent_Timeout | xEvent_Read)
API Reference
Types
| Type | Description |
|---|---|
xSocket | Opaque handle to an async socket |
xSocketFunc | void (*)(xSocket sock, xEventMask mask, void *arg) — Socket event callback |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xSocketCreate | xSocket xSocketCreate(xEventLoop loop, int family, int type, int protocol, xEventMask mask, xSocketFunc callback, void *userp) | Create a non-blocking socket and register with the event loop. | Not thread-safe |
xSocketDestroy | void xSocketDestroy(xEventLoop loop, xSocket sock) | Cancel timers, remove from event loop, close fd, free handle. Safe with NULL. | Not thread-safe |
xSocketSetMask | xErrno xSocketSetMask(xEventLoop loop, xSocket sock, xEventMask mask) | Change the watched event mask. | Not thread-safe |
xSocketSetTimeout | xErrno xSocketSetTimeout(xSocket sock, int read_timeout_ms, int write_timeout_ms) | Set idle timeouts. Pass 0 to cancel. Replaces previous settings. | Not thread-safe |
xSocketFd | int xSocketFd(xSocket sock) | Return the underlying fd, or -1 if NULL. | Thread-safe (read-only) |
xSocketMask | xEventMask xSocketMask(xSocket sock) | Return the current event mask, or 0 if NULL. | Thread-safe (read-only) |
Callback Mask Values
| Mask | Meaning |
|---|---|
xEvent_Read | Socket is readable |
xEvent_Write | Socket is writable |
xEvent_Timeout \| xEvent_Read | Read idle timeout fired |
xEvent_Timeout \| xEvent_Write | Write idle timeout fired |
Usage Examples
TCP Echo Client with Timeout
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <xbase/socket.h>
static xEventLoop g_loop;
static void on_socket(xSocket sock, xEventMask mask, void *arg) {
(void)arg;
if (mask & xEvent_Timeout) {
printf("Timeout on %s\n",
(mask & xEvent_Read) ? "read" : "write");
xSocketDestroy(g_loop, sock);
xEventLoopStop(g_loop);
return;
}
if (mask & xEvent_Read) {
char buf[1024];
ssize_t n;
while ((n = read(xSocketFd(sock), buf, sizeof(buf))) > 0) {
printf("Received: %.*s\n", (int)n, buf);
}
}
if (mask & xEvent_Write) {
const char *msg = "Hello, server!";
write(xSocketFd(sock), msg, strlen(msg));
// Switch to read-only after sending
xSocketSetMask(g_loop, sock, xEvent_Read);
}
}
int main(void) {
g_loop = xEventLoopCreate();
xSocket sock = xSocketCreate(g_loop, AF_INET, SOCK_STREAM, 0,
xEvent_Write, on_socket, NULL);
if (!sock) return 1;
// Set 5-second read idle timeout
xSocketSetTimeout(sock, 5000, 0);
// Connect (non-blocking)
struct sockaddr_in addr = {
.sin_family = AF_INET,
.sin_port = htons(8080),
};
inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
connect(xSocketFd(sock), (struct sockaddr *)&addr, sizeof(addr));
xEventLoopRun(g_loop);
xEventLoopDestroy(g_loop);
return 0;
}
UDP Receiver with Idle Timeout
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <xbase/socket.h>
static void on_udp(xSocket sock, xEventMask mask, void *arg) {
xEventLoop loop = (xEventLoop)arg;
if (mask & xEvent_Timeout) {
printf("No data for 10 seconds, shutting down.\n");
xSocketDestroy(loop, sock);
xEventLoopStop(loop);
return;
}
if (mask & xEvent_Read) {
char buf[65536];
ssize_t n;
while ((n = read(xSocketFd(sock), buf, sizeof(buf))) > 0) {
printf("UDP: %.*s\n", (int)n, buf);
}
}
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xSocket sock = xSocketCreate(loop, AF_INET, SOCK_DGRAM, 0,
xEvent_Read, on_udp, loop);
struct sockaddr_in addr = {
.sin_family = AF_INET,
.sin_port = htons(9999),
.sin_addr.s_addr = INADDR_ANY,
};
bind(xSocketFd(sock), (struct sockaddr *)&addr, sizeof(addr));
// 10-second read idle timeout
xSocketSetTimeout(sock, 10000, 0);
xEventLoopRun(loop);
xEventLoopDestroy(loop);
return 0;
}
Use Cases
- Network Servers — Create listening sockets, accept connections, and manage each client with its own xSocket + idle timeout. Dead connections are automatically detected.
- Protocol Clients — Build async clients (HTTP, Redis, etc.) that connect, send requests, and wait for responses with timeout protection.
- Real-Time Data Feeds — Monitor UDP multicast sockets with idle timeouts to detect feed outages.
Best Practices
- Always drain in edge-triggered mode. Since the underlying event loop is edge-triggered, read/write until EAGAIN in every callback.
- Use idle timeouts for connection health. Set read_timeout_ms to detect dead peers. The timeout resets automatically on each read event.
- Destroy sockets before the event loop. xSocketDestroy() calls xEventDel() and xEventLoopTimerCancel(), which require a valid event loop.
- Check the timeout direction. When xEvent_Timeout fires, check mask & xEvent_Read vs. mask & xEvent_Write to know which direction timed out.
- Don't close the fd manually. xSocketDestroy() closes it for you. Closing it separately leads to double-close bugs.
Comparison with Other Libraries
| Feature | xbase socket.h | POSIX socket API | libuv uv_tcp_t | Boost.Asio |
|---|---|---|---|---|
| Non-blocking Setup | Automatic (SOCK_NONBLOCK + FD_CLOEXEC) | Manual (fcntl) | Automatic | Automatic |
| Event Registration | Automatic (via xEventLoop) | Manual (epoll_ctl / kevent) | Automatic | Automatic |
| Idle Timeout | Built-in (xSocketSetTimeout) | Manual (timer + bookkeeping) | Manual (uv_timer) | Manual (deadline_timer) |
| Callback Style | Single unified callback with mask | N/A (blocking or manual poll) | Separate read/write callbacks | Separate handlers |
| Raw fd Access | xSocketFd() | Direct | uv_fileno() | native_handle() |
| Buffered I/O | No (raw fd) | No | Yes (uv_read_start) | Yes (async_read) |
| Platform | macOS + Linux | POSIX | Cross-platform | Cross-platform |
Key Differentiator: xbase's socket abstraction is intentionally thin — it handles the boilerplate (non-blocking, event registration, idle timeout) but leaves data reading/writing to the caller via the raw fd. This gives maximum flexibility without imposing a buffering strategy.
io.h — Abstract I/O Interfaces
Introduction
io.h defines four lightweight I/O interfaces — xReader, xWriter, xSeeker, xCloser — inspired by Go's io.Reader / io.Writer / io.Seeker / io.Closer. Each interface is a small struct containing a function pointer and an opaque void *ctx, making it trivial to adapt any object that provides the matching function signature.
On top of these interfaces, io.h provides a set of convenience functions (xRead, xReadFull, xReadAll, xWrite, xWritev, xSeek, xClose) that operate generically on any implementation, enabling code reuse across TCP connections, TLS streams, file descriptors, in-memory buffers, and more.
Design Philosophy
- Value-Type Interfaces — Each interface is a plain struct (function pointer + context), not a heap-allocated object. They are cheap to copy, pass by value, and require no memory management.
- POSIX Semantics — Function signatures mirror their POSIX counterparts: read(2), writev(2), lseek(2), close(2). This makes the learning curve near-zero for C developers.
- Composable Helpers — Higher-level functions like xReadFull and xReadAll are built on top of xReader, so any object that provides a reader automatically gains these capabilities.
- Zero-Initialized = Invalid — A zero-initialized struct (all NULL) is treated as "not set". Convenience functions can detect this and return an error instead of crashing.
Architecture
graph TD
subgraph "Interfaces"
R["xReader<br/>ssize_t read(ctx, buf, len)"]
W["xWriter<br/>ssize_t writev(ctx, iov, iovcnt)"]
S["xSeeker<br/>off_t seek(ctx, offset, whence)"]
C["xCloser<br/>int close(ctx)"]
end
subgraph "Convenience Functions"
XR["xRead"]
XRF["xReadFull"]
XRA["xReadAll"]
XW["xWrite"]
XWV["xWritev"]
XS["xSeek"]
XC["xClose"]
end
subgraph "Implementations"
TCP["xTcpConn<br/>xTcpConnReader / xTcpConnWriter"]
IOB["xIOBuffer<br/>(read/writev funcs)"]
FD["File Descriptor<br/>(custom wrapper)"]
end
XR --> R
XRF --> R
XRA --> R
XW --> W
XWV --> W
XS --> S
XC --> C
TCP -.->|"adapts to"| R
TCP -.->|"adapts to"| W
IOB -.->|"adapts to"| R
IOB -.->|"adapts to"| W
FD -.->|"adapts to"| R
FD -.->|"adapts to"| W
style R fill:#4a90d9,color:#fff
style W fill:#4a90d9,color:#fff
style S fill:#4a90d9,color:#fff
style C fill:#4a90d9,color:#fff
style XRF fill:#50b86c,color:#fff
style XRA fill:#50b86c,color:#fff
Implementation Details
Interface Structs
Each interface is a two-field struct:
| Interface | Function Pointer | Semantics |
|---|---|---|
xReader | ssize_t (*read)(void *ctx, void *buf, size_t len) | Returns bytes read, 0 on EOF, -1 on error |
xWriter | ssize_t (*writev)(void *ctx, const struct iovec *iov, int iovcnt) | Returns bytes written, -1 on error |
xSeeker | off_t (*seek)(void *ctx, off_t offset, int whence) | Returns resulting offset, -1 on error |
xCloser | int (*close)(void *ctx) | Returns 0 on success, -1 on failure |
xReadFull — Retry Logic
xReadFull loops calling r.read until exactly len bytes are read or EOF is reached. It automatically retries on EAGAIN and EINTR, making it suitable for both blocking and non-blocking file descriptors:
while (total < len):
n = r.read(ctx, buf + total, len - total)
if n > 0: total += n
if n == 0: break // EOF
if n == -1:
if EAGAIN or EINTR: continue
else: return -1 // real error
return total
xReadAll — Dynamic Buffer Growth
xReadAll reads until EOF into a dynamically allocated buffer. It starts with a 4096-byte allocation and doubles the capacity each time the buffer fills up:
cap = 4096, buf = malloc(cap)
loop:
if total == cap: realloc(buf, cap * 2)
n = r.read(ctx, buf + total, cap - total)
if n > 0: total += n
if n == 0: *out = buf, *out_len = total, return 0
if n == -1:
if EAGAIN or EINTR: continue
else: free(buf), return -1
The caller is responsible for freeing the returned buffer with free().
xWrite — Single Buffer Convenience
xWrite wraps a contiguous buffer into a single struct iovec and delegates to w.writev, avoiding the need for callers to construct iovec arrays for simple writes:
ssize_t xWrite(xWriter w, const void *buf, size_t len) {
struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
return w.writev(w.ctx, &iov, 1);
}
API Reference
Types
| Type | Description |
|---|---|
xReader | Abstract reader — { ssize_t (*read)(void*, void*, size_t), void *ctx } |
xWriter | Abstract writer — { ssize_t (*writev)(void*, const struct iovec*, int), void *ctx } |
xSeeker | Abstract seeker — { off_t (*seek)(void*, off_t, int), void *ctx } |
xCloser | Abstract closer — { int (*close)(void*), void *ctx } |
Functions
| Function | Signature | Description |
|---|---|---|
xRead | ssize_t xRead(xReader r, void *buf, size_t len) | Single read; returns bytes read, 0 on EOF, -1 on error |
xWrite | ssize_t xWrite(xWriter w, const void *buf, size_t len) | Write a contiguous buffer (wraps into single iovec) |
xWritev | ssize_t xWritev(xWriter w, const struct iovec *iov, int iovcnt) | Scatter-gather write |
xSeek | off_t xSeek(xSeeker s, off_t offset, int whence) | Reposition offset (SEEK_SET / SEEK_CUR / SEEK_END) |
xClose | int xClose(xCloser c) | Close the underlying resource |
xReadFull | ssize_t xReadFull(xReader r, void *buf, size_t len) | Read exactly len bytes, retrying on partial reads and EAGAIN/EINTR |
xReadAll | int xReadAll(xReader r, void **out, size_t *out_len) | Read until EOF into a malloc'd buffer; caller must free(*out) |
Usage Examples
Creating a Custom Reader
#include <xbase/io.h>
#include <stdint.h>
#include <unistd.h>
// Adapt a file descriptor into an xReader
static ssize_t fd_read(void *ctx, void *buf, size_t len) {
int fd = (int)(intptr_t)ctx;
return read(fd, buf, len);
}
xReader make_fd_reader(int fd) {
xReader r;
r.read = fd_read;
r.ctx = (void *)(intptr_t)fd;
return r;
}
Reading Exactly N Bytes
#include <xbase/io.h>
void read_header(xReader r) {
char header[64];
ssize_t n = xReadFull(r, header, sizeof(header));
if (n < 0) {
// error
} else if ((size_t)n < sizeof(header)) {
// EOF before full header
} else {
// got all 64 bytes
}
}
Reading All Data Until EOF
#include <xbase/io.h>
#include <stdlib.h>
void read_body(xReader r) {
void *data;
size_t data_len;
if (xReadAll(r, &data, &data_len) == 0) {
// process data (data_len bytes at data)
free(data);
} else {
// error
}
}
Using with xTcpConn
xTcpConn (from <xnet/tcp.h>) provides adapter functions that return xReader and xWriter bound to the connection's transport layer. This allows TCP connections to be used with all generic I/O helpers:
#include <xbase/io.h>
#include <xnet/tcp.h>
void handle_connection(xTcpConn conn) {
// Get I/O adapters from the TCP connection
xReader r = xTcpConnReader(conn);
xWriter w = xTcpConnWriter(conn);
// Read a fixed-size header
char header[16];
ssize_t n = xReadFull(r, header, sizeof(header));
if (n < (ssize_t)sizeof(header)) return;
// Read the entire body until the peer closes
void *body;
size_t body_len;
if (xReadAll(r, &body, &body_len) != 0) return;
// Echo back through the generic writer
xWrite(w, body, body_len);
free(body);
}
Scatter-Gather Write
#include <string.h>
#include <xbase/io.h>
void send_http_response(xWriter w) {
const char *header = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n";
const char *body = "Hello";
struct iovec iov[2] = {
{ .iov_base = (void *)header, .iov_len = strlen(header) },
{ .iov_base = (void *)body, .iov_len = 5 },
};
xWritev(w, iov, 2);
}
Integration with xTcpConn
xTcpConn provides two adapter functions that bridge the TCP connection to the generic I/O interfaces:
| Function | Returns | Description |
|---|---|---|
xTcpConnReader(conn) | xReader | Reader bound to transport.read — equivalent to xTcpConnRecv |
xTcpConnWriter(conn) | xWriter | Writer bound to transport.writev — equivalent to xTcpConnSendIov |
These adapters are zero-allocation: they copy the function pointer and context from the connection's internal xTransport into a stack-allocated struct. The returned interfaces are valid as long as the connection (and its transport) remains alive.
Why no xCloser adapter? xTcpConnClose() requires an xEventLoop parameter to properly unregister the socket from the event loop, which does not fit the int (*close)(void *ctx) signature.
Best Practices
- Prefer xReadFull over manual loops when you need an exact number of bytes. It handles EAGAIN, EINTR, and partial reads correctly.
- Always free() the buffer from xReadAll on success. On error, the function cleans up internally.
- Use xWrite for simple writes, xWritev for multi-buffer writes. xWrite is a thin wrapper that constructs a single iovec — no performance penalty.
- Check for zero-initialized interfaces before passing them to helpers. If xTcpConnReader(NULL) returns a zero struct, calling xRead on it will dereference a NULL function pointer.
- Obtain adapters once, use many times. Since xTcpConnReader / xTcpConnWriter are value types, you can call them once at the start of a handler and reuse the result throughout.
Comparison with Other Libraries
| Feature | xbase io.h | Go io.Reader/Writer | POSIX read/write | C++ std::iostream |
|---|---|---|---|---|
| Abstraction | Struct (fn ptr + ctx) | Interface (vtable) | Raw syscall | Class hierarchy |
| Allocation | Zero (stack value) | Heap (interface value) | N/A | Heap (stream object) |
| Composability | Via helper functions | Via io.Copy, io.ReadAll, etc. | Manual loops | Via stream operators |
| Scatter-Gather | Built-in (xWritev) | No (use io.MultiWriter) | writev(2) | No |
| Read-Until-EOF | xReadAll (malloc'd buffer) | io.ReadAll ([]byte) | Manual loop | std::istreambuf_iterator |
| Error Model | Return value (-1 + errno) | (n, error) tuple | Return value (-1 + errno) | Stream state flags |
xbuf — Buffer Toolkit
Introduction
xbuf is xKit's buffer module, providing three distinct buffer types optimized for different use cases: a linear auto-growing buffer, a fixed-size ring buffer, and a reference-counted block-chain I/O buffer. Together they cover the full spectrum of buffering needs — from simple byte accumulation to zero-copy network I/O.
Design Philosophy
- One Buffer Does Not Fit All — Rather than a single "universal" buffer, xbuf offers three specialized types. Each makes different trade-offs between simplicity, performance, and memory efficiency.
- Flexible Array Member Layout — Both xBuffer and xRingBuffer allocate header + data in a single malloc() call using C99 flexible array members. This eliminates pointer indirection and improves cache locality.
- Reference-Counted Block Sharing — xIOBuffer uses reference-counted blocks that can be shared across multiple buffers. This enables zero-copy split and append operations critical for high-performance network protocols.
- I/O Integration — All three types provide ReadFd / WriteFd helpers that handle EINTR retries and scatter-gather I/O (readv/writev), making them ready for event-driven network programming.
Architecture
graph TD
subgraph "xbuf Module"
BUF["xBuffer<br/>Linear auto-growing<br/>Single contiguous allocation"]
RING["xRingBuffer<br/>Fixed-size circular<br/>Power-of-2 masking"]
IO["xIOBuffer<br/>Block-chain<br/>Reference-counted"]
end
subgraph "Shared Infrastructure"
POOL["Block Pool<br/>Treiber stack freelist"]
ATOMIC["xbase/atomic.h<br/>Lock-free operations"]
end
IO --> POOL
POOL --> ATOMIC
subgraph "I/O Layer"
READ["read() / readv()"]
WRITE["write() / writev()"]
end
BUF --> READ
BUF --> WRITE
RING --> READ
RING --> WRITE
IO --> READ
IO --> WRITE
style BUF fill:#4a90d9,color:#fff
style RING fill:#f5a623,color:#fff
style IO fill:#50b86c,color:#fff
Sub-Module Overview
| Header | Type | Description | Doc |
|---|---|---|---|
buf.h | xBuffer | Linear auto-growing byte buffer with flexible array member layout | buf.md |
ring.h | xRingBuffer | Fixed-size circular buffer with power-of-2 bitmask indexing | ring.md |
io.h | xIOBuffer | Reference-counted block-chain I/O buffer with zero-copy operations | io.md |
How to Choose
| Criterion | xBuffer | xRingBuffer | xIOBuffer |
|---|---|---|---|
| Memory layout | Contiguous | Contiguous (circular) | Non-contiguous (block chain) |
| Growth | Auto-growing (2x realloc) | Fixed size (never grows) | Auto-growing (new blocks) |
| Best for | Accumulating variable-length data | Fixed-capacity producer-consumer | High-throughput network I/O |
| Zero-copy split | No | No | Yes |
| Zero-copy append | No | No | Yes (between xIOBuffers) |
| Scatter-gather I/O | No (single buffer) | Yes (up to 2 iovecs) | Yes (N iovecs) |
| Memory overhead | Minimal (1 allocation) | Minimal (1 allocation) | Per-block overhead + ref array |
| Thread safety | Not thread-safe | Not thread-safe | Block pool is thread-safe |
Decision Guide
Need to accumulate data of unknown size?
→ xBuffer (simple, auto-growing)
Need a fixed-capacity FIFO between producer and consumer?
→ xRingBuffer (no allocation after creation)
Need zero-copy operations or scatter-gather I/O for networking?
→ xIOBuffer (block-chain with reference counting)
Quick Start
#include <stdio.h>
#include <xbuf/buf.h>
#include <xbuf/ring.h>
#include <xbuf/io.h>
int main(void) {
// 1. Linear buffer: accumulate data
xBuffer buf = xBufferCreate(256);
xBufferAppend(&buf, "Hello, ", 7);
xBufferAppend(&buf, "xbuf!", 5);
printf("buf: %.*s\n", (int)xBufferLen(buf), (const char *)xBufferData(buf));
xBufferDestroy(buf);
// 2. Ring buffer: fixed-capacity FIFO
xRingBuffer ring = xRingBufferCreate(1024);
xRingBufferWrite(ring, "circular", 8);
char out[16];
size_t n = xRingBufferRead(ring, out, sizeof(out));
printf("ring: %.*s\n", (int)n, out);
xRingBufferDestroy(ring);
// 3. IO buffer: block-chain with zero-copy
xIOBuffer io;
xIOBufferInit(&io);
xIOBufferAppend(&io, "block-chain I/O", 15);
char linear[64];
xIOBufferCopyTo(&io, linear);
printf("io: %.*s\n", (int)xIOBufferLen(&io), linear);
xIOBufferDeinit(&io);
return 0;
}
Relationship with Other Modules
- xbase — xIOBuffer uses atomic.h for lock-free block pool management and reference counting.
- xhttp — The HTTP client (client.h) uses xIOBuffer for response body accumulation and SSE stream parsing.
- xlog — The async logger (logger.h) may use xBuffer for log message formatting.
buf.h — Linear Auto-Growing Buffer
Introduction
buf.h provides xBuffer, a simple contiguous byte buffer that automatically grows when more space is needed. It maintains separate read and write positions, supporting efficient append-and-consume patterns. The buffer header and data area are allocated in a single malloc() call using a C99 flexible array member, avoiding an extra pointer indirection.
Design Philosophy
- Single Allocation — Header and data live in one contiguous block (struct + flexible array member). This means one malloc(), one free(), and excellent cache locality.
- Handle Indirection — Because realloc() may relocate the entire object, write APIs take xBuffer *bufp (pointer to handle) so the caller's handle stays valid after growth.
- Compact Before Grow — When the buffer needs more space, it first tries to compact (slide unread data to the front) before resorting to realloc(). This reclaims consumed space without allocation.
- 2x Growth — When reallocation is necessary, capacity doubles each time, providing amortized O(1) append.
Architecture
graph LR
subgraph "xBuffer Lifecycle"
CREATE["xBufferCreate(cap)"] --> USE["Append / Read / Consume"]
USE --> GROW{"Need more space?"}
GROW -->|Compact| USE
GROW -->|Realloc 2x| USE
USE --> DESTROY["xBufferDestroy()"]
end
style CREATE fill:#4a90d9,color:#fff
style DESTROY fill:#e74c3c,color:#fff
Implementation Details
Memory Layout
Single malloc() allocation:
┌──────────────────┬──────────────────────────────────────────┐
│ xBuffer_ header │ data[cap] (flexible array member) │
│ rpos, wpos, cap │ │
└──────────────────┴──────────────────────────────────────────┘
↑ ↑ ↑
data+rpos data+wpos data+cap
│←readable→│←────writable──────→│
Internal Structure
XDEF_STRUCT(xBuffer_) {
size_t rpos; // Read position (start of unread data)
size_t wpos; // Write position (end of unread data)
size_t cap; // Total data capacity
char data[]; // Flexible array member
};
Growth Strategy
flowchart TD
APPEND["xBufferAppend(bufp, data, len)"]
CHECK{"wpos + len <= cap?"}
WRITE["memcpy at wpos, advance wpos"]
COMPACT{"rpos > 0 AND<br/>unread + len <= cap?"}
MEMMOVE["memmove data to front<br/>rpos=0, wpos=unread"]
REALLOC["realloc(cap * 2)"]
UPDATE["Update *bufp"]
APPEND --> CHECK
CHECK -->|Yes| WRITE
CHECK -->|No| COMPACT
COMPACT -->|Yes| MEMMOVE --> WRITE
COMPACT -->|No| REALLOC --> UPDATE --> WRITE
style WRITE fill:#50b86c,color:#fff
style REALLOC fill:#f5a623,color:#fff
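The compact-before-grow path above can be sketched in portable C99. This is an illustrative reduction of the described strategy, not the xKit source; buf and buf_append are hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header and data share one allocation via a flexible array member. */
typedef struct buf {
    size_t rpos, wpos, cap;
    char   data[];
} buf;

static buf *buf_create(size_t cap) {
    buf *b = malloc(sizeof(buf) + cap);
    if (b) { b->rpos = 0; b->wpos = 0; b->cap = cap; }
    return b;
}

/* Append: write in place if room, else compact, else realloc at 2x.
 * Takes buf ** because realloc() may relocate the whole object. */
static int buf_append(buf **bp, const void *src, size_t len) {
    buf *b = *bp;
    if (b->wpos + len > b->cap) {
        size_t unread = b->wpos - b->rpos;
        if (b->rpos > 0 && unread + len <= b->cap) {
            /* Compact: slide unread bytes to the front, no allocation. */
            memmove(b->data, b->data + b->rpos, unread);
            b->rpos = 0;
            b->wpos = unread;
        } else {
            size_t ncap = b->cap;
            while (b->wpos + len > ncap) ncap *= 2;   /* 2x growth */
            buf *nb = realloc(b, sizeof(buf) + ncap); /* may move! */
            if (nb == NULL) return -1;
            nb->cap = ncap;
            *bp = b = nb;   /* update the caller's handle */
        }
    }
    memcpy(b->data + b->wpos, src, len);
    b->wpos += len;
    return 0;
}
```

Note how the compact branch makes room for the second append below without touching the allocator, which is exactly the append-consume pattern the real buffer optimizes for.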
Operations and Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
xBufferAppend | Amortized O(1) per byte | May trigger compact or realloc |
xBufferConsume | O(1) | Advances read position |
xBufferCompact | O(n) | memmove of unread data |
xBufferData | O(1) | Returns data + rpos |
xBufferLen | O(1) | Returns wpos - rpos |
xBufferReadFd | O(1) | Single read() syscall |
xBufferWriteFd | O(1) | Single write() syscall |
API Reference
Lifecycle
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xBufferCreate | xBuffer xBufferCreate(size_t initial_cap) | Create a buffer. Min capacity is 64. | Not thread-safe |
xBufferDestroy | void xBufferDestroy(xBuffer buf) | Free the buffer. NULL is a no-op. | Not thread-safe |
xBufferReset | void xBufferReset(xBuffer buf) | Discard all data, keep memory. | Not thread-safe |
Write
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xBufferAppend | xErrno xBufferAppend(xBuffer *bufp, const void *data, size_t len) | Append bytes, growing if needed. | Not thread-safe |
xBufferAppendStr | xErrno xBufferAppendStr(xBuffer *bufp, const char *str) | Append a C string (excluding NUL). | Not thread-safe |
xBufferReserve | xErrno xBufferReserve(xBuffer *bufp, size_t additional) | Ensure at least additional writable bytes. | Not thread-safe |
Read
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xBufferData | const void *xBufferData(xBuffer buf) | Pointer to readable data. Valid until next mutation. | Not thread-safe |
xBufferLen | size_t xBufferLen(xBuffer buf) | Number of readable bytes. | Not thread-safe |
xBufferCap | size_t xBufferCap(xBuffer buf) | Total allocated capacity. | Not thread-safe |
xBufferWritable | size_t xBufferWritable(xBuffer buf) | Writable bytes (cap - wpos). | Not thread-safe |
xBufferConsume | void xBufferConsume(xBuffer buf, size_t n) | Advance read position by n bytes. | Not thread-safe |
xBufferCompact | void xBufferCompact(xBuffer buf) | Move unread data to front, maximize writable space. | Not thread-safe |
I/O Helpers
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xBufferReadFd | ssize_t xBufferReadFd(xBuffer *bufp, int fd) | Read from fd into buffer (ensures 4KB space). | Not thread-safe |
xBufferWriteFd | ssize_t xBufferWriteFd(xBuffer buf, int fd) | Write readable data to fd, consume written bytes. | Not thread-safe |
Usage Examples
Basic Append and Read
#include <stdio.h>
#include <xbuf/buf.h>
int main(void) {
xBuffer buf = xBufferCreate(256);
// Append data
xBufferAppend(&buf, "Hello, ", 7);
xBufferAppendStr(&buf, "World!");
// Read data
printf("Content: %.*s\n", (int)xBufferLen(buf),
(const char *)xBufferData(buf));
// Output: Content: Hello, World!
// Consume partial data
xBufferConsume(buf, 7);
printf("After consume: %.*s\n", (int)xBufferLen(buf),
(const char *)xBufferData(buf));
// Output: After consume: World!
// Compact to reclaim consumed space
xBufferCompact(buf);
xBufferDestroy(buf);
return 0;
}
Network I/O
#include <xbuf/buf.h>
#include <unistd.h>
void handle_connection(int sockfd) {
xBuffer buf = xBufferCreate(4096);
// Read from socket
ssize_t n = xBufferReadFd(&buf, sockfd);
if (n > 0) {
// Process data...
// Write response back
xBufferAppendStr(&buf, "HTTP/1.1 200 OK\r\n\r\n");
xBufferWriteFd(buf, sockfd);
}
xBufferDestroy(buf);
}
Use Cases
- HTTP Response Accumulation — Accumulate response body chunks of unknown total size. The auto-growing behavior handles variable-length responses.
- Protocol Parsing — Append incoming data, parse complete messages from the front, consume parsed bytes. The compact operation reclaims space without reallocation.
- Log Message Formatting — Build log messages incrementally with multiple append calls before flushing.
Best Practices
- Always pass &buf to write APIs. Functions that may grow the buffer take xBuffer *bufp because realloc() may relocate the object.
- Call xBufferCompact() periodically if you consume data incrementally. This avoids unnecessary reallocation by reclaiming consumed space.
- Check return values. xBufferAppend() and xBufferReserve() return xErrno_NoMemory on allocation failure.
- Don't cache xBufferData() pointers across mutating calls. Any append/reserve/compact may invalidate the pointer.
Comparison with Other Libraries
| Feature | xbuf buf.h | Go bytes.Buffer | Rust Vec<u8> | C++ std::vector<char> |
|---|---|---|---|---|
| Layout | Header + data in one allocation (FAM) | Separate header + slice | Heap-allocated array | Heap-allocated array |
| Growth | 2x realloc + compact | 2x (with copy) | 2x (with copy) | Implementation-defined |
| Read/Write cursors | Yes (rpos/wpos) | Yes (read offset) | No (manual tracking) | No (manual tracking) |
| Compact | Built-in (xBufferCompact) | Built-in (implicit) | Manual | Manual |
| I/O helpers | ReadFd/WriteFd | ReadFrom/WriteTo | Via Read/Write traits | No |
| Handle invalidation | Caller updates via *bufp | GC handles | Borrow checker | Iterator invalidation |
Key Differentiator: xBuffer's single-allocation layout (flexible array member) eliminates one level of pointer indirection compared to typical buffer implementations. The compact-before-grow strategy minimizes reallocation frequency for append-consume workloads.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/buf_bench.cpp
| Benchmark | Chunk Size | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_Buffer_Append | 16 | 4,776 | 4,776 | 3.1 GiB/s |
BM_Buffer_Append | 64 | 4,400 | 4,400 | 13.5 GiB/s |
BM_Buffer_Append | 256 | 7,892 | 7,892 | 30.2 GiB/s |
BM_Buffer_Append | 1,024 | 21,834 | 21,811 | 43.7 GiB/s |
BM_Buffer_Append | 4,096 | 91,029 | 90,958 | 41.9 GiB/s |
BM_Buffer_AppendConsume | 64 | 4,999 | 4,999 | 11.9 GiB/s |
BM_Buffer_AppendConsume | 256 | 8,241 | 8,240 | 28.9 GiB/s |
BM_Buffer_AppendConsume | 1,024 | 22,859 | 22,859 | 41.7 GiB/s |
Key Observations:
- Append throughput peaks at ~44 GiB/s for 1KB chunks, limited by memcpy bandwidth and reallocation overhead.
- AppendConsume (interleaved append + consume) achieves comparable throughput to pure append, validating the compact-before-grow strategy — consumed space is reclaimed without reallocation.
- Small chunks (16B) show lower throughput due to per-call overhead dominating the memcpy cost.
ring.h — Fixed-Size Ring Buffer
Introduction
ring.h provides xRingBuffer, a fixed-capacity circular buffer that never reallocates. It is ideal for bounded producer-consumer scenarios where a fixed memory budget is required. The capacity is rounded up to the next power of two internally, enabling bitmask indexing instead of expensive modulo operations.
Design Philosophy
- Fixed Capacity, Zero Reallocation — Once created, the ring buffer never grows. Writes that exceed capacity return xErrno_NoMemory. This makes memory usage predictable and avoids allocation latency spikes.
- Power-of-Two Masking — The internal capacity is always a power of two. Index computation uses head & mask instead of head % cap, which is significantly faster on most architectures.
- Monotonic Cursors — head (write) and tail (read) grow monotonically and never wrap. The actual array index is computed via bitmask. This simplifies the full/empty distinction: head - tail gives the exact readable byte count.
- Single Allocation — Like xBuffer, the header and data area are allocated together using a flexible array member.
- Scatter-Gather I/O — The ring buffer provides ReadIov/WriteIov helpers that fill iovec arrays for efficient readv()/writev() syscalls, handling the wrap-around transparently.
Architecture
graph LR
PRODUCER["Producer"] -->|"xRingBufferWrite"| RB["xRingBuffer<br/>(fixed capacity)"]
RB -->|"xRingBufferRead"| CONSUMER["Consumer"]
RB -->|"xRingBufferReadIov"| IOV1["iovec[2]"] -->|"writev()"| FD1["fd"]
FD2["fd"] -->|"readv()"| IOV2["iovec[2]"] -->|"xRingBufferWriteIov"| RB
style RB fill:#f5a623,color:#fff
Implementation Details
Memory Layout
Single malloc() allocation:
┌───────────────────────┬──────────────────────────────────────┐
│ xRingBuffer_ header │ data[cap] (flexible array member) │
│ cap, mask, head, tail│ │
└───────────────────────┴──────────────────────────────────────┘
Circular data layout (cap=8, mask=7):
tail & mask head & mask
↓ ↓
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ │ │ R │ R │ R │ W │ │ │
└───┴───┴───┴───┴───┴───┴───┴───┘
0 1 2 3 4 5 6 7
R = readable data (tail..head)
W = next write position
Internal Structure
XDEF_STRUCT(xRingBuffer_) {
size_t cap; // Capacity (power of two)
size_t mask; // cap - 1 (for bitmask indexing)
size_t head; // Write cursor (monotonic)
size_t tail; // Read cursor (monotonic)
char data[];// Flexible array member
};
Power-of-Two Rounding
static size_t next_pow2(size_t v) {
if (v < 16) v = 16;
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
// v |= v >> 32; (on 64-bit)
return v + 1;
}
This ensures cap is always a power of two, so mask = cap - 1 produces a valid bitmask. For example, cap = 8 → mask = 0b111.
Bitmask Indexing
Instead of:
size_t idx = head % cap; // Expensive division
The ring buffer uses:
size_t idx = head & mask; // Single AND instruction
This works because cap is a power of two: x % (2^n) == x & (2^n - 1).
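Both the rounding helper and the masking identity can be checked with a self-contained 64-bit variant (illustrative, not the library's internal code):

```c
#include <assert.h>
#include <stdint.h>

/* Bit-smearing round-up to the next power of two, written for a
 * fixed 64-bit type so the final ">> 32" shift is always valid. */
static uint64_t next_pow2_u64(uint64_t v) {
    if (v < 16) v = 16;   /* enforce a minimum capacity */
    v--;                  /* so exact powers of two map to themselves */
    v |= v >> 1;  v |= v >> 2;  v |= v >> 4;
    v |= v >> 8;  v |= v >> 16; v |= v >> 32;
    return v + 1;         /* all low bits set, +1 gives the power */
}
```

With cap = next_pow2_u64(n) and mask = cap - 1, the index h & mask equals h % cap for every monotonic cursor value h, which is what makes the bitmask substitution safe.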
Wrap-Around Write
flowchart TD
WRITE["xRingBufferWrite(rb, data, len)"]
CHECK{"len <= writable?"}
FAIL["Return xErrno_NoMemory"]
POS["pos = head & mask"]
FIRST["first = cap - pos"]
WRAP{"len <= first?"}
SINGLE["memcpy(data+pos, src, len)"]
SPLIT["memcpy(data+pos, src, first)<br/>memcpy(data, src+first, len-first)"]
ADVANCE["head += len"]
WRITE --> CHECK
CHECK -->|No| FAIL
CHECK -->|Yes| POS --> FIRST --> WRAP
WRAP -->|Yes| SINGLE --> ADVANCE
WRAP -->|No| SPLIT --> ADVANCE
style FAIL fill:#e74c3c,color:#fff
style ADVANCE fill:#50b86c,color:#fff
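The split write in the flowchart can be sketched as follows, assuming monotonic head/tail cursors and a power-of-two CAP (hypothetical names, not the xKit API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CAP  8u            /* power of two */
#define MASK (CAP - 1u)

/* Monotonic cursors: head/tail never wrap; index = cursor & MASK. */
static char     ring[CAP];
static uint64_t head, tail;

/* Write len bytes, splitting into at most two memcpy calls at the
 * physical end of the array. Returns 0, or -1 if it would overflow. */
static int ring_write(const char *src, size_t len) {
    if (len > CAP - (size_t)(head - tail)) return -1; /* no room */
    size_t pos   = (size_t)(head & MASK);
    size_t first = CAP - pos;                 /* bytes before wrap */
    if (len <= first) {
        memcpy(ring + pos, src, len);         /* single copy */
    } else {
        memcpy(ring + pos, src, first);       /* fill to the end */
        memcpy(ring, src + first, len - first); /* wrapped remainder */
    }
    head += len;
    return 0;
}
```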
Operations and Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
xRingBufferWrite | O(n) | Up to 2 memcpy calls |
xRingBufferRead | O(n) | Up to 2 memcpy calls |
xRingBufferPeek | O(n) | Like Read but doesn't advance tail |
xRingBufferDiscard | O(1) | Just advances tail |
xRingBufferLen | O(1) | head - tail |
xRingBufferReadFd | O(1) | Single readv() syscall |
xRingBufferWriteFd | O(1) | Single writev() syscall |
API Reference
Lifecycle
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferCreate | xRingBuffer xRingBufferCreate(size_t min_cap) | Create a ring buffer. Capacity rounded up to power of 2. | Not thread-safe |
xRingBufferDestroy | void xRingBufferDestroy(xRingBuffer rb) | Free the ring buffer. NULL is a no-op. | Not thread-safe |
xRingBufferReset | void xRingBufferReset(xRingBuffer rb) | Discard all data, keep memory. | Not thread-safe |
Query
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferLen | size_t xRingBufferLen(xRingBuffer rb) | Readable bytes. | Not thread-safe |
xRingBufferCap | size_t xRingBufferCap(xRingBuffer rb) | Total capacity. | Not thread-safe |
xRingBufferWritable | size_t xRingBufferWritable(xRingBuffer rb) | Writable bytes. | Not thread-safe |
xRingBufferEmpty | bool xRingBufferEmpty(xRingBuffer rb) | True if no readable data. | Not thread-safe |
xRingBufferFull | bool xRingBufferFull(xRingBuffer rb) | True if no writable space. | Not thread-safe |
Write
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferWrite | xErrno xRingBufferWrite(xRingBuffer rb, const void *data, size_t len) | Write bytes. Returns xErrno_NoMemory if full. | Not thread-safe |
Read
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferRead | size_t xRingBufferRead(xRingBuffer rb, void *out, size_t len) | Read and consume bytes. Returns actual count. | Not thread-safe |
xRingBufferPeek | size_t xRingBufferPeek(xRingBuffer rb, void *out, size_t len) | Read without consuming. | Not thread-safe |
xRingBufferDiscard | size_t xRingBufferDiscard(xRingBuffer rb, size_t n) | Discard bytes without copying. | Not thread-safe |
I/O Helpers
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xRingBufferReadIov | int xRingBufferReadIov(xRingBuffer rb, struct iovec iov[2]) | Fill iovecs with readable regions (for writev). | Not thread-safe |
xRingBufferWriteIov | int xRingBufferWriteIov(xRingBuffer rb, struct iovec iov[2]) | Fill iovecs with writable regions (for readv). | Not thread-safe |
xRingBufferReadFd | ssize_t xRingBufferReadFd(xRingBuffer rb, int fd) | Read from fd using readv(). | Not thread-safe |
xRingBufferWriteFd | ssize_t xRingBufferWriteFd(xRingBuffer rb, int fd) | Write to fd using writev(). | Not thread-safe |
Usage Examples
Basic FIFO
#include <stdio.h>
#include <xbuf/ring.h>
int main(void) {
// Request 1000 bytes; actual capacity will be 1024 (next power of 2)
xRingBuffer rb = xRingBufferCreate(1000);
printf("Capacity: %zu\n", xRingBufferCap(rb)); // 1024
// Write data
const char *msg = "Hello, Ring!";
xRingBufferWrite(rb, msg, 12);
// Read data
char out[32];
size_t n = xRingBufferRead(rb, out, sizeof(out));
printf("Read %zu bytes: %.*s\n", n, (int)n, out);
xRingBufferDestroy(rb);
return 0;
}
Network Socket Buffer
#include <xbuf/ring.h>
void event_loop_handler(int sockfd) {
xRingBuffer rb = xRingBufferCreate(65536); // 64KB ring
// Read from socket into ring buffer
ssize_t n = xRingBufferReadFd(rb, sockfd);
if (n > 0) {
// Process data...
// Write processed data back
xRingBufferWriteFd(rb, sockfd);
}
xRingBufferDestroy(rb);
}
Use Cases
- Fixed-Budget Network Buffers — When you need predictable memory usage per connection (e.g., 64KB per socket), the ring buffer provides a hard capacity limit.
- Logging Ring Buffer — Capture the last N bytes of log output. Because writes fail rather than overwrite when the buffer is full, discard old data first (e.g., via xRingBufferDiscard) to make room for new entries.
- Inter-Thread Communication — With external synchronization, a ring buffer can serve as a bounded channel between producer and consumer threads.
Best Practices
- Choose capacity carefully. The ring buffer never grows. If you write more than the capacity, the write fails. Size it for your worst-case scenario.
- Use scatter-gather I/O. xRingBufferReadFd/WriteFd use readv()/writev() to handle wrap-around in a single syscall, avoiding the need to linearize data.
- Be aware of power-of-two rounding. Requesting 1000 bytes gives you 1024. Requesting 1025 gives you 2048. Plan accordingly.
- Check xRingBufferWritable() before writing if you want to handle partial writes gracefully.
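The iovec-filling step behind the scatter-gather helpers can be sketched like this; ring_read_iov is a hypothetical reduction of the described behavior, not the actual xKit API:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/uio.h>

/* Fill up to two iovecs with the readable regions of a ring that uses
 * monotonic cursors. A subsequent writev() then flushes the data in
 * one syscall even when it wraps. Returns the number of iovecs used. */
static int ring_read_iov(char *data, size_t cap, uint64_t head,
                         uint64_t tail, struct iovec iov[2]) {
    size_t mask  = cap - 1;                 /* cap is a power of two */
    size_t len   = (size_t)(head - tail);   /* readable bytes */
    size_t pos   = (size_t)(tail & mask);
    size_t first = cap - pos;               /* bytes before the wrap */
    if (len == 0) return 0;
    if (len <= first) {                     /* contiguous region */
        iov[0].iov_base = data + pos;
        iov[0].iov_len  = len;
        return 1;
    }
    iov[0].iov_base = data + pos;  iov[0].iov_len = first;
    iov[1].iov_base = data;        iov[1].iov_len = len - first;
    return 2;
}
```

The write-side helper is symmetric: it exposes the free regions so readv() can fill both sides of the wrap in one call.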
Comparison with Other Libraries
| Feature | xbuf ring.h | Linux kfifo | Boost circular_buffer | DPDK rte_ring |
|---|---|---|---|---|
| Capacity | Fixed, power-of-2 | Fixed, power-of-2 | Fixed, any size | Fixed, power-of-2 |
| Indexing | Bitmask | Bitmask | Modulo | Bitmask |
| Layout | FAM (single alloc) | Separate alloc | Heap array | Huge pages |
| Thread Safety | Not thread-safe | Single-producer/single-consumer | Not thread-safe | Multi-producer/multi-consumer |
| I/O Helpers | readv/writev | kfifo_to_user/kfifo_from_user | No | No (packet-oriented) |
| Language | C99 | C (kernel) | C++ | C |
Key Differentiator: xbuf's ring buffer combines the power-of-two bitmask optimization (like kfifo) with scatter-gather I/O helpers (readv/writev) in a single-allocation design. It's purpose-built for event-driven network programming where fixed memory budgets and efficient syscalls are essential.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/ring_bench.cpp
| Benchmark | Size | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_Ring_WriteRead | 64 | 6.05 | 6.05 | 19.7 GiB/s |
BM_Ring_WriteRead | 256 | 16.8 | 16.8 | 28.4 GiB/s |
BM_Ring_WriteRead | 1,024 | 27.4 | 27.4 | 69.6 GiB/s |
BM_Ring_WriteRead | 4,096 | 99.2 | 99.2 | 76.9 GiB/s |
BM_Ring_Throughput | 4,096 | 225 | 225 | 17.0 GiB/s |
BM_Ring_Throughput | 16,384 | 806 | 806 | 18.9 GiB/s |
BM_Ring_Throughput | 65,536 | 3,198 | 3,198 | 19.1 GiB/s |
Key Observations:
- WriteRead (single write + read cycle) achieves up to ~77 GiB/s at 4KB chunks, demonstrating the efficiency of the bitmask-based wrap-around and memcpy for larger transfers.
- Throughput (sustained writes until full) stabilizes at ~19 GiB/s regardless of capacity, showing consistent performance as the ring scales.
- The ring buffer's low-overhead indexing (bitmask instead of modulo) keeps per-operation cost extremely low — just 6 ns for a 64-byte write+read cycle.
io.h — Reference-Counted Block-Chain I/O Buffer
Introduction
io.h provides xIOBuffer, a non-contiguous byte buffer composed of a chain of reference-counted memory blocks. It supports zero-copy split, append, and scatter-gather I/O (readv/writev). Inspired by brpc's IOBuf, it is designed for high-throughput network I/O where avoiding memory copies is critical.
Design Philosophy
- Block-Chain Architecture — Data is stored across multiple fixed-size blocks (default 8KB each), linked through a reference array. This avoids large contiguous allocations and enables zero-copy operations.
- Reference Counting — Each xIOBlock is reference-counted. Multiple xIOBuffer instances can share the same block (e.g., after a Cut operation). Blocks are freed (returned to the pool) when the last reference is released.
- Zero-Copy Operations — xIOBufferAppendIOBuffer() transfers block references without copying data. xIOBufferCut() splits a buffer by adjusting offsets and sharing blocks at the boundary.
- Lock-Free Block Pool — Released blocks are returned to a global Treiber stack (lock-free) for reuse, avoiding malloc/free overhead in steady state.
- Inline Ref Array — Small buffers (≤ 8 refs) use an inline array, avoiding heap allocation for the ref array itself. Larger buffers transition to a heap-allocated array.
Architecture
graph TD
subgraph "xIOBuffer API"
APPEND["Append / AppendStr"]
APPEND_IO["AppendIOBuffer<br/>(zero-copy)"]
READ["Read / CopyTo"]
CUT["Cut<br/>(zero-copy split)"]
CONSUME["Consume"]
IO_READ["ReadFd"]
IO_WRITE["WriteFd<br/>(writev)"]
end
subgraph "Block Management"
ACQUIRE["xIOBlockAcquire"]
RETAIN["xIOBlockRetain"]
RELEASE["xIOBlockRelease"]
end
subgraph "Block Pool (Treiber Stack)"
POOL["g_pool_head"]
WARMUP["xIOBlockPoolWarmup"]
DRAIN["xIOBlockPoolDrain"]
end
APPEND --> ACQUIRE
IO_READ --> ACQUIRE
CUT --> RETAIN
CONSUME --> RELEASE
READ --> RELEASE
ACQUIRE --> POOL
RELEASE --> POOL
WARMUP --> POOL
DRAIN --> POOL
style POOL fill:#f5a623,color:#fff
Implementation Details
Block Structure
XDEF_STRUCT(xIOBlock) {
size_t refs; // Reference count (atomic)
size_t size; // Usable data size
char data[XIOBUFFER_BLOCK_SIZE]; // 8KB inline data
};
Reference Structure
XDEF_STRUCT(xIOBufferRef) {
xIOBlock *block; // Pointer to the underlying block
size_t offset; // Start offset within block->data
size_t length; // Number of valid bytes from offset
};
IOBuffer Structure
XDEF_STRUCT(xIOBuffer) {
xIOBufferRef inlined[XIOBUFFER_INLINE_REFS]; // Inline ref storage (8)
xIOBufferRef *refs; // Pointer to ref array (inlined or heap)
size_t nrefs; // Number of active refs
size_t cap; // Capacity of refs array
size_t nbytes; // Total logical byte count (cached)
};
Block-Chain Architecture
graph TD
subgraph "xIOBuffer"
REF1["Ref 0<br/>block=A, off=0, len=8192"]
REF2["Ref 1<br/>block=B, off=0, len=8192"]
REF3["Ref 2<br/>block=C, off=0, len=3000"]
end
subgraph "Shared Blocks"
A["xIOBlock A<br/>refs=1, 8KB"]
B["xIOBlock B<br/>refs=2, 8KB"]
C["xIOBlock C<br/>refs=1, 8KB"]
end
REF1 --> A
REF2 --> B
REF3 --> C
subgraph "Another xIOBuffer (after Cut)"
REF4["Ref 0<br/>block=B, off=4096, len=4096"]
end
REF4 --> B
style A fill:#4a90d9,color:#fff
style B fill:#f5a623,color:#fff
style C fill:#50b86c,color:#fff
Treiber Stack Block Pool
The global block pool uses a lock-free Treiber stack:
// Pool node overlays xIOBlock memory
XDEF_STRUCT(PoolNode_) {
PoolNode_ *next;
};
static PoolNode_ *volatile g_pool_head = NULL;
Push (return to pool):
do {
head = atomic_load(g_pool_head)
node->next = head
} while (!CAS(g_pool_head, head, node))
Pop (acquire from pool):
do {
head = atomic_load(g_pool_head)
if (!head) return malloc(new block)
next = head->next
} while (!CAS(g_pool_head, head, next))
return head
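The push/pop pseudocode above maps directly onto C11 atomics. This sketch shows only the CAS-loop shape; a production pool must also consider the ABA problem (e.g., via tagged pointers), which a single global stack of identical nodes is exposed to:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Intrusive node: in the real pool this overlays the block memory. */
typedef struct node { struct node *next; } node;

static _Atomic(node *) pool_head;   /* zero-initialized to NULL */

/* Push: link the node in front of the current head, retrying if
 * another thread moved the head between our load and the CAS. */
static void pool_push(node *n) {
    node *head = atomic_load(&pool_head);
    do {
        n->next = head;
    } while (!atomic_compare_exchange_weak(&pool_head, &head, n));
    /* on failure, `head` is reloaded with the current value */
}

/* Pop: detach the current head; NULL means the caller should fall
 * back to malloc, as the pseudocode above does. */
static node *pool_pop(void) {
    node *head = atomic_load(&pool_head);
    while (head != NULL &&
           !atomic_compare_exchange_weak(&pool_head, &head, head->next))
        ;   /* failed CAS reloads `head`; loop re-reads head->next */
    return head;
}
```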
Zero-Copy Cut
xIOBufferCut(io, dst, n) moves the first n bytes from io to dst:
- Fully consumed refs — Ownership transfers directly (no refcount change).
- Boundary ref — The block is shared: xIOBlockRetain() increments the refcount, and both buffers hold a ref with different offset/length.
flowchart TD
CUT["xIOBufferCut(io, dst, n)"]
LOOP{"More bytes to cut?"}
FULL{"ref.length <= remaining?"}
TRANSFER["Transfer entire ref to dst<br/>(no refcount change)"]
SPLIT["Share block: Retain + split ref<br/>dst gets [offset, chunk]<br/>io keeps [offset+chunk, rest]"]
SHIFT["Shift consumed refs out of io"]
DONE["Update nbytes for both"]
CUT --> LOOP
LOOP -->|Yes| FULL
FULL -->|Yes| TRANSFER --> LOOP
FULL -->|No| SPLIT --> SHIFT --> DONE
LOOP -->|No| SHIFT
style TRANSFER fill:#50b86c,color:#fff
style SPLIT fill:#f5a623,color:#fff
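The boundary-ref case can be sketched with a toy refcounted block. block, ref, and ref_cut below are illustrative stand-ins for xIOBlock, xIOBufferRef, and the Cut logic, not the real API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct block { size_t refs; char data[16]; } block;
typedef struct ref   { block *blk; size_t off, len; } ref;

static block *block_new(const char *src, size_t n) {
    block *b = calloc(1, sizeof *b);
    b->refs = 1;
    memcpy(b->data, src, n);
    return b;
}

static void block_release(block *b) {
    if (--b->refs == 0) free(b);  /* atomic in the real pool */
}

/* Split the first n bytes of *src into *dst: retain the block once
 * and give each ref a disjoint [offset, length) window. No bytes of
 * payload are copied or moved. */
static void ref_cut(ref *src, ref *dst, size_t n) {
    src->blk->refs++;             /* both refs now own the block */
    dst->blk = src->blk;
    dst->off = src->off;  dst->len = n;
    src->off += n;        src->len -= n;
}
```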
Append Strategy
xIOBufferAppend(io, data, len):
- First tries to fill the tail block's remaining space (avoids allocating a new block for small appends).
- Allocates new blocks for remaining data, each up to XIOBUFFER_BLOCK_SIZE bytes.
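The block-count arithmetic implied by this strategy can be expressed as a tiny helper (blocks_needed is a hypothetical illustration, not part of the API):

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 8192  /* mirrors XIOBUFFER_BLOCK_SIZE's default */

/* How many new blocks does an append of `len` bytes allocate, given
 * `tail_room` free bytes remaining in the current tail block? The
 * tail is filled first, then the rest is split into full blocks. */
static size_t blocks_needed(size_t len, size_t tail_room) {
    size_t rest = len > tail_room ? len - tail_room : 0;
    return (rest + BLOCK_SIZE - 1) / BLOCK_SIZE;  /* ceiling division */
}
```

Small appends that fit in the tail block therefore allocate nothing, which is why repeated short appends stay cheap.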
API Reference
Configuration
| Macro | Default | Description |
|---|---|---|
XIOBUFFER_BLOCK_SIZE | 8192 | Block data size in bytes |
XIOBUFFER_INLINE_REFS | 8 | Inline ref array capacity |
Block API
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBlockAcquire | xIOBlock *xIOBlockAcquire(void) | Get a block from pool (or malloc). refs=1. | Thread-safe (lock-free pool) |
xIOBlockRetain | void xIOBlockRetain(xIOBlock *blk) | Increment refcount. | Thread-safe (atomic) |
xIOBlockRelease | void xIOBlockRelease(xIOBlock *blk) | Decrement refcount; return to pool at 0. | Thread-safe (atomic + lock-free pool) |
xIOBlockPoolWarmup | xErrno xIOBlockPoolWarmup(size_t n) | Pre-allocate n blocks into pool. | Thread-safe |
xIOBlockPoolDrain | void xIOBlockPoolDrain(void) | Free all pooled blocks. Call at shutdown. | Not thread-safe (no concurrent use) |
IOBuffer Lifecycle
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBufferInit | void xIOBufferInit(xIOBuffer *io) | Initialize an empty IOBuffer. | Not thread-safe |
xIOBufferDeinit | void xIOBufferDeinit(xIOBuffer *io) | Release all refs and free ref array. | Not thread-safe |
xIOBufferReset | void xIOBufferReset(xIOBuffer *io) | Release all refs, keep ref array. | Not thread-safe |
IOBuffer Query
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBufferLen | size_t xIOBufferLen(const xIOBuffer *io) | Total readable bytes. | Not thread-safe |
xIOBufferEmpty | bool xIOBufferEmpty(const xIOBuffer *io) | True if no data. | Not thread-safe |
xIOBufferRefCount | size_t xIOBufferRefCount(const xIOBuffer *io) | Number of block refs. | Not thread-safe |
IOBuffer Write
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBufferAppend | xErrno xIOBufferAppend(xIOBuffer *io, const void *data, size_t len) | Append bytes (allocates blocks as needed). | Not thread-safe |
xIOBufferAppendStr | xErrno xIOBufferAppendStr(xIOBuffer *io, const char *str) | Append C string. | Not thread-safe |
xIOBufferAppendIOBuffer | xErrno xIOBufferAppendIOBuffer(xIOBuffer *io, xIOBuffer *other) | Zero-copy: move all refs from other. | Not thread-safe |
IOBuffer Read
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBufferRead | size_t xIOBufferRead(xIOBuffer *io, void *out, size_t len) | Copy and consume bytes. | Not thread-safe |
xIOBufferCut | size_t xIOBufferCut(xIOBuffer *io, xIOBuffer *dst, size_t n) | Zero-copy split: move first n bytes to dst. | Not thread-safe |
xIOBufferConsume | size_t xIOBufferConsume(xIOBuffer *io, size_t n) | Discard first n bytes. | Not thread-safe |
xIOBufferCopyTo | size_t xIOBufferCopyTo(const xIOBuffer *io, void *out) | Linearize: copy all data to contiguous buffer. | Not thread-safe |
IOBuffer I/O
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xIOBufferReadIov | int xIOBufferReadIov(const xIOBuffer *io, struct iovec *iov, int max_iov) | Fill iovecs for writev(). | Not thread-safe |
xIOBufferReadFd | ssize_t xIOBufferReadFd(xIOBuffer *io, int fd) | Read from fd into IOBuffer. | Not thread-safe |
xIOBufferWriteFd | ssize_t xIOBufferWriteFd(xIOBuffer *io, int fd) | Write to fd using writev(). | Not thread-safe |
Usage Examples
Basic Usage
#include <stdio.h>
#include <xbuf/io.h>
int main(void) {
xIOBuffer io;
xIOBufferInit(&io);
// Append data (may span multiple blocks)
xIOBufferAppend(&io, "Hello, ", 7);
xIOBufferAppend(&io, "IOBuffer!", 9);
printf("Length: %zu, Refs: %zu\n",
xIOBufferLen(&io), xIOBufferRefCount(&io));
// Linearize for processing
char buf[64];
xIOBufferCopyTo(&io, buf);
printf("Content: %.*s\n", (int)xIOBufferLen(&io), buf);
xIOBufferDeinit(&io);
return 0;
}
Zero-Copy Split (Protocol Parsing)
#include <xbuf/io.h>
void parse_protocol(xIOBuffer *io) {
// Cut the 4-byte header from the front
xIOBuffer header;
xIOBufferInit(&header);
size_t cut = xIOBufferCut(io, &header, 4);
if (cut == 4) {
char hdr[4];
xIOBufferRead(&header, hdr, 4);
// Parse header...
// io now contains only the body (zero-copy!)
}
xIOBufferDeinit(&header);
}
High-Throughput Network I/O
#include <xbuf/io.h>
void handle_data(int sockfd) {
// Pre-warm the block pool at startup
xIOBlockPoolWarmup(64);
xIOBuffer io;
xIOBufferInit(&io);
// Read from socket (allocates blocks from pool)
ssize_t n = xIOBufferReadFd(&io, sockfd);
if (n > 0) {
// Write back using scatter-gather I/O
xIOBufferWriteFd(&io, sockfd);
}
xIOBufferDeinit(&io);
// At shutdown
xIOBlockPoolDrain();
}
Use Cases
- HTTP Response Body — The xhttp module uses xIOBuffer to accumulate response chunks from libcurl without copying between buffers.
- Protocol Framing — Use xIOBufferCut() to split headers from body in a zero-copy fashion, then process each part independently.
- Data Pipeline — Chain multiple processing stages that each append to or cut from xIOBuffer instances, sharing blocks to minimize copies.
Best Practices
- Call xIOBlockPoolWarmup() at startup to pre-allocate blocks and avoid allocation spikes during initial traffic.
- Call xIOBlockPoolDrain() at shutdown for clean valgrind reports.
- Use xIOBufferAppendIOBuffer() instead of copying when combining buffers. It transfers ownership without data copies.
- Use xIOBufferCut() for protocol parsing. It's more efficient than xIOBufferRead() when you need to pass the cut data to another component.
- Monitor xIOBufferRefCount() to understand memory fragmentation. Many small refs may indicate suboptimal block utilization.
Comparison with Other Libraries
| Feature | xbuf io.h | brpc IOBuf | Netty ByteBuf | Go bytes.Buffer |
|---|---|---|---|---|
| Architecture | Block-chain (ref array) | Block-chain (linked list) | Composite buffer | Contiguous slice |
| Block Size | 8KB (configurable) | 8KB | Configurable | N/A |
| Reference Counting | Atomic (per block) | Atomic (per block) | Atomic (per buffer) | GC |
| Zero-Copy Split | xIOBufferCut | cutn | slice | No |
| Zero-Copy Append | xIOBufferAppendIOBuffer | append(IOBuf) | addComponent | No |
| Block Pool | Treiber stack (lock-free) | Thread-local + global | Arena allocator | N/A |
| Scatter-Gather I/O | writev via ReadIov | writev via pappend | nioBuffers | No |
| Inline Optimization | 8 inline refs | No | No | N/A |
| Language | C99 | C++ | Java | Go |
Key Differentiator: xbuf's xIOBuffer combines brpc-style block-chain architecture with a lock-free Treiber stack block pool and inline ref optimization. The zero-copy Cut and AppendIOBuffer operations make it ideal for protocol parsing and data pipeline scenarios in C.
Benchmark
Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/io_bench.cpp
| Benchmark | Size | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
BM_IOBuffer_Append | 64 | 3,720 | 3,720 | 16.0 GiB/s |
BM_IOBuffer_Append | 256 | 7,569 | 7,568 | 31.5 GiB/s |
BM_IOBuffer_Append | 1,024 | 22,341 | 22,340 | 42.7 GiB/s |
BM_IOBuffer_Append | 4,096 | 79,796 | 79,794 | 47.8 GiB/s |
BM_IOBuffer_Append | 8,192 | 187,167 | 187,165 | 40.8 GiB/s |
BM_IOBuffer_AppendConsume | 64 | 5,230 | 5,230 | 11.4 GiB/s |
BM_IOBuffer_AppendConsume | 256 | 8,232 | 8,232 | 29.0 GiB/s |
BM_IOBuffer_AppendConsume | 1,024 | 23,040 | 23,040 | 41.4 GiB/s |
BM_IOBuffer_Cut | 8,192 | 167 | 167 | 45.6 GiB/s |
BM_IOBuffer_Cut | 65,536 | 1,651 | 1,651 | 37.0 GiB/s |
BM_IOBuffer_Cut | 262,144 | 8,122 | 8,122 | 30.1 GiB/s |
BM_IOBuffer_AppendIOBuffer | 1,024 | 3,196 | 3,196 | 29.8 GiB/s |
BM_IOBuffer_AppendIOBuffer | 4,096 | 9,307 | 9,307 | 41.0 GiB/s |
BM_IOBuffer_AppendIOBuffer | 8,192 | 17,604 | 17,602 | 43.3 GiB/s |
BM_IOBuffer_BlockPool | — | 8.91 | 8.89 | — |
Key Observations:
- Append peaks at ~48 GiB/s for 4KB chunks. The slight drop at 8KB reflects block boundary crossing overhead.
- Cut (zero-copy split) is extremely fast — 167ns for 8KB — because it only manipulates reference metadata, not data. This validates the block-chain architecture for protocol parsing.
- AppendIOBuffer (zero-copy concatenation) achieves ~43 GiB/s, confirming that block ownership transfer avoids data copies.
- BlockPool acquire/release cycle takes ~9ns, showing the lock-free Treiber stack's efficiency for block recycling.
xnet — Networking Primitives
Introduction
xnet is xKit's networking utility module, providing three foundational components for network programming: a lightweight URL parser, an asynchronous DNS resolver, and shared TLS configuration types. These building blocks are used internally by higher-level modules like xhttp, and are also available for direct use in application code.
Design Philosophy
- Zero-Copy URL Parsing — xUrlParse() makes a single internal copy of the input string. All component fields (scheme, host, port, etc.) are pointer+length pairs referencing this copy, avoiding per-field allocations.
- Async DNS via Thread-Pool Offload — DNS resolution uses getaddrinfo() offloaded to the event loop's thread pool. The callback is always invoked on the event loop thread, keeping the async programming model consistent with the rest of xKit.
- Shared TLS Types — xTlsConf is a plain data structure shared across modules. It decouples TLS configuration from any specific TLS backend (OpenSSL, mbedTLS).
- Async TCP with Transport Abstraction — xTcpConnect chains DNS → connect → optional TLS handshake into a single async operation. xTcpConn wraps an xSocket + xTransport vtable, providing Recv/Send/SendIov helpers that work transparently over plain TCP or TLS.
Architecture
graph TD
subgraph "xnet Module"
URL["xUrl<br/>URL Parser<br/>url.h"]
DNS["xDnsResolve<br/>Async DNS<br/>dns.h"]
TLS["xTlsConf<br/>TLS Config Types<br/>tls.h"]
TCP["xTcpConn / xTcpConnect / xTcpListener<br/>Async TCP<br/>tcp.h"]
end
subgraph "xbase Infrastructure"
EV["xEventLoop<br/>event.h"]
POOL["Thread Pool<br/>xEventLoopSubmit()"]
ATOMIC["Atomic Ops<br/>atomic.h"]
end
subgraph "Consumers"
HTTP_C["xhttp Client"]
HTTP_S["xhttp Server"]
WS["WebSocket"]
end
DNS --> EV
DNS --> POOL
DNS --> ATOMIC
TCP --> EV
TCP --> DNS
TCP --> TLS
HTTP_C --> URL
HTTP_C --> TCP
HTTP_S --> TCP
WS --> URL
WS --> TCP
style URL fill:#4a90d9,color:#fff
style DNS fill:#50b86c,color:#fff
style TLS fill:#f5a623,color:#fff
style TCP fill:#e74c3c,color:#fff
Sub-Module Overview
| Header | Component | Description | Doc |
|---|---|---|---|
url.h | xUrl | Lightweight URL parser | url.md |
dns.h | xDnsResolve | Async DNS resolution | dns.md |
tls.h | xTlsConf | Shared TLS config types | tls.md |
tcp.h | xTcpConn / xTcpConnect / xTcpListener | Async TCP connection, connector & listener | tcp.md |
Quick Start
#include <stdio.h>
#include <xbase/event.h>
#include <xnet/url.h>
#include <xnet/dns.h>
#include <xnet/tls.h>
// 1. Parse a URL
static void url_example(void) {
xUrl url;
xErrno err = xUrlParse(
"wss://example.com:8443/ws?token=abc", &url);
if (err == xErrno_Ok) {
printf("scheme: %.*s\n",
(int)url.scheme_len, url.scheme);
printf("host: %.*s\n",
(int)url.host_len, url.host);
printf("port: %u\n", xUrlPort(&url));
printf("path: %.*s\n",
(int)url.path_len, url.path);
xUrlFree(&url);
}
}
// 2. Async DNS resolution
static void on_resolved(xDnsResult *result, void *arg) {
(void)arg;
if (result->error == xErrno_Ok) {
int count = 0;
for (xDnsAddr *a = result->addrs; a; a = a->next)
count++;
printf("Resolved %d address(es)\n", count);
}
xDnsResultFree(result);
// stop the loop after resolution
}
static void dns_example(xEventLoop loop) {
xDnsResolve(loop, "example.com", "443",
NULL, on_resolved, NULL);
}
// 3. TLS configuration
static void tls_example(void) {
xTlsConf client_tls = {0};
client_tls.ca = "ca.pem";
xTlsConf server_tls = {
.cert = "server.pem",
.key = "server-key.pem",
};
(void)client_tls;
(void)server_tls;
}
Relationship with Other Modules
- **xbase** — The DNS resolver depends on `xEventLoop` for thread-pool offload and uses `atomic.h` for the cancellation flag.
- **xhttp** — The HTTP client uses `xUrl` for URL parsing, `xDnsResolve` for hostname resolution, and `xTlsConf` for TLS configuration. The WebSocket client supports both `xTlsConf` and a shared `xTlsCtx` for `wss://` connections. See the TLS Deployment Guide for end-to-end examples.
- **WebSocket** — The WebSocket client uses `xUrl` to parse `ws://` and `wss://` URLs, and optionally accepts a shared `xTlsCtx` to avoid per-connection TLS context creation.
url.h — Lightweight URL Parser
Introduction
url.h provides xUrl, a lightweight URL parser that decomposes a URL string into its RFC 3986 components: scheme, userinfo, host, port, path, query, and fragment. The parser makes a single internal copy of the input; all component fields are pointer+length pairs referencing this copy, so the caller may discard the original string immediately after parsing.
Design Philosophy
- **Single Copy, Zero Per-Field Allocation** — `xUrlParse()` calls `strdup()` once. All output fields point into this copy, avoiding per-component heap allocations.
- **Pointer+Length Pairs** — Fields use `const char *` + `size_t` pairs rather than NUL-terminated strings. This avoids mutating the internal copy and supports efficient substring access.
- **Scheme-Aware Default Ports** — `xUrlPort()` returns well-known default ports (80 for http/ws, 443 for https/wss) when no explicit port is present, simplifying connection logic.
- **IPv6 Literal Support** — The parser correctly handles bracketed IPv6 addresses (`[::1]:8080`), extracting the bare address without brackets.
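The scheme-to-default-port rule can be sketched as a standalone helper. This is an illustrative re-implementation of the documented behavior, not the library source; returning 0 for unknown schemes is an assumption made here.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative equivalent of xUrlPort()'s documented defaults:
 * 80 for http/ws, 443 for https/wss. Takes a pointer+length
 * scheme as produced by the parser. */
static uint16_t default_port(const char *scheme, size_t len) {
    if ((len == 4 && memcmp(scheme, "http", 4) == 0) ||
        (len == 2 && memcmp(scheme, "ws", 2) == 0))
        return 80;
    if ((len == 5 && memcmp(scheme, "https", 5) == 0) ||
        (len == 3 && memcmp(scheme, "wss", 3) == 0))
        return 443;
    return 0; /* unknown scheme — caller must supply a port (assumption) */
}
```

Note that length-aware comparison matters: `"http"` (4 bytes) must not match the first four bytes of `"https"`.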
Architecture
flowchart LR
INPUT["Raw URL string"]
PARSE["xUrlParse()"]
COPY["strdup() internal copy"]
FIELDS["Pointer+Length fields"]
PORT["xUrlPort()"]
FREE["xUrlFree()"]
INPUT --> PARSE
PARSE --> COPY
COPY --> FIELDS
FIELDS --> PORT
FIELDS --> FREE
style PARSE fill:#4a90d9,color:#fff
style FREE fill:#e74c3c,color:#fff
Implementation Details
URL Format
scheme://[userinfo@]host[:port][/path][?query][#fragment]
Parsing Steps
flowchart TD
START["Input: raw URL string"]
SCHEME["Find '://' → extract scheme"]
AUTH["Parse authority section"]
USERINFO{"Contains '@'?"}
UI_YES["Extract userinfo"]
HOST{"Starts with '['?"}
IPV6["Parse IPv6 bracket literal"]
IPV4["Scan backwards for ':'"]
PORT["Extract port (if present)"]
PATH{"Starts with '/'?"}
PATH_YES["Extract path"]
QUERY{"Starts with '?'?"}
QUERY_YES["Extract query"]
FRAG{"Starts with '#'?"}
FRAG_YES["Extract fragment"]
DONE["Return xErrno_Ok"]
START --> SCHEME --> AUTH
AUTH --> USERINFO
USERINFO -->|Yes| UI_YES --> HOST
USERINFO -->|No| HOST
HOST -->|Yes| IPV6 --> PORT
HOST -->|No| IPV4 --> PORT
PORT --> PATH
PATH -->|Yes| PATH_YES --> QUERY
PATH -->|No| QUERY
QUERY -->|Yes| QUERY_YES --> FRAG
QUERY -->|No| FRAG
FRAG -->|Yes| FRAG_YES --> DONE
FRAG -->|No| DONE
style DONE fill:#50b86c,color:#fff
Memory Layout
xUrl struct (stack or heap):
┌──────────┬──────────────────────────────────┐
│ raw_ │→ strdup("https://host:443/path") │
│ scheme │→ ───────┘ │
│ host │→ ──────────────┘ │
│ port │→ ───────────────────┘ │
│ path │→ ────────────────────────┘ │
│ ... │ │
└──────────┴──────────────────────────────────┘
All pointers reference the single raw_ copy.
Operations and Complexity
| Operation | Complexity | Notes |
|---|---|---|
xUrlParse | O(n) | Single pass over the URL string |
xUrlPort | O(1) | Converts port string or returns default |
xUrlFree | O(1) | Frees the internal copy, zeroes struct |
API Reference
Lifecycle
| Function | Signature | Description |
|---|---|---|
xUrlParse | xErrno xUrlParse(const char *raw, xUrl *url) | Parse a URL into components |
xUrlFree | void xUrlFree(xUrl *url) | Free internal copy, zero all fields |
Query
| Function | Signature | Description |
|---|---|---|
xUrlPort | uint16_t xUrlPort(const xUrl *url) | Numeric port (explicit or default by scheme) |
xUrl Fields
| Field | Type | Description |
|---|---|---|
scheme / scheme_len | const char * / size_t | e.g. "https" |
userinfo / userinfo_len | const char * / size_t | e.g. "user:pass" (optional) |
host / host_len | const char * / size_t | e.g. "example.com" or "::1" |
port / port_len | const char * / size_t | e.g. "8443" (optional) |
path / path_len | const char * / size_t | e.g. "/ws/chat" (optional) |
query / query_len | const char * / size_t | e.g. "key=val" (optional) |
fragment / fragment_len | const char * / size_t | e.g. "section1" (optional) |
Note: Optional fields have `ptr = NULL, len = 0` when absent. The `raw_` field is internal — do not access it.
Usage Examples
Basic URL Parsing
#include <stdio.h>
#include <xnet/url.h>
int main(void) {
xUrl url;
xErrno err = xUrlParse("https://user:[email protected]:8443/ws/chat?token=abc#top", &url);
if (err != xErrno_Ok) {
fprintf(stderr, "parse failed\n");
return 1;
}
printf("scheme: %.*s\n", (int)url.scheme_len, url.scheme);
printf("userinfo: %.*s\n", (int)url.userinfo_len, url.userinfo);
printf("host: %.*s\n", (int)url.host_len, url.host);
printf("port: %.*s (numeric: %u)\n", (int)url.port_len, url.port, xUrlPort(&url));
printf("path: %.*s\n", (int)url.path_len, url.path);
printf("query: %.*s\n", (int)url.query_len, url.query);
printf("fragment: %.*s\n", (int)url.fragment_len, url.fragment);
xUrlFree(&url);
return 0;
}
Output:
scheme: https
userinfo: user:pass
host: example.com
port: 8443 (numeric: 8443)
path: /ws/chat
query: token=abc
fragment: top
IPv6 Address
xUrl url;
xUrlParse("http://[::1]:8080/test", &url);
printf("host: %.*s\n", (int)url.host_len, url.host);
// Output: host: ::1 (brackets stripped)
printf("port: %u\n", xUrlPort(&url));
// Output: port: 8080
xUrlFree(&url);
Default Port by Scheme
xUrl url;
xUrlParse("wss://echo.example.com/sock", &url);
// No explicit port in URL
printf("port field: %s\n", url.port ? "present" : "absent");
// Output: port field: absent
// xUrlPort() returns 443 for wss://
printf("effective port: %u\n", xUrlPort(&url));
// Output: effective port: 443
xUrlFree(&url);
Ownership Semantics
// xUrl owns its data — the original string can be freed
char *heap = strdup("ws://example.com:9090/ws");
xUrl url;
xUrlParse(heap, &url);
free(heap); // safe: xUrl has its own copy
// url fields are still valid here
printf("host: %.*s\n", (int)url.host_len, url.host);
xUrlFree(&url);
// After free, all fields are zeroed (NULL)
Error Handling
| Input | Result |
|---|---|
NULL raw or url pointer | xErrno_InvalidArg |
Missing :// separator | xErrno_InvalidArg |
Empty host (e.g. http:///path) | xErrno_InvalidArg |
| Unclosed IPv6 bracket | xErrno_InvalidArg |
malloc failure | xErrno_NoMemory |
On error, the xUrl struct is zeroed — no cleanup needed.
Best Practices
- Always check the return value of `xUrlParse()`. On error the struct is zeroed, so accessing fields is safe but yields empty values.
- Use `xUrlPort()` instead of parsing the port string yourself. It handles default ports and validates the numeric range (0–65535).
- Call `xUrlFree()` when done. Forgetting to free leaks the internal string copy.
- Don't cache field pointers past `xUrlFree()`. All pointers become invalid after the free call.
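Because fields are pointer+length pairs, APIs that require NUL-terminated strings (e.g. `getaddrinfo()`) need a copy. A minimal helper for that might look like this — `copy_component` is not part of xKit, just an illustration:

```c
#include <stddef.h>
#include <string.h>

/* Copy a pointer+length URL component into a NUL-terminated
 * buffer. Returns 0 on success, -1 if the component is absent
 * (ptr == NULL) or the buffer is too small. */
static int copy_component(const char *ptr, size_t len,
                          char *dst, size_t cap) {
    if (!ptr || len + 1 > cap)
        return -1;
    memcpy(dst, ptr, len);
    dst[len] = '\0';
    return 0;
}
```

Typical use would be `copy_component(url.host, url.host_len, hostbuf, sizeof(hostbuf))` before handing `hostbuf` to a resolver — done before `xUrlFree()`, since the field pointers die with the struct.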
dns.h — Asynchronous DNS Resolution
Introduction
dns.h provides asynchronous DNS resolution by offloading getaddrinfo() to the event loop's thread pool. The completion callback is always invoked on the event loop thread, maintaining xKit's single-threaded callback model. Queries can be cancelled before the callback fires.
Design Philosophy
- **Thread-Pool Offload** — `getaddrinfo()` is a blocking POSIX call. Rather than introducing a dedicated DNS thread, xnet reuses the event loop's existing thread pool via `xEventLoopSubmit()`.
- **Event-Loop-Thread Callbacks** — The done callback runs on the event loop thread, so user code never needs synchronization. This is consistent with every other callback in xKit.
- **Linked-List Result** — Resolved addresses are returned as a linked list of `xDnsAddr` nodes, preserving the full `getaddrinfo()` result (family, socktype, protocol) for each address.
- **Cancellation Support** — `xDnsCancel()` sets an atomic flag. If the worker has already finished, the done callback silently discards the result instead of invoking the user callback.
- **IP Literal Fast Path** — If the hostname is an IPv4 or IPv6 literal, `AI_NUMERICHOST` is set automatically, skipping the actual DNS lookup.
Architecture
sequenceDiagram
participant App as Application
participant EL as Event Loop Thread
participant TP as Thread Pool Worker
App->>EL: xDnsResolve(loop, "example.com", ...)
EL->>TP: xEventLoopSubmit(dns_work_fn)
Note over TP: getaddrinfo() (blocking)
TP-->>EL: dns_done_fn(result)
alt Not cancelled
EL->>App: callback(result, arg)
else Cancelled
EL->>EL: xDnsResultFree(result)
end
Implementation Details
Internal Request Lifecycle
stateDiagram-v2
[*] --> Created: xDnsResolve()
Created --> Queued: xEventLoopSubmit()
Queued --> Working: Thread pool picks up
Working --> Done: getaddrinfo() returns
Done --> Delivered: callback invoked
Done --> Discarded: cancelled flag set
Queued --> Cancelled: xDnsCancel()
Working --> Cancelled: xDnsCancel()
Cancelled --> Discarded: done_fn checks flag
Delivered --> [*]: request freed
Discarded --> [*]: request freed
Error Mapping
getaddrinfo() returns EAI_* codes. These are mapped to xKit error codes:
| EAI Code | xErrno | Meaning |
|---|---|---|
0 (success) | xErrno_Ok | Resolution succeeded |
EAI_NONAME | xErrno_DnsNotFound | Host not found |
EAI_AGAIN | xErrno_DnsTempFail | Temporary failure |
EAI_MEMORY | xErrno_NoMemory | Out of memory |
| Other | xErrno_DnsError | Generic DNS error |
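The table translates to a straightforward switch. The `map_eai()` helper below is illustrative, and the local enum is a stand-in — the real `xErrno` constants live in xbase's error.h:

```c
#include <netdb.h>

/* Illustrative stand-in for xbase's error codes (values assumed;
 * see <xbase/error.h> for the real definitions). */
typedef enum {
    xErrno_Ok,
    xErrno_DnsNotFound,
    xErrno_DnsTempFail,
    xErrno_NoMemory,
    xErrno_DnsError
} xErrno;

/* Map a getaddrinfo() status to an xKit error code. */
static xErrno map_eai(int eai) {
    switch (eai) {
    case 0:          return xErrno_Ok;
    case EAI_NONAME: return xErrno_DnsNotFound;
    case EAI_AGAIN:  return xErrno_DnsTempFail;
    case EAI_MEMORY: return xErrno_NoMemory;
    default:         return xErrno_DnsError;
    }
}
```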
IP Literal Detection
Before calling getaddrinfo(), the worker checks if the hostname is an IP literal using inet_pton(). If it is, AI_NUMERICHOST is added to the hints, which tells getaddrinfo() to skip DNS lookup entirely.
// Pseudocode
if (inet_pton(AF_INET, hostname, buf) == 1 ||
inet_pton(AF_INET6, hostname, buf) == 1) {
hints.ai_flags |= AI_NUMERICHOST;
}
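The same check, made self-contained as a small helper (illustrative — the resolver's internal function may differ):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return 1 if `host` is an IPv4 or IPv6 literal, else 0.
 * inet_pton() returns 1 only on a fully valid address. */
static int is_ip_literal(const char *host) {
    unsigned char buf[sizeof(struct in6_addr)];
    return inet_pton(AF_INET, host, buf) == 1 ||
           inet_pton(AF_INET6, host, buf) == 1;
}
```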
API Reference
Core Functions
| Function | Signature | Description |
|---|---|---|
xDnsResolve | xDnsQuery xDnsResolve(xEventLoop loop, const char *hostname, const char *service, const struct addrinfo *hints, xDnsCallback callback, void *arg) | Start async DNS resolution |
xDnsCancel | void xDnsCancel(xEventLoop loop, xDnsQuery query) | Cancel a pending query |
xDnsResultFree | void xDnsResultFree(xDnsResult *result) | Free a resolution result |
Types
| Type | Description |
|---|---|
xDnsQuery | Opaque handle to a pending query |
xDnsResult | Resolution result: error + addrs linked list |
xDnsAddr | Single resolved address node |
xDnsCallback | void (*)(xDnsResult *result, void *arg) |
xDnsResult Fields
| Field | Type | Description |
|---|---|---|
error | xErrno | xErrno_Ok on success |
addrs | xDnsAddr * | Linked list of addresses, or NULL |
xDnsAddr Fields
| Field | Type | Description |
|---|---|---|
addr | struct sockaddr_storage | Resolved socket address |
addrlen | socklen_t | Length of the address |
family | int | AF_INET or AF_INET6 |
socktype | int | SOCK_STREAM or SOCK_DGRAM |
protocol | int | IPPROTO_TCP or IPPROTO_UDP |
next | xDnsAddr * | Next address, or NULL |
Parameter Details for xDnsResolve
| Parameter | Required | Description |
|---|---|---|
loop | Yes | Event loop (must not be NULL) |
hostname | Yes | Hostname or IP literal (non-empty) |
service | No | Port string (e.g. "443") or NULL |
hints | No | addrinfo hints; NULL defaults to AF_UNSPEC + SOCK_STREAM |
callback | Yes | Completion callback (must not be NULL) |
arg | No | User argument forwarded to callback |
Returns an `xDnsQuery` handle, or `NULL` on invalid arguments.
Usage Examples
Basic Resolution
#include <stdio.h>
#include <arpa/inet.h>
#include <xbase/event.h>
#include <xnet/dns.h>
static void on_resolved(xDnsResult *result, void *arg) {
xEventLoop loop = (xEventLoop)arg;
if (result->error != xErrno_Ok) {
fprintf(stderr, "DNS failed: %d\n", result->error);
xDnsResultFree(result);
xEventLoopStop(loop);
return;
}
for (xDnsAddr *a = result->addrs; a; a = a->next) {
char buf[INET6_ADDRSTRLEN];
if (a->family == AF_INET) {
struct sockaddr_in *sin = (struct sockaddr_in *)&a->addr;
inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf));
} else {
struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&a->addr;
inet_ntop(AF_INET6, &sin6->sin6_addr, buf, sizeof(buf));
}
printf(" %s (family=%d)\n", buf, a->family);
}
xDnsResultFree(result);
xEventLoopStop(loop);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xDnsResolve(loop, "example.com", "443", NULL, on_resolved, loop);
xEventLoopRun(loop);
xEventLoopDestroy(loop);
return 0;
}
IPv4-Only Resolution
struct addrinfo hints = {0};
hints.ai_family = AF_INET;
hints.ai_socktype = SOCK_STREAM;
xDnsResolve(loop, "example.com", "80", &hints, on_resolved, loop);
Cancelling a Query
xDnsQuery q = xDnsResolve(loop, "slow.example.com", NULL, NULL, on_resolved, NULL);
// Cancel immediately — callback will NOT fire
xDnsCancel(loop, q);
IP Literal (No DNS Lookup)
// Resolves instantly via AI_NUMERICHOST
xDnsResolve(loop, "127.0.0.1", "8080", NULL, on_resolved, loop);
xDnsResolve(loop, "::1", "8080", NULL, on_resolved, loop);
Thread Safety
| Operation | Thread Safety |
|---|---|
xDnsResolve() | Call from event loop thread only |
xDnsCancel() | Call from event loop thread only |
xDnsResultFree() | Call from any thread (result is owned) |
xDnsCallback | Always invoked on event loop thread |
Error Handling
| Scenario | Behavior |
|---|---|
NULL loop, hostname, or callback | Returns NULL (no query created) |
| Empty hostname | Returns NULL |
malloc failure | Returns NULL |
getaddrinfo() failure | Callback receives result->error != xErrno_Ok |
| Cancelled query | Callback is not invoked; result is freed internally |
Best Practices
- Always call `xDnsResultFree()` in your callback. The callback owns the result.
- Check `result->error` before iterating `addrs`. On failure, `addrs` is `NULL`.
- Use `xDnsCancel()` for cleanup. If you destroy the object that owns the callback context, cancel the query first to prevent a use-after-free.
- Pass `NULL` hints for typical use. The defaults (`AF_UNSPEC` + `SOCK_STREAM`) cover most HTTP/WebSocket connection scenarios.
- `xDnsCancel(loop, NULL)` is safe — it's a no-op, so you don't need to guard against NULL handles.
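The cancel-before-destroy rule typically takes the shape below. This is a sketch against the API documented above; the `session` struct and its functions are invented for illustration.

```c
#include <stdlib.h>
#include <xbase/event.h>
#include <xnet/dns.h>

/* Sketch: an object that owns a pending DNS query cancels it
 * before freeing itself, so the callback can never touch a
 * freed session. */
typedef struct {
    xEventLoop loop;
    xDnsQuery  query;   /* NULL once delivered or never started */
} session;

static void session_on_dns(xDnsResult *result, void *arg) {
    session *s = arg;
    s->query = NULL;            /* handle is consumed */
    /* ... use result->addrs ... */
    xDnsResultFree(result);     /* callback owns the result */
}

static void session_destroy(session *s) {
    /* Safe even if s->query is NULL (documented no-op). After
     * this, session_on_dns is guaranteed not to fire. */
    xDnsCancel(s->loop, s->query);
    free(s);
}
```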
tcp.h — Async TCP Connection, Connector & Listener
Introduction
tcp.h provides three async TCP building blocks on top of xKit's event loop:
- **xTcpConn** — a thin resource wrapper that pairs an `xSocket` with an `xTransport`, plus convenience `Recv`/`Send`/`SendIov` helpers.
- **xTcpConnect** — an async connector that performs DNS → socket → non-blocking connect → optional TLS handshake, delivering a ready-to-use `xTcpConn` via callback.
- **xTcpListener** — an async listener that accepts connections (with optional TLS) and delivers each as an `xTcpConn`.
All callbacks run on the event loop thread, consistent with the rest of xKit.
Design Philosophy
- **Resource Wrapper, Not Callback Framework** — Unlike `xWsCallbacks`, we intentionally do not provide `on_data`/`on_close` callbacks at the TCP layer. WebSocket callbacks work well because the protocol defines message boundaries, close handshakes, and ping/pong — the library does real work before invoking user code. Raw TCP is a byte stream with no framing; an `on_data` callback would still deliver arbitrary fragments, leaving the user to reassemble and parse — no better than calling `xTcpConnRecv` directly. Instead, users register their own `xSocketFunc` callback via `xSocketSetCallback()` and drive I/O with `xTcpConnRecv`/`xTcpConnSend`.
- **Transport Transparency** — `xTcpConn` wraps an `xTransport` vtable. For plain TCP, `read`/`writev` map to `read(2)`/`writev(2)`. For TLS, they map to `SSL_read`/`SSL_write`. The `Recv`/`Send`/`SendIov` helpers hide this detail so users never need to reach into `xTransport` internals.
- **Full Async Connector Pipeline** — `xTcpConnect` chains DNS resolution → socket creation → non-blocking `connect()` → optional TLS handshake into a single async operation with a timeout. Each phase is driven by event loop callbacks.
- **Ownership Transfer** — `xTcpConnTakeSocket` and `xTcpConnTakeTransport` allow higher-level protocols (e.g. WebSocket upgrade) to extract the underlying resources without closing them.
Architecture
Connector State Machine
stateDiagram-v2
[*] --> DNS: xTcpConnect()
DNS --> TcpConnect: resolved
DNS --> Failed: DNS error
TcpConnect --> TlsHandshake: connected + TLS configured
TcpConnect --> Succeed: connected (plain TCP)
TcpConnect --> Failed: connect error
TlsHandshake --> Succeed: handshake done
TlsHandshake --> Failed: handshake error
Succeed --> [*]: callback(conn, Ok)
Failed --> [*]: callback(NULL, err)
note right of DNS: Async via xDnsResolve
note right of TcpConnect: Non-blocking connect()
note right of TlsHandshake: Async SSL_do_handshake
Listener Accept Flow
sequenceDiagram
participant EL as Event Loop
participant L as xTcpListener
participant PC as PendingConn (TLS only)
participant App as User Callback
EL->>L: xEvent_Read (new connection)
L->>L: accept()
alt Plain TCP
L->>App: callback(listener, conn, addr)
else TLS
L->>PC: create PendingConn
loop Handshake rounds
EL->>PC: xEvent_Read / xEvent_Write
PC->>PC: SSL_do_handshake()
end
PC->>App: callback(listener, conn, addr)
end
xTcpConn Resource Ownership
graph LR
CONN["xTcpConn"]
SOCK["xSocket<br/>(event loop registration)"]
TP["xTransport<br/>(plain / TLS vtable)"]
FD["fd"]
CONN --> SOCK
CONN --> TP
SOCK --> FD
style CONN fill:#4a90d9,color:#fff
style SOCK fill:#50b86c,color:#fff
style TP fill:#f5a623,color:#fff
xTcpConnClose() destroys in order: transport → socket → conn shell. Use xTcpConnTakeSocket() / xTcpConnTakeTransport() to extract resources before closing.
API Reference
xTcpConn — Connection
| Function | Signature | Description |
|---|---|---|
xTcpConnRecv | ssize_t xTcpConnRecv(xTcpConn conn, void *buf, size_t len) | Read up to len bytes; returns bytes read, 0 on EOF, -1 on error |
xTcpConnSend | ssize_t xTcpConnSend(xTcpConn conn, const char *buf, size_t len) | Write len bytes; returns bytes written, -1 on error |
xTcpConnSendIov | ssize_t xTcpConnSendIov(xTcpConn conn, const struct iovec *iov, int iovcnt) | Scatter-gather write; returns total bytes written, -1 on error |
xTcpConnTransport | xTransport *xTcpConnTransport(xTcpConn conn) | Get the internal transport vtable |
xTcpConnSocket | xSocket xTcpConnSocket(xTcpConn conn) | Get the underlying socket handle |
xTcpConnTakeSocket | xSocket xTcpConnTakeSocket(xTcpConn conn) | Extract socket ownership (conn no longer owns it) |
xTcpConnTakeTransport | xTransport xTcpConnTakeTransport(xTcpConn conn) | Extract transport ownership (conn no longer owns it) |
xTcpConnReader | xReader xTcpConnReader(xTcpConn conn) | Get an xReader adapter bound to the connection's transport (see io.h) |
xTcpConnWriter | xWriter xTcpConnWriter(xTcpConn conn) | Get an xWriter adapter bound to the connection's transport (see io.h) |
xTcpConnClose | void xTcpConnClose(xEventLoop loop, xTcpConn conn) | Close connection and free all resources |
xTcpConnect — Async Connector
| Function | Signature | Description |
|---|---|---|
xTcpConnect | xErrno xTcpConnect(xEventLoop loop, const char *host, uint16_t port, const xTcpConnectConf *conf, xTcpConnectFunc callback, void *arg) | Initiate async TCP connection |
xTcpConnectConf Fields
| Field | Type | Default | Description |
|---|---|---|---|
tls_ctx | xTlsCtx | NULL | Pre-created shared TLS context (preferred); NULL for plain TCP or auto-create from tls |
tls | const xTlsConf * | NULL | TLS config for auto-created ctx; ignored when tls_ctx is set; NULL for plain TCP |
timeout_ms | int | 10000 | Connect timeout in milliseconds |
nodelay | int | 0 | Set TCP_NODELAY if non-zero |
keepalive | int | 0 | Set SO_KEEPALIVE if non-zero |
TLS context resolution order: tls_ctx (shared, not owned) → auto-create from tls → defaults (system CA, verify enabled). When tls_ctx is provided, the connector does not create or destroy the context — the caller retains ownership.
xTcpConnectFunc
typedef void (*xTcpConnectFunc)(xTcpConn conn, xErrno err, void *arg);
On success: conn is valid, err is xErrno_Ok. On failure: conn is NULL, err indicates the error.
xTcpListener — Async Listener
| Function | Signature | Description |
|---|---|---|
xTcpListenerCreate | xTcpListener xTcpListenerCreate(xEventLoop loop, const char *host, uint16_t port, const xTcpListenerConf *conf, xTcpListenerFunc callback, void *arg) | Create and start a TCP listener |
xTcpListenerDestroy | void xTcpListenerDestroy(xTcpListener listener) | Stop listening and free resources |
xTcpListenerConf Fields
| Field | Type | Default | Description |
|---|---|---|---|
tls_ctx | xTlsCtx | NULL | TLS context from xTlsCtxCreate(); NULL for plain TCP |
backlog | int | 128 | listen() backlog |
reuseport | int | 0 | Set SO_REUSEPORT if non-zero |
xTcpListenerFunc
typedef void (*xTcpListenerFunc)(xTcpListener listener, xTcpConn conn,
const struct sockaddr *addr, socklen_t addrlen,
void *arg);
Invoked for each accepted connection. The callback takes ownership of `conn`.
Usage Examples
Echo Server
#include <string.h>
#include <xbase/event.h>
#include <xbase/socket.h>
#include <xnet/tcp.h>
static void on_conn_event(xSocket sock, xEventMask mask, void *arg) {
xTcpConn conn = (xTcpConn)arg;
(void)sock;
if (mask & xEvent_Read) {
char buf[4096];
ssize_t n = xTcpConnRecv(conn, buf, sizeof(buf));
if (n > 0) {
xTcpConnSend(conn, buf, (size_t)n);
} else {
/* EOF or error: close */
xTcpConnClose(xSocketLoop(sock), conn);
}
}
}
static void on_accept(xTcpListener listener, xTcpConn conn,
const struct sockaddr *addr, socklen_t addrlen,
void *arg) {
(void)listener; (void)addr; (void)addrlen; (void)arg;
/* Register our own event callback on the connection's socket */
xSocket sock = xTcpConnSocket(conn);
xSocketSetCallback(sock, on_conn_event, conn);
/* Socket is already registered for xEvent_Read by default */
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xTcpListener listener =
xTcpListenerCreate(loop, "0.0.0.0", 8080, NULL, on_accept, NULL);
if (!listener) return 1;
xEventLoopRun(loop);
xTcpListenerDestroy(listener);
xEventLoopDestroy(loop);
return 0;
}
Async Client
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xbase/socket.h>
#include <xnet/tcp.h>
static void on_response(xSocket sock, xEventMask mask, void *arg) {
xTcpConn conn = (xTcpConn)arg;
xEventLoop loop = (xEventLoop)xSocketLoop(sock);
(void)mask;
char buf[4096];
ssize_t n = xTcpConnRecv(conn, buf, sizeof(buf));
if (n > 0) {
printf("Received: %.*s\n", (int)n, buf);
}
xTcpConnClose(loop, conn);
xEventLoopStop(loop);
}
static void on_connected(xTcpConn conn, xErrno err, void *arg) {
xEventLoop loop = (xEventLoop)arg;
if (err != xErrno_Ok) {
fprintf(stderr, "Connect failed: %d\n", err);
xEventLoopStop(loop);
return;
}
/* Send a request */
const char *msg = "Hello, server!";
xTcpConnSend(conn, msg, strlen(msg));
/* Wait for response */
xSocket sock = xTcpConnSocket(conn);
xSocketSetCallback(sock, on_response, conn);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xTcpConnectConf conf = {0};
conf.nodelay = 1;
xTcpConnect(loop, "127.0.0.1", 8080, &conf, on_connected, loop);
xEventLoopRun(loop);
xEventLoopDestroy(loop);
return 0;
}
TLS Client (auto-create context)
#include <xnet/tcp.h>
#include <xnet/tls.h>
static void on_tls_connected(xTcpConn conn, xErrno err, void *arg) {
if (err != xErrno_Ok) { /* handle error */ return; }
/* TLS is already established — Recv/Send are transparently encrypted */
const char *msg = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n";
xTcpConnSend(conn, msg, strlen(msg));
/* ... register read callback ... */
}
void connect_tls(xEventLoop loop) {
xTlsConf tls = {0};
tls.ca = "/etc/ssl/certs/ca-certificates.crt";
xTcpConnectConf conf = {0};
conf.tls = &tls;
xTcpConnect(loop, "example.com", 443, &conf, on_tls_connected, loop);
}
TLS Client (shared context)
When making many connections to the same server, share an `xTlsCtx` to avoid reloading certificates each time:
#include <xnet/tcp.h>
#include <xnet/tls.h>
static void on_connected(xTcpConn conn, xErrno err, void *arg) {
if (err != xErrno_Ok) { /* handle error */ return; }
/* ... use conn ... */
}
void connect_with_shared_ctx(xEventLoop loop) {
// Create once, reuse for all connections
xTlsConf tls = {0};
tls.ca = "ca.pem";
xTlsCtx ctx = xTlsCtxCreate(&tls);
xTcpConnectConf conf = {0};
conf.tls_ctx = ctx; // shared, not owned by connector
xTcpConnect(loop, "example.com", 443, &conf, on_connected, loop);
xTcpConnect(loop, "example.com", 443, &conf, on_connected, loop);
// ... later, after all connections are closed ...
xTlsCtxDestroy(ctx);
}
TLS Server
#include <xnet/tcp.h>
#include <xnet/transport.h>
void start_tls_server(xEventLoop loop) {
xTlsConf tls_conf = {
.cert = "server.pem",
.key = "server-key.pem",
};
xTlsCtx tls_ctx = xTlsCtxCreate(&tls_conf);
xTcpListenerConf conf = {0};
conf.tls_ctx = tls_ctx;
xTcpListener listener =
xTcpListenerCreate(loop, "0.0.0.0", 8443, &conf, on_accept, NULL);
/* ... run event loop ... */
xTcpListenerDestroy(listener);
xTlsCtxDestroy(tls_ctx);
}
Ownership Transfer (Protocol Upgrade)
/* After receiving an HTTP upgrade response on a TCP connection,
* extract the socket and transport for the new protocol layer. */
xSocket sock = xTcpConnTakeSocket(conn);
xTransport tp = xTcpConnTakeTransport(conn);
/* Close the empty conn shell (no-op on resources) */
xTcpConnClose(loop, conn);
/* sock and tp are now owned by the new protocol handler */
Thread Safety
| Operation | Thread Safety |
|---|---|
xTcpConnect() | Call from event loop thread only |
xTcpListenerCreate() | Call from event loop thread only |
xTcpListenerDestroy() | Call from event loop thread only |
xTcpConnRecv/Send/SendIov() | Call from event loop thread only |
xTcpConnClose() | Call from event loop thread only |
xTcpConnectFunc callback | Always invoked on event loop thread |
xTcpListenerFunc callback | Always invoked on event loop thread |
Error Handling
| Scenario | Behavior |
|---|---|
NULL loop, host, or callback in xTcpConnect | Returns xErrno_InvalidArg |
| DNS resolution failure | Callback receives xErrno_DnsError or xErrno_DnsNotFound |
connect() failure | Callback receives xErrno_SysError |
| TLS handshake failure | Callback receives xErrno_SysError |
| Connect timeout | Callback receives xErrno_Timeout |
xTcpListenerCreate bind/listen failure | Returns NULL |
xTcpConnRecv/Send on NULL conn | Returns -1 |
xTcpConnClose(loop, NULL) | No-op (safe) |
xTcpListenerDestroy(NULL) | No-op (safe) |
Best Practices
- Always close connections with `xTcpConnClose()` — it destroys the transport (TLS cleanup), removes the socket from the event loop, closes the fd, and frees the conn.
- Register your own `xSocketFunc` on the connection's socket via `xSocketSetCallback()` to receive read/write events, then use `xTcpConnRecv`/`xTcpConnSend` inside the callback.
- Use `xTcpConnSendIov` for multi-buffer writes (e.g. header + body) to avoid copying into a single buffer.
- Set `nodelay = 1` in `xTcpConnectConf` for latency-sensitive protocols (HTTP, WebSocket).
- Use `xTcpConnTakeSocket`/`xTcpConnTakeTransport` when upgrading protocols (e.g. HTTP → WebSocket) to avoid double-free.
- Cancel or close before freeing context — if you destroy the object that owns the connect callback context, ensure the connection attempt has completed or timed out first.
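Since the plain-TCP transport maps `SendIov` to `writev(2)` (see Transport Transparency above), the header+body pattern can be shown with raw POSIX `writev()`. The `write_header_body` helper is a standalone illustration; with an `xTcpConn` you would pass the same iovec array to `xTcpConnSendIov`:

```c
#include <stddef.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write a header and body in one syscall instead of copying
 * them into a single buffer. Returns bytes written, -1 on error.
 * (A robust sender must still handle short writes on sockets.) */
static ssize_t write_header_body(int fd,
                                 const char *hdr, size_t hlen,
                                 const char *body, size_t blen) {
    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = hlen },
        { .iov_base = (void *)body, .iov_len = blen },
    };
    return writev(fd, iov, 2);
}
```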
tls.h — TLS Configuration Types
Introduction
tls.h defines xTlsConf, the unified TLS configuration structure shared across xKit modules, and xTlsCtx, the opaque handle to a server-level TLS context. It controls certificate loading, peer verification, and optional ALPN negotiation for both client-side and server-side TLS. These are the central TLS abstractions — the actual TLS handshake is handled by the TLS backend (OpenSSL or mbedTLS) in the transport layer.
Design Philosophy
- **Backend-Agnostic** — The config struct contains only file paths and flags. It works identically whether the TLS backend is OpenSSL or mbedTLS.
- **Zero-Initialize for Defaults** — A zero-initialized `xTlsConf` uses the system CA bundle with full peer and host verification enabled. This is the secure default for both client and server.
- **Unified Client/Server** — A single `xTlsConf` struct serves both roles. Client-only fields (`key_password`) and server-only fields (`alpn`) are simply left as `NULL`/zero when unused.
- **Separation of Concerns** — TLS configuration is defined in xnet (the networking primitives layer) and consumed by xhttp (the HTTP layer). This avoids circular dependencies and allows future modules to reuse the same types.
API Reference
xTlsConf
Unified TLS configuration for both client and server.
| Field | Type | Default | Description |
|---|---|---|---|
cert | const char * | NULL (none) | Path to PEM certificate file |
key | const char * | NULL (none) | Path to PEM private key file |
ca | const char * | NULL (system CA) | Path to CA certificate file |
key_password | const char * | NULL (none) | Private key password (client-side) |
alpn | const char ** | NULL (none) | NULL-terminated ALPN protocol list (server-side) |
skip_verify | int | 0 (verify) | Non-zero to skip peer & host verification |
Backward-compatible aliases: xTlsClientConf and xTlsServerConf are typedef'd to xTlsConf.
xTlsCtx
Opaque handle to a shared TLS context. Created by xTlsCtxCreate(), used by both server-side listeners (xTcpListenerConf.tls_ctx) and client-side connectors (xTcpConnectConf.tls_ctx, xWsConnectConf.tls_ctx). Shared across all connections that use the same context. Destroyed by xTlsCtxDestroy(). Supports certificate hot-reload via xTlsCtxReload().
xTlsCtxCreate
xTlsCtx xTlsCtxCreate(const xTlsConf *conf);
Create a shared TLS context. Loads the certificate (if provided), private key (if provided), optional CA, and optional ALPN list. The returned context can be shared across all connections that use the same TLS configuration.
- conf — TLS configuration (must not be NULL). For server-side use, cert and key are required. For client-side use, only ca (or defaults) is needed.
- Returns a TLS context handle, or NULL on failure.
xTlsCtxDestroy
void xTlsCtxDestroy(xTlsCtx ctx);
Destroy a shared TLS context and release all resources. Safe to call with NULL (no-op). Must only be called after all connections using this context have been closed.
xTlsCtxReload
int xTlsCtxReload(xTlsCtx ctx, const xTlsConf *conf);
Hot-reload certificates for an existing TLS context. Atomically replaces the certificate, private key, and optional CA. Existing connections are not affected; only new connections will use the updated certificates.
- ctx — TLS context to reload (must not be NULL).
- conf — New TLS configuration (must not be NULL; cert and key must not be NULL).
- Returns 0 on success, -1 on failure (context unchanged).
Example: Certificate hot-reload
// Initial setup
xTlsConf tls = {
    .cert = "server.pem",
    .key = "server-key.pem",
    .alpn = (const char *[]){"h2", "http/1.1", NULL},
};
xTlsCtx ctx = xTlsCtxCreate(&tls);

// ... later, when certificates are renewed ...
xTlsConf new_tls = {
    .cert = "server-new.pem",
    .key = "server-key-new.pem",
    .alpn = (const char *[]){"h2", "http/1.1", NULL},
};
if (xTlsCtxReload(ctx, &new_tls) == 0) {
    // New connections will use the updated certificates
}
One-Way TLS (Client Verifies Server)
#include <xnet/tls.h>
#include <xhttp/client.h>
// Use system CA bundle (zero-init)
xTlsConf tls = {0};
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);
// Or specify a CA file
xTlsConf tls_ca = {0};
tls_ca.ca = "ca.pem";
xHttpClientConf conf_ca = {.tls = &tls_ca};
xHttpClient client2 = xHttpClientCreate(loop, &conf_ca);
Skip Verification (Development Only)
xTlsConf tls = {0};
tls.skip_verify = 1; // DANGER: disables all checks
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);
Mutual TLS (mTLS)
// Server: require client certificate (default: verify enabled)
xTlsConf server_tls = {
    .cert = "server.pem",
    .key = "server-key.pem",
    .ca = "ca.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &server_tls);
// Client: present certificate
xTlsConf client_tls = {0};
client_tls.ca = "ca.pem";
client_tls.cert = "client.pem";
client_tls.key = "client-key.pem";
xHttpClientConf client_conf = {
    .tls = &client_tls,
};
xHttpClient client = xHttpClientCreate(loop, &client_conf);
Password-Protected Private Key
xTlsConf tls = {0};
tls.ca = "ca.pem";
tls.cert = "client.pem";
tls.key = "client-key-enc.pem";
tls.key_password = "my-secret";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);
Relationship with Other Modules
- xnet — xTlsCtxCreate() / xTlsCtxDestroy() / xTlsCtxReload() are declared in tls.h and implemented in the TLS backend files (transport_openssl.c, transport_mbedtls.c). The TCP listener uses xTlsCtx via xTcpListenerConf.tls_ctx, and the TCP connector uses it via xTcpConnectConf.tls_ctx.
- xhttp — The HTTP server calls xTlsCtxCreate() internally when xHttpServerListenTls() is invoked, automatically setting ALPN to {"h2", "http/1.1"}. The HTTP client uses libcurl for TLS management and consumes xTlsConf directly. The WebSocket client supports both xTlsConf (auto-creates a context) and a pre-created xTlsCtx (shared across connections) via xWsConnectConf.tls_ctx. See the TLS Deployment Guide for end-to-end examples.
Security Notes
- Never use skip_verify = 1 in production. It disables all certificate validation.
- Keep private keys secure. Use restrictive file permissions (chmod 600).
- For mTLS, set ca to the signing CA on the server side. Zero-initialized skip_verify means verification is enabled by default.
- The config struct does not copy strings. The caller must ensure that file path strings remain valid until xHttpClientCreate() or xHttpServerListenTls() returns (the library deep-copies them internally).
xhttp — Asynchronous HTTP
Introduction
xhttp is xKit's HTTP module, providing a fully asynchronous HTTP client and server, both powered by xbase's event loop.
- The client uses libcurl's multi-socket API for non-blocking HTTP requests and SSE streaming — ideal for integrating with REST APIs and LLM streaming endpoints. Supports TLS configuration including custom CA certificates, mutual TLS (mTLS), and certificate verification control via xTlsConf.
- The server uses an xHttpProto vtable interface for protocol-abstracted parsing, supporting both HTTP/1.1 (llhttp) and HTTP/2 (nghttp2, h2c Prior Knowledge) on the same port. TLS listeners are supported via xHttpServerListenTls with xTlsConf. Single-threaded, event-driven connection handling — ideal for building lightweight HTTP services and APIs.
- WebSocket support includes both server and client. On the server side, call xWsUpgrade() inside a regular HTTP handler to perform the RFC 6455 upgrade handshake. On the client side, use xWsConnect() to establish an async WebSocket connection to a remote endpoint. The library handles frame codec, ping/pong, fragment reassembly, and close negotiation automatically for both sides.
Design Philosophy
- Event Loop Integration — Instead of blocking threads, xhttp registers libcurl's sockets with xEventLoop and uses event-driven I/O. All callbacks are dispatched on the event loop thread, eliminating the need for synchronization.
- Vtable-Based Request Polymorphism — Internally, different request types (oneshot HTTP, SSE streaming) share the same curl multi handle but use different vtables for completion and cleanup. This avoids code duplication while supporting diverse response handling patterns.
- Zero-Copy Response Delivery — Response headers and body are accumulated in xBuffer instances and delivered to the callback as pointers. No extra copies are made.
- Automatic Resource Management — Request contexts, curl easy handles, and buffers are automatically cleaned up after the completion callback returns. In-flight requests are cancelled with error callbacks when the client is destroyed.
Architecture
graph TD
subgraph "Application"
APP["User Code"]
end
subgraph "xhttp"
CLIENT["xHttpClient"]
TLS_CLI["TLS Config<br/>(xTlsConf)"]
ONESHOT["Oneshot Request<br/>(GET/POST/Do)"]
SSE["SSE Request<br/>(GetSse/DoSse)"]
PARSER["SSE Parser<br/>(W3C spec)"]
end
subgraph "libcurl"
MULTI["curl_multi"]
EASY1["curl_easy (req 1)"]
EASY2["curl_easy (req 2)"]
end
subgraph "xbase"
LOOP["xEventLoop"]
TIMER["Timer<br/>(curl timeout)"]
FD["FD Events<br/>(socket I/O)"]
end
APP -->|"xHttpClientGet/Post/Do"| ONESHOT
APP -->|"xHttpClientGetSse/DoSse"| SSE
APP -->|"xHttpClientConf.tls"| TLS_CLI
SSE --> PARSER
ONESHOT --> CLIENT
SSE --> CLIENT
TLS_CLI --> CLIENT
CLIENT --> MULTI
MULTI --> EASY1
MULTI --> EASY2
MULTI -->|"CURLMOPT_SOCKETFUNCTION"| FD
MULTI -->|"CURLMOPT_TIMERFUNCTION"| TIMER
FD --> LOOP
TIMER --> LOOP
style CLIENT fill:#4a90d9,color:#fff
style LOOP fill:#50b86c,color:#fff
style MULTI fill:#f5a623,color:#fff
Sub-Module Overview
| File | Description | Doc |
|---|---|---|
server.h | Async HTTP/1.1 & HTTP/2 server (routing, request/response, protocol-abstracted parsing) | server.md |
client.h | Async HTTP client API (GET, POST, Do, SSE, TLS configuration) | client.md |
sse.c | SSE stream parser and request handler | sse.md |
ws.h (server) | WebSocket server API (upgrade, send, close, callbacks) | ws_server.md |
ws.h (client) | WebSocket client API (connect, send, close, callbacks) | ws_client.md |
| (guide) | TLS deployment guide (certificate generation, one-way TLS, mTLS, troubleshooting) | tls.md |
Quick Start
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    if (resp->curl_code == 0) {
        printf("Status: %ld\n", resp->status_code);
        printf("Body: %.*s\n", (int)resp->body_len, resp->body);
    } else {
        printf("Error: %s\n", resp->curl_error);
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);
    xHttpClientGet(client, "https://httpbin.org/get", on_response, NULL);
    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}
Relationship with Other Modules
- xbase — Uses xEventLoop for I/O multiplexing and xEventLoopTimerAfter for curl timeout management.
- xbuf — Uses xBuffer for response header and body accumulation.
- libcurl — External dependency (client). Uses the multi-socket API (curl_multi_socket_action) for non-blocking HTTP.
- llhttp — External dependency (server). Provides incremental HTTP/1.1 request parsing, isolated behind the xHttpProto vtable in proto_h1.c.
- nghttp2 — External dependency (server). Provides HTTP/2 frame processing and HPACK header compression, isolated behind the xHttpProto vtable in proto_h2.c.
client.h — Asynchronous HTTP Client
Introduction
client.h provides xHttpClient, an asynchronous HTTP client that integrates libcurl's multi-socket API with xbase's event loop. All network I/O is non-blocking and driven by the event loop; completion callbacks are dispatched on the event loop thread. The client supports GET, POST, PUT, DELETE, PATCH, HEAD methods and Server-Sent Events (SSE) streaming.
Design Philosophy
- libcurl Multi-Socket Integration — Rather than using libcurl's easy (blocking) API or multi-perform (polling) API, xhttp uses the multi-socket API (CURLMOPT_SOCKETFUNCTION + CURLMOPT_TIMERFUNCTION). This allows libcurl to delegate socket monitoring to xEventLoop, achieving true event-driven I/O without dedicated threads.
- Single-Threaded Callback Model — All callbacks (response, SSE events, done) are invoked on the event loop thread. No locks are needed in callback code.
- Vtable-Based Polymorphism — Internally, each request carries a vtable (xHttpReqVtable) with on_done and on_cleanup function pointers. Oneshot requests and SSE requests use different vtables, sharing the same curl multi handle and completion infrastructure.
- Automatic Body Copy — POST/PUT request bodies are copied internally (malloc + memcpy), so the caller doesn't need to keep the body alive after submitting the request.
Architecture
graph TD
subgraph xHttpClientInternal[xHttpClient Internal]
MULTI[curl multi handle]
TIMER_CB[timer callback - CURLMOPT TIMERFUNCTION]
SOCKET_CB[socket callback - CURLMOPT SOCKETFUNCTION]
CHECK[check multi info]
end
subgraph PerRequest[Per Request]
REQ[xHttpReq]
EASY[curl easy handle]
BODY[xBuffer body]
HDR[xBuffer headers]
VT[vtable - oneshot or SSE]
end
subgraph xbaseEventLoop[xbase Event Loop]
LOOP[xEventLoop]
FD_EVT[FD events]
TIMER_EVT[Timer events]
end
SOCKET_CB --> FD_EVT
TIMER_CB --> TIMER_EVT
FD_EVT --> LOOP
TIMER_EVT --> LOOP
LOOP -->|fd ready| CHECK
LOOP -->|timeout| CHECK
CHECK --> VT
VT -->|on done| APP[User Callback]
REQ --> EASY
REQ --> BODY
REQ --> HDR
REQ --> VT
style MULTI fill:#f5a623,color:#fff
style LOOP fill:#50b86c,color:#fff
Implementation Details
libcurl + xEventLoop Integration
sequenceDiagram
participant App as Application
participant Client as xHttpClient
participant Curl as CurlMulti
participant L as xEventLoop
App->>Client: xHttpClientGet url cb
Client->>Curl: curl multi add handle
Curl->>Client: socket callback fd POLL IN
Client->>L: xEventAdd fd Read
Note over L: Event loop polls
L->>Client: fd ready callback
Client->>Curl: curl multi socket action
Curl->>Client: write callback data
Client->>Client: xBufferAppend body buf data
Note over Curl: Transfer complete
Client->>Client: check multi info
Client->>App: on response resp
Socket Callback Flow
When libcurl needs to monitor a socket, it calls socket_callback:
- CURL_POLL_REMOVE — Unregister the fd from the event loop (xEventDel).
- CURL_POLL_IN / OUT / INOUT — Register or update the fd with the event loop (xEventAdd / xEventMod).
Each socket gets an xHttpSocketCtx_ that maps the fd to the client and event source.
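The mapping above can be sketched as a pure decision function. This is a minimal illustration, not xhttp's actual implementation: the CURL_POLL_* values are mirrored locally so the sketch compiles without libcurl, and the EvAction names are invented for the example.

```c
/* Mirror of libcurl's CURL_POLL_* action values (see curl/curl.h). */
enum { POLL_IN = 1, POLL_OUT = 2, POLL_INOUT = 3, POLL_REMOVE = 4 };

/* What the event loop should be asked to do for this fd (invented names). */
typedef enum { EV_NONE, EV_WATCH_READ, EV_WATCH_WRITE, EV_WATCH_RW, EV_UNWATCH } EvAction;

/* Decide the event-loop operation for a libcurl socket action.
 * In xhttp this decision drives xEventAdd/xEventMod (register/update)
 * or xEventDel (unregister); here it just returns the choice. */
EvAction socket_action_to_event(int what) {
    switch (what) {
    case POLL_IN:     return EV_WATCH_READ;
    case POLL_OUT:    return EV_WATCH_WRITE;
    case POLL_INOUT:  return EV_WATCH_RW;
    case POLL_REMOVE: return EV_UNWATCH;
    default:          return EV_NONE;
    }
}
```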
Timer Callback Flow
When libcurl needs a timeout:
- timeout_ms == -1 — Cancel any existing timer.
- timeout_ms == 0 — Schedule a 1 ms timer (deferred to avoid reentrant curl_multi_socket_action).
- timeout_ms > 0 — Schedule a timer via xEventLoopTimerAfter.
When the timer fires, curl_multi_socket_action(CURL_SOCKET_TIMEOUT) is called.
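The three cases reduce to a small decision function. A minimal sketch follows; the TimerOp type and function name are invented for illustration (xhttp's real callback then schedules via xEventLoopTimerAfter).

```c
/* Outcome of libcurl's CURLMOPT_TIMERFUNCTION request (invented names). */
typedef enum { TIMER_CANCEL, TIMER_SCHEDULE } TimerOp;

/* Map libcurl's requested timeout to an event-loop timer operation.
 * timeout_ms == -1 cancels any pending timer; timeout_ms == 0 is
 * deferred by 1 ms so curl_multi_socket_action is never re-entered
 * from inside a libcurl callback. */
TimerOp curl_timeout_to_op(long timeout_ms, long *delay_ms) {
    if (timeout_ms < 0)
        return TIMER_CANCEL;
    *delay_ms = (timeout_ms == 0) ? 1 : timeout_ms;
    return TIMER_SCHEDULE;
}
```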
Request Lifecycle
stateDiagram-v2
[*] --> Created: xHttpClientGet/Post/Do
Created --> Submitted: curl_multi_add_handle
Submitted --> InFlight: Event loop drives I/O
InFlight --> Completed: curl reports CURLMSG_DONE
Completed --> CallbackInvoked: on_response(resp)
CallbackInvoked --> CleanedUp: free buffers + easy handle
CleanedUp --> [*]
InFlight --> Aborted: xHttpClientDestroy
Aborted --> CallbackInvoked: on_response(error)
Response Structure
XDEF_STRUCT(xHttpResponse) {
    long status_code;        // HTTP status (200, 404, etc.), 0 on failure
    const char *headers;     // Raw headers (NUL-terminated)
    size_t headers_len;
    const char *body;        // Response body (NUL-terminated)
    size_t body_len;
    int curl_code;           // CURLcode (0 = success)
    const char *curl_error;  // Human-readable error, or NULL
};
All pointers are valid only during the callback. The library manages their lifetime.
API Reference
Types
| Type | Description |
|---|---|
xHttpClient | Opaque handle to an HTTP client bound to an event loop |
xHttpClientConf | Configuration struct for creating a client (TLS, HTTP version) |
xHttpResponse | Response data delivered to the completion callback |
xHttpResponseFunc | void (*)(const xHttpResponse *resp, void *arg) |
xHttpMethod | Enum: GET, POST, PUT, DELETE, PATCH, HEAD |
xHttpRequestConf | Configuration struct for generic requests |
xSseEvent | SSE event data delivered to the event callback |
xSseEventFunc | int (*)(const xSseEvent *ev, void *arg) — return 0 to continue, non-zero to close |
xSseDoneFunc | void (*)(int curl_code, void *arg) |
xTlsConf | TLS configuration for the client (CA path, client cert/key, skip verify) |
Lifecycle
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xHttpClientCreate | xHttpClient xHttpClientCreate(xEventLoop loop, const xHttpClientConf *conf) | Create a client bound to an event loop. Pass NULL for defaults. | Not thread-safe |
xHttpClientDestroy | void xHttpClientDestroy(xHttpClient client) | Destroy client. In-flight requests get error callbacks. | Not thread-safe |
TLS Configuration
TLS is configured at client creation time via xHttpClientConf. The xTlsConf fields are deep-copied internally; the caller does not need to keep them alive after creation.
xTlsConf Fields (Client)
| Field | Type | Description |
|---|---|---|
ca | const char * | Path to a CA certificate file for server verification. When set, the system CA bundle is bypassed. |
cert | const char * | Path to a client certificate file (PEM) for mutual TLS (mTLS). |
key | const char * | Path to the client private key file (PEM) for mTLS. |
key_password | const char * | Passphrase for an encrypted client private key. |
skip_verify | int | If non-zero, skip server certificate verification (useful for self-signed certs in development). |
All string fields are deep-copied internally; the caller does not need to keep them alive after the call.
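The deep-copy semantics can be illustrated with a standalone sketch. The TlsConfCopy type and helper names below are hypothetical, not part of the xKit API; they only demonstrate why the caller's strings may be freed as soon as the create call returns.

```c
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the string fields of xTlsConf (invented type). */
typedef struct {
    char *ca, *cert, *key, *key_password;
    int skip_verify;
} TlsConfCopy;

/* Duplicate a string, or propagate NULL (the "field unused" default). */
char *dup_or_null(const char *s) {
    if (!s) return NULL;
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p) memcpy(p, s, n);
    return p;
}

/* Deep-copy the caller's configuration: after this returns, the
 * originals may be modified or freed without affecting the copy. */
TlsConfCopy tls_conf_copy(const char *ca, const char *cert,
                          const char *key, const char *password,
                          int skip_verify) {
    TlsConfCopy c;
    c.ca = dup_or_null(ca);
    c.cert = dup_or_null(cert);
    c.key = dup_or_null(key);
    c.key_password = dup_or_null(password);
    c.skip_verify = skip_verify;
    return c;
}

void tls_conf_free(TlsConfCopy *c) {
    free(c->ca); free(c->cert); free(c->key); free(c->key_password);
}
```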
Convenience Requests
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xHttpClientGet | xErrno xHttpClientGet(xHttpClient client, const char *url, xHttpResponseFunc on_response, void *arg) | Async GET request. | Not thread-safe |
xHttpClientPost | xErrno xHttpClientPost(xHttpClient client, const char *url, const char *body, size_t body_len, xHttpResponseFunc on_response, void *arg) | Async POST request. Body is copied internally. | Not thread-safe |
Generic Request
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xHttpClientDo | xErrno xHttpClientDo(xHttpClient client, const xHttpRequestConf *config, xHttpResponseFunc on_response, void *arg) | Fully-configured async request. | Not thread-safe |
SSE Requests
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
xHttpClientGetSse | xErrno xHttpClientGetSse(xHttpClient client, const char *url, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Subscribe to SSE endpoint (GET). | Not thread-safe |
xHttpClientDoSse | xErrno xHttpClientDoSse(xHttpClient client, const xHttpRequestConf *config, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Fully-configured SSE request (e.g., POST for LLM APIs). | Not thread-safe |
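An SSE stream is a sequence of "field: value" lines, with events separated by blank lines. As a rough illustration of the kind of work the built-in parser does, here is a minimal helper that pulls the first data field out of one event block; the helper is invented for this example and follows the W3C rule that a single space after the colon is stripped.

```c
#include <string.h>

/* Extract the value of the first "data:" field from one SSE event
 * block (a run of "field: value\n" lines). Copies at most cap-1
 * bytes into out, NUL-terminates, and returns the value length. */
size_t sse_first_data(const char *event, char *out, size_t cap) {
    const char *line = event;
    while (line && *line) {
        if (strncmp(line, "data:", 5) == 0) {
            const char *v = line + 5;
            if (*v == ' ') v++;              /* spec: one optional space */
            const char *end = strchr(v, '\n');
            size_t n = end ? (size_t)(end - v) : strlen(v);
            if (n >= cap) n = cap - 1;
            memcpy(out, v, n);
            out[n] = '\0';
            return n;
        }
        const char *nl = strchr(line, '\n'); /* skip non-data field */
        line = nl ? nl + 1 : NULL;
    }
    if (cap) out[0] = '\0';
    return 0;
}
```

The real parser also handles multi-line data fields, event/id/retry fields, and chunked delivery across network reads; see sse.md.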
Usage Examples
Simple GET Request
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    if (resp->curl_code == 0) {
        printf("HTTP %ld\n", resp->status_code);
        printf("%.*s\n", (int)resp->body_len, resp->body);
    } else {
        printf("Error: %s\n", resp->curl_error);
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);
    xHttpClientGet(client, "https://httpbin.org/get", on_response, NULL);
    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}
HTTPS with TLS Configuration
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    printf("Status: %ld\n", resp->status_code);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Skip certificate verification (dev only)
    xTlsConf tls = {0};
    tls.skip_verify = 1;
    xHttpClientConf conf = {.tls = &tls};
    xHttpClient client = xHttpClientCreate(loop, &conf);

    xHttpClientGet(client, "https://secure.example.com/api", on_response, NULL);
    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}
POST with Custom Headers
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    printf("Status: %ld, Body: %.*s\n",
           resp->status_code, (int)resp->body_len, resp->body);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);
    const char *headers[] = {
        "Content-Type: application/json",
        "Authorization: Bearer token123",
        NULL
    };
    xHttpRequestConf config = {
        .url = "https://api.example.com/data",
        .method = xHttpMethod_POST,
        .body = "{\"key\": \"value\"}",
        .body_len = 16,
        .headers = headers,
        .timeout_ms = 5000,
    };
    xHttpClientDo(client, &config, on_response, NULL);
    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}
Use Cases
- REST API Integration — Make async HTTP calls to microservices, cloud APIs, or webhooks from an event-driven C application.
- Secure Communication — Pass TLS config via xHttpClientConf at creation time to configure custom CA certificates, client certificates for mTLS, or skip verification for development environments with self-signed certs.
- LLM API Calls — Use xHttpClientDoSse() with POST method and JSON body to stream responses from OpenAI, Anthropic, or other LLM APIs. See sse.md for a complete example.
- Health Checks / Monitoring — Periodically poll HTTP endpoints using timer-driven GET requests within the event loop.
Best Practices
- Don't block in callbacks. Callbacks run on the event loop thread. Blocking delays all other I/O.
- Copy data you need to keep. Response pointers (body, headers) are only valid during the callback.
- Use xHttpClientDo() for complex requests. The convenience helpers (Get/Post) are for simple cases; Do gives full control over method, headers, body, and timeout.
- Destroy the client before the event loop. xHttpClientDestroy() cancels in-flight requests and invokes their callbacks with error status.
- Check curl_code first. A curl_code of 0 means the HTTP transfer succeeded; then check status_code for the HTTP-level result.
- Never use skip_verify in production. It disables all certificate validation. Use a proper CA path or the system CA bundle instead.
- TLS config is set at creation time. Pass xHttpClientConf with TLS settings when creating the client; it affects both oneshot and SSE requests. To change TLS config, destroy and recreate the client.
Comparison with Other Libraries
| Feature | xhttp client.h | libcurl easy API | cpp-httplib | Python requests |
|---|---|---|---|---|
| I/O Model | Async (event loop) | Blocking | Blocking | Blocking |
| Event Loop | xEventLoop integration | None (or manual multi) | None | None (asyncio separate) |
| SSE Support | Built-in (GetSse/DoSse) | Manual parsing | No | No (needs sseclient) |
| TLS Config | xHttpClientConf.tls at creation | curl_easy_setopt (manual) | Built-in | verify/cert params |
| Thread Model | Single-threaded callbacks | One thread per request | One thread per request | One thread per request |
| Memory | Automatic (xBuffer) | Manual (WRITEFUNCTION) | Automatic (std::string) | Automatic (Python GC) |
| Language | C99 | C | C++ | Python |
Key Differentiator: xhttp provides true event-loop-integrated async HTTP with built-in SSE support. Unlike libcurl's easy API (which blocks) or multi-perform API (which requires polling), xhttp uses the multi-socket API for zero-overhead integration with xEventLoop. The built-in SSE parser makes it uniquely suited for LLM API integration from C.
server.h — Asynchronous HTTP/1.1 & HTTP/2 Server
Introduction
server.h provides xHttpServer, an asynchronous, non-blocking HTTP server powered by xbase's event loop. The server supports both HTTP/1.1 and HTTP/2 (h2c, cleartext) on the same port, with automatic protocol detection via Prior Knowledge. The protocol parsing layer is abstracted behind an xHttpProto vtable interface — HTTP/1.1 uses llhttp, HTTP/2 uses nghttp2. All connection handling, request parsing, and response sending are driven by the event loop on a single thread — no locks or thread pools required. The server supports routing, keep-alive, configurable limits, automatic error responses, and TLS/HTTPS via xHttpServerListenTls() with pluggable TLS backends (OpenSSL or Mbed TLS).
Design Philosophy
- Single-Threaded Event-Driven I/O — The server registers listening and client sockets with xEventLoop. Accept, read, parse, dispatch, and write all happen on the event loop thread, eliminating synchronization overhead.
- Protocol-Abstracted Parsing — Request parsing is delegated to a protocol handler behind the xHttpProto vtable interface. HTTP/1.1 (proto_h1.c) uses llhttp; HTTP/2 (proto_h2.c) uses nghttp2. Incremental callbacks accumulate URL, headers, and body into xBuffer instances. This abstraction allows both protocols to share the same connection management, routing, and response serialization layers.
- Automatic Protocol Detection — On each new connection, the server inspects the first bytes of incoming data. If the 24-byte HTTP/2 connection preface (PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n) is detected, the connection is upgraded to HTTP/2; otherwise, HTTP/1.1 is used. This enables h2c (cleartext HTTP/2) via Prior Knowledge — ideal for internal service-to-service communication.
- First-Match Routing — Routes are registered as pattern strings (e.g. "GET /users/:id" or "/any") and matched in registration order. If the pattern starts with /, it matches any HTTP method; otherwise the first token is the method. Path patterns support both exact segments and :param segments.
- Writer-Based Response API — Handlers receive an xHttpResponseWriter handle to set status, headers, and body. The response is serialized into an xIOBuffer and flushed asynchronously, with backpressure handled automatically.
- Defensive Limits — Configurable limits on header size (default 8 KiB), body size (default 1 MiB), and idle timeout (default 60 s) protect against slow clients and oversized payloads. Violations produce appropriate 4xx error responses.
- Pluggable TLS — TLS support is provided via xHttpServerListenTls() with xTlsConf. The TLS backend (OpenSSL or Mbed TLS) is selected at compile time via XK_TLS_BACKEND. ALPN negotiation automatically selects HTTP/1.1 or HTTP/2 over TLS. Mutual TLS (mTLS) is supported when ca is set (verification is enabled by default).
Architecture
graph TD
subgraph "Application"
APP["User Code"]
HANDLER["Handler Callback"]
end
subgraph "xhttp Server"
SERVER["xHttpServer"]
TLS["TLS Layer<br/>(OpenSSL / Mbed TLS)"]
ROUTER["Route Table<br/>(linked list)"]
CONN["xHttpConn_<br/>(per connection)"]
DETECT["Protocol Detection<br/>(Prior Knowledge / ALPN)"]
PROTO["xHttpProto (vtable)"]
PARSER_H1["proto_h1 (llhttp)"]
PARSER_H2["proto_h2 (nghttp2)"]
STREAM["xHttpStream_<br/>(per request)"]
WRITER["xHttpResponseWriter"]
end
subgraph "xbase"
LOOP["xEventLoop"]
SOCK["xSocket"]
TIMER["Idle Timeout"]
end
APP -->|"xHttpServerRoute"| ROUTER
APP -->|"xHttpServerListen<br/>xHttpServerListenTls"| SERVER
SERVER -->|"accept()"| CONN
SERVER -.->|"TLS handshake"| TLS
TLS -.-> CONN
CONN --> DETECT
DETECT -->|"H1"| PARSER_H1
DETECT -->|"H2 preface"| PARSER_H2
PARSER_H1 --> PROTO
PARSER_H2 --> PROTO
PROTO -->|"request complete"| STREAM
STREAM --> ROUTER
ROUTER -->|"first match"| HANDLER
HANDLER -->|"xHttpResponseSend"| WRITER
WRITER --> STREAM
STREAM -->|"H1: xIOBuffer / H2: nghttp2 frames"| CONN
CONN --> SOCK
SOCK --> LOOP
TIMER --> LOOP
style SERVER fill:#4a90d9,color:#fff
style LOOP fill:#50b86c,color:#fff
style PROTO fill:#9b59b6,color:#fff
style PARSER_H1 fill:#f5a623,color:#fff
style PARSER_H2 fill:#e74c3c,color:#fff
style DETECT fill:#1abc9c,color:#fff
style TLS fill:#2ecc71,color:#fff
Implementation Details
Connection Lifecycle
stateDiagram-v2
[*] --> Accepted: accept() on listen fd
Accepted --> Reading: xSocket registered (Read)
Reading --> Parsing: Data received
Parsing --> Dispatching: on_message_complete
Dispatching --> HandlerRunning: Route matched
Dispatching --> ErrorSent: No match (404/405)
HandlerRunning --> ResponseQueued: xHttpResponseSend()
ResponseQueued --> Flushing: conn_try_flush()
Flushing --> KeepAlive: All written + keep-alive
Flushing --> Backpressure: EAGAIN (register Write)
Backpressure --> Flushing: Write event fires
KeepAlive --> Reading: Reset parser state
Flushing --> Closed: All written + !keep-alive
ErrorSent --> Closed: Error responses close connection
Reading --> Closed: Idle timeout
Reading --> Closed: Client disconnect
Reading --> Closed: Parse error (400)
Parsing --> ErrorSent: Header too large (431)
Parsing --> ErrorSent: Body too large (413)
Request Parsing Flow
sequenceDiagram
participant Client
participant Conn as xHttpConn_
participant Proto as xHttpProto (vtable)
participant Parser as proto_h1 (llhttp)
participant Bufs as xBuffer (url/headers/body)
participant Router as Route Table
participant Handler as User Handler
Client->>Conn: TCP data
Conn->>Conn: xIOBufferReadFd()
Conn->>Proto: proto.on_data(data)
Proto->>Parser: llhttp_execute(data)
Parser->>Bufs: on_url → xBufferAppend(url)
Parser->>Bufs: on_header_field → xBufferAppend(headers_raw)
Parser->>Bufs: on_header_value → xBufferAppend(headers_raw)
Parser->>Bufs: on_body → xBufferAppend(body)
Parser->>Proto: on_message_complete → return 1
Proto->>Conn: return 1 (request complete)
Conn->>Router: conn_dispatch_request()
Router->>Handler: handler(writer, req, arg)
Handler->>Conn: xHttpResponseSend(body)
Conn->>Client: HTTP response (async flush)
Routing
Routes are stored in a singly-linked list and matched in registration order (first match wins):
- Path match — Segment-by-segment comparison. Static segments require exact match; :param segments match any non-empty string and capture the value.
- Method match — Case-insensitive comparison (strcasecmp). A pattern without a method prefix (e.g. "/any") matches any HTTP method.
- Fallback — If the path matches but no method matches → 405 Method Not Allowed. If no path matches → 404 Not Found.
- Parameter access — Inside a handler, call xHttpRequestParam(req, "id", &len) to retrieve the captured value.
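The segment-by-segment path rule can be sketched as a standalone matcher. This is an illustration of the algorithm only, not the server's actual code; method matching and parameter capture are omitted for brevity.

```c
#include <string.h>

/* Match a path against a route pattern, segment by segment.
 * ":name" segments match any non-empty path segment; static
 * segments require an exact match. Returns 1 on match, 0 otherwise. */
int route_match(const char *pattern, const char *path) {
    while (*pattern && *path) {
        if (*pattern == '/' && *path == '/') { pattern++; path++; continue; }
        /* Find the end of the current segment on each side. */
        const char *pe = strchr(pattern, '/');
        const char *se = strchr(path, '/');
        size_t plen = pe ? (size_t)(pe - pattern) : strlen(pattern);
        size_t slen = se ? (size_t)(se - path) : strlen(path);
        if (plen == 0 || slen == 0) return 0;
        if (pattern[0] == ':') {
            /* :param segment: any non-empty path segment matches. */
        } else if (plen != slen || strncmp(pattern, path, plen) != 0) {
            return 0;
        }
        pattern += plen;
        path += slen;
    }
    /* Both strings must be fully consumed for a match. */
    return *pattern == '\0' && *path == '\0';
}
```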
Response Serialization
When xHttpResponseSend() is called:
- Status line (
HTTP/1.1 <code> <reason>\r\n) is written to thexIOBuffer. Content-Lengthheader is added automatically.Connection: keep-aliveorConnection: closeis added based on the parser's determination.- User-set headers are appended.
- Header section is terminated with
\r\n. - Body is appended.
conn_try_flush()attempts an immediatewritev(). IfEAGAIN, the socket is registered for write events and flushing continues asynchronously.
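The automatic portion of the head can be sketched as a plain formatting function. The helper name is invented for illustration; it only shows the shape of the serialized bytes, not the server's buffer management.

```c
#include <stdio.h>

/* Format an HTTP/1.1 response head: status line, automatic
 * Content-Length and Connection headers, then the blank line that
 * terminates the header section. Returns bytes written (snprintf). */
int build_response_head(char *out, size_t cap, int code, const char *reason,
                        size_t body_len, int keep_alive) {
    return snprintf(out, cap,
                    "HTTP/1.1 %d %s\r\n"
                    "Content-Length: %zu\r\n"
                    "Connection: %s\r\n"
                    "\r\n",
                    code, reason, body_len,
                    keep_alive ? "keep-alive" : "close");
}
```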
Keep-Alive & Pipelining
- HTTP/1.1 connections default to keep-alive. After a response is fully flushed, proto.reset() is called and the connection waits for the next request.
- The parser is paused in on_message_complete to prevent parsing the next pipelined request before the current response is sent.
- Error responses always set Connection: close.
HTTP/2 Support (h2c Prior Knowledge)
The server supports cleartext HTTP/2 (h2c) via the Prior Knowledge mechanism. HTTP/1.1 and HTTP/2 coexist on the same port — no TLS or Upgrade header required.
Protocol Detection
When a new connection is accepted, protocol detection is deferred until the first bytes arrive:
- If the first 24 bytes match the HTTP/2 connection preface (PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n), xHttpProtoH2Init() is called.
- If the prefix doesn't match, xHttpProtoH1Init() is called.
- If fewer than 24 bytes have arrived but the prefix still matches so far, the server waits for more data before deciding.
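The three-way decision can be expressed as a small pure function. This is an illustrative sketch: the DetectResult names are invented, and in the server the H2/H1 outcomes correspond to calling xHttpProtoH2Init() or xHttpProtoH1Init().

```c
#include <string.h>

#define H2_PREFACE "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
#define H2_PREFACE_LEN 24

/* Invented names for the three detection outcomes. */
typedef enum { DETECT_NEED_MORE, DETECT_H1, DETECT_H2 } DetectResult;

/* Inspect the first bytes of a new connection. A full 24-byte match
 * selects HTTP/2; an early mismatch selects HTTP/1.1; a shorter
 * prefix that still matches asks the caller to wait for more data. */
DetectResult detect_protocol(const char *buf, size_t len) {
    size_t n = len < H2_PREFACE_LEN ? len : H2_PREFACE_LEN;
    if (memcmp(buf, H2_PREFACE, n) != 0)
        return DETECT_H1;
    return len >= H2_PREFACE_LEN ? DETECT_H2 : DETECT_NEED_MORE;
}
```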
Stream Multiplexing
Under HTTP/2, a single TCP connection carries multiple concurrent streams, each representing an independent request/response exchange:
- xHttpStream_ — Per-request state (URL, headers, body, response writer). HTTP/1.1 uses a single implicit stream (stream_id = 0); HTTP/2 creates a new stream for each request.
- Deferred dispatch — Completed streams are queued during nghttp2_session_mem_recv() and dispatched after it returns, avoiding re-entrancy issues.
- Response framing — Responses are submitted via nghttp2_submit_response() with HPACK-compressed headers and DATA frames, then flushed through the connection's write buffer.
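The deferred-dispatch pattern can be sketched with a plain queue. The PendingQueue type and helpers are invented for illustration; in the server, draining the queue after nghttp2_session_mem_recv() returns is where the router is invoked.

```c
#include <stddef.h>

#define MAX_PENDING 16

/* Stream ids that completed inside the recv call (invented type). */
typedef struct {
    int ids[MAX_PENDING];
    size_t count;
} PendingQueue;

/* Called from frame callbacks: remember the completed stream instead
 * of dispatching immediately, avoiding re-entrant session access. */
int queue_completed(PendingQueue *q, int stream_id) {
    if (q->count >= MAX_PENDING) return -1;
    q->ids[q->count++] = stream_id;
    return 0;
}

/* Called after recv returns: drain ids in completion order and clear
 * the queue. (In the server, each id would be routed to its handler.) */
size_t dispatch_pending(PendingQueue *q, int *out) {
    size_t n = q->count;
    for (size_t i = 0; i < n; i++) out[i] = q->ids[i];
    q->count = 0;
    return n;
}
```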
H2 Connection Lifecycle
sequenceDiagram
participant Client
participant Conn as xHttpConn_
participant Detect as Protocol Detection
participant H2 as proto_h2 (nghttp2)
participant Stream as xHttpStream_
participant Router as Route Table
participant Handler as User Handler
Client->>Conn: TCP connect
Client->>Conn: H2 connection preface + SETTINGS
Conn->>Detect: First bytes inspection
Detect->>H2: xHttpProtoH2Init()
H2->>Client: SETTINGS frame (server preface)
Client->>Conn: HEADERS frame (stream 1, :method=GET, :path=/hello)
Conn->>H2: h2_on_data()
H2->>Stream: Create stream (id=1)
H2->>Stream: Accumulate headers
H2->>Router: Dispatch (END_STREAM received)
Router->>Handler: handler(writer, req, arg)
Handler->>Stream: xHttpResponseSend(body)
Stream->>H2: nghttp2_submit_response()
H2->>Client: HEADERS + DATA frames
Key Differences: H1 vs H2
| Feature | HTTP/1.1 (proto_h1) | HTTP/2 (proto_h2) |
|---|---|---|
| Parser | llhttp (byte stream → request) | nghttp2 (byte stream → frame → stream) |
| Multiplexing | None (pipelining at best) | Native, multiple concurrent streams |
| Headers | Plain text Key: Value | HPACK compressed pseudo-headers + regular headers |
| Keep-alive | Connection: keep-alive header | Always persistent (multiplexed) |
| Reset | Per-request proto.reset() | No-op (streams are independent) |
| Response framing | Raw HTTP/1.1 status line + headers + body | nghttp2_submit_response() → HEADERS + DATA frames |
| Flow control | None | Built-in per-stream flow control |
Limitations
- h2 over TLS — TLS-based HTTP/2 (h2 with ALPN) is supported via xHttpServerListenTls(). Cleartext h2c uses Prior Knowledge.
- No server push — HTTP/2 server push is not implemented.
- Streaming responses — xHttpResponseWrite() / xHttpResponseEnd() for HTTP/2 streaming DATA frames is not yet fully implemented.
Idle Timeout
Each connection has an idle timeout (default 60 s). If no data is received within this period, the connection is closed automatically via xEvent_Timeout. The timeout is reset after each response is sent on a keep-alive connection.
API Reference
Types
| Type | Description |
|---|---|
xHttpServer | Opaque handle to an HTTP server bound to an event loop |
xHttpResponseWriter | Opaque handle to a response writer (valid only during handler) |
xHttpRequest | Request data delivered to the handler callback |
xHttpHandlerFunc | void (*)(xHttpResponseWriter writer, const xHttpRequest *req, void *arg) |
xTlsConf | TLS configuration for HTTPS listeners (cert, key, CA, skip_verify) |
xHttpRequest Fields
| Field | Type | Description |
|---|---|---|
method | const char * | HTTP method string (e.g. "GET", "POST") |
url | const char * | Request URL / path (NUL-terminated) |
headers | const char * | Raw request headers (NUL-terminated) |
headers_len | size_t | Length of headers in bytes |
body | const char * | Request body, or NULL if no body |
body_len | size_t | Length of body in bytes |
All pointers are valid only for the duration of the handler callback.
Lifecycle
| Function | Signature | Description |
|---|---|---|
xHttpServerCreate | xHttpServer xHttpServerCreate(xEventLoop loop) | Create a server bound to an event loop. |
xHttpServerListen | xErrno xHttpServerListen(xHttpServer server, const char *host, uint16_t port) | Start listening on the given address and port. |
xHttpServerListenTls | xErrno xHttpServerListenTls(xHttpServer server, const char *host, uint16_t port, const xTlsConf *config) | Start listening for HTTPS connections with TLS. ALPN selects H1/H2. Can coexist with Listen on a different port. Returns xErrno_NotSupported if no TLS backend was compiled. |
xHttpServerDestroy | void xHttpServerDestroy(xHttpServer server) | Destroy server, close all connections, free all routes. |
Route Registration
| Function | Signature | Description |
|---|---|---|
xHttpServerRoute | xErrno xHttpServerRoute(xHttpServer server, const char *pattern, xHttpHandlerFunc handler, void *arg) | Register a route. pattern combines method and path: "GET /users/:id" matches only GET; "/users/:id" matches all methods. Path supports :param segments. First match wins. |
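Because matching is first-match-wins, register literal paths before parameterized ones. A minimal sketch of registration order (the handler names here are hypothetical, assumed to be defined elsewhere):

```c
/* First match wins: register the literal path before the :param route,
 * and the method-specific pattern before the method-agnostic one. */
xHttpServerRoute(server, "GET /users/me", on_current_user, NULL);  /* literal path, GET only */
xHttpServerRoute(server, "GET /users/:id", on_get_user, NULL);     /* :id parameter, GET only */
xHttpServerRoute(server, "/users/:id", on_any_user_method, NULL);  /* all methods */
```

If the parameterized route were registered first, "GET /users/me" would never be reached, because ":id" would match "me".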
Request Parameters
| Function | Signature | Description |
|---|---|---|
xHttpRequestParam | const char *xHttpRequestParam(const xHttpRequest *req, const char *name, size_t *len) | Look up a path parameter by name. Returns a pointer to the value (NOT NUL-terminated) and sets *len, or returns NULL if not found. |
Response
| Function | Signature | Description |
|---|---|---|
xHttpResponseSetStatus | void xHttpResponseSetStatus(xHttpResponseWriter writer, int code) | Set HTTP status code (default 200). |
xHttpResponseSetHeader | xErrno xHttpResponseSetHeader(xHttpResponseWriter writer, const char *key, const char *value) | Add a response header. Call before Send or the first Write. |
xHttpResponseSend | xErrno xHttpResponseSend(xHttpResponseWriter writer, const char *body, size_t body_len) | Send a complete response. May only be called once. Mutually exclusive with Write. |
xHttpResponseWrite | xErrno xHttpResponseWrite(xHttpResponseWriter writer, const char *data, size_t len) | Write data to a streaming response. First call flushes headers (no Content-Length). Mutually exclusive with Send. |
xHttpResponseEnd | void xHttpResponseEnd(xHttpResponseWriter writer) | End a streaming response. Optional — auto-called when the handler returns. |
Configuration
| Function | Signature | Description | Default |
|---|---|---|---|
xHttpServerSetIdleTimeout | xErrno xHttpServerSetIdleTimeout(xHttpServer server, int timeout_ms) | Set idle timeout for connections. | 60000 ms |
xHttpServerSetMaxHeaderSize | xErrno xHttpServerSetMaxHeaderSize(xHttpServer server, size_t max_size) | Set max header size. Exceeding → 431. | 8192 bytes |
xHttpServerSetMaxBodySize | xErrno xHttpServerSetMaxBodySize(xHttpServer server, size_t max_size) | Set max body size. Exceeding → 413. | 1048576 bytes |
All configuration functions must be called before xHttpServerListen() / xHttpServerListenTls().
TLS Configuration
xTlsConf Fields (Server)
| Field | Type | Description |
|---|---|---|
cert | const char * | Path to PEM certificate file (required). |
key | const char * | Path to PEM private key file (required). |
ca | const char * | Path to CA certificate file for client verification (optional). |
skip_verify | int | If non-zero, skip peer verification. Default 0 (verify enabled). |
When ca is set and skip_verify is 0 (default), the server performs mutual TLS (mTLS) — clients must present a valid certificate signed by the specified CA.
Usage Examples
Minimal Server
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSetHeader(w, "Content-Type", "text/plain");
xHttpResponseSend(w, "Hello, World!\n", 14);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /hello", on_hello, NULL);
xHttpServerListen(server, "0.0.0.0", 8080);
printf("Listening on :8080\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
JSON API with POST
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_echo(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)arg;
xHttpResponseSetHeader(w, "Content-Type", "application/json");
xHttpResponseSend(w, req->body, req->body_len);
}
static void on_not_found(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
const char *body = "{\"error\": \"not found\"}";
xHttpResponseSetStatus(w, 404);
xHttpResponseSetHeader(w, "Content-Type", "application/json");
xHttpResponseSend(w, body, strlen(body));
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerSetMaxBodySize(server, 4 * 1024 * 1024); /* 4 MiB */
xHttpServerRoute(server, "POST /echo", on_echo, NULL);
xHttpServerListen(server, NULL, 9090);
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
Server-Sent Events (SSE)
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_events(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSetHeader(w, "Content-Type", "text/event-stream");
xHttpResponseSetHeader(w, "Cache-Control", "no-cache");
xHttpResponseWrite(w, "data: hello\n\n", 13);
xHttpResponseWrite(w, "data: world\n\n", 13);
/* xHttpResponseEnd(w) is optional; auto-called on return */
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /events", on_events, NULL);
xHttpServerListen(server, NULL, 8080);
printf("SSE server on :8080/events\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
RESTful API with Path Parameters
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_get_user(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)arg;
size_t id_len = 0;
    const char *id = xHttpRequestParam(req, "id", &id_len);
    if (!id) { /* defensive: a matched :id route always has the param */
        xHttpResponseSetStatus(w, 500);
        xHttpResponseSend(w, "", 0);
        return;
    }
    char body[128];
int len = snprintf(body, sizeof(body),
"{\"user_id\": \"%.*s\"}\n", (int)id_len, id);
xHttpResponseSetHeader(w, "Content-Type", "application/json");
xHttpResponseSend(w, body, (size_t)len);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /users/:id", on_get_user, NULL);
xHttpServerListen(server, NULL, 8080);
printf("REST API on :8080\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
HTTPS Server
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSetHeader(w, "Content-Type", "text/plain");
xHttpResponseSend(w, "Hello, HTTPS!\n", 14);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /hello", on_hello, NULL);
// TLS configuration
xTlsConf tls = {
.cert = "/path/to/server.pem",
.key = "/path/to/server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
printf("HTTPS server on :8443\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
HTTPS Server with Mutual TLS (mTLS)
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_secure(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSetHeader(w, "Content-Type", "text/plain");
xHttpResponseSend(w, "mTLS verified!\n", 15);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /secure", on_secure, NULL);
// Require client certificates
xTlsConf tls = {
.cert = "/path/to/server.pem",
.key = "/path/to/server-key.pem",
.ca = "/path/to/ca.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
printf("mTLS server on :8443\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
HTTP + HTTPS on Different Ports
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSend(w, "Hello!\n", 7);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /hello", on_hello, NULL);
// Serve HTTP on port 8080
xHttpServerListen(server, "0.0.0.0", 8080);
// Serve HTTPS on port 8443
xTlsConf tls = {
.cert = "/path/to/server.pem",
.key = "/path/to/server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
printf("HTTP on :8080, HTTPS on :8443\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
Multiple Routes with Shared State
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>
typedef struct {
int counter;
} AppState;
static void on_count(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req;
AppState *state = (AppState *)arg;
state->counter++;
char body[64];
int len = snprintf(body, sizeof(body), "{\"count\": %d}\n", state->counter);
xHttpResponseSetHeader(w, "Content-Type", "application/json");
xHttpResponseSend(w, body, (size_t)len);
}
static void on_health(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSend(w, "ok\n", 3);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
AppState state = { .counter = 0 };
xHttpServerRoute(server, "POST /count", on_count, &state);
xHttpServerRoute(server, "GET /health", on_health, NULL);
xHttpServerListen(server, NULL, 8080);
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
Best Practices
- Don't block in handlers. Handlers run on the event loop thread. Blocking delays all other connections.
- Always call xHttpResponseSend() or xHttpResponseWrite(). If the handler returns without sending, a default 200 OK with an empty body is sent automatically — but it's better to be explicit.
- Don't mix Send and Write. xHttpResponseSend() is for one-shot responses; xHttpResponseWrite() is for streaming. They are mutually exclusive — calling one after the other returns xErrno_InvalidState.
- Configure limits before listening. SetIdleTimeout, SetMaxHeaderSize, and SetMaxBodySize must be called before xHttpServerListen() / xHttpServerListenTls().
- Register routes before listening. Routes should be set up before the server starts accepting connections.
- Use xHttpServerListenTls() for HTTPS. Provide valid PEM certificate and key files. For mTLS, set ca (verification is enabled by default).
- Serve HTTP and HTTPS on different ports. Call both xHttpServerListen() and xHttpServerListenTls() on the same server instance to support both protocols simultaneously.
- Destroy the server before the event loop. xHttpServerDestroy() closes all connections and frees all resources.
- Copy data you need to keep. xHttpRequest pointers (url, headers, body) are only valid during the handler callback.
Comparison with Other Libraries
| Feature | xhttp server.h | libuv + http-parser | libmicrohttpd | Go net/http | Node.js http |
|---|---|---|---|---|---|
| I/O Model | Async (event loop) | Async (event loop) | Threaded / select | Goroutines | Async (event loop) |
| Event Loop | xEventLoop integration | libuv | Internal | Go runtime | libuv (V8) |
| HTTP Parser | llhttp (H1) + nghttp2 (H2) | http-parser / llhttp | Internal | Internal | llhttp |
| Streaming Response | Built-in (Write/End) | Manual | Manual | Built-in (Flusher) | Built-in (write/end) |
| Routing | Built-in (first match) | None (manual) | None (manual) | Built-in (ServeMux) | None (manual) |
| Keep-Alive | Automatic | Manual | Automatic | Automatic | Automatic |
| Thread Model | Single-threaded | Single-threaded | Multi-threaded | Multi-goroutine | Single-threaded |
| TLS/HTTPS | Built-in (ListenTLS, mTLS) | Manual (libuv + OpenSSL) | Built-in | Built-in (ListenAndServeTLS) | Built-in (https.createServer) |
| Language | C99 | C | C | Go | JavaScript |
Key Differentiator: xhttp server provides a complete, single-threaded HTTP/1.1 & HTTP/2 server with built-in routing, streaming responses, TLS/HTTPS, and automatic keep-alive — all integrated with xEventLoop. HTTP/1.1 and HTTP/2 coexist on the same port via automatic protocol detection (Prior Knowledge for cleartext, ALPN for TLS). Unlike libuv + http-parser (which requires manual response assembly and TLS integration) or libmicrohttpd (which uses threads), xhttp keeps everything on one thread with zero synchronization overhead. The TLS layer supports mutual TLS (mTLS) with client certificate verification, and the streaming API (xHttpResponseWrite/xHttpResponseEnd) makes it straightforward to implement SSE or chunked streaming without external dependencies.
Relationship with Other Modules
- xbase — Uses xEventLoop for I/O multiplexing, xSocket for non-blocking socket management, and socket timeouts for idle connection detection.
- xbuf — Uses xBuffer for request parsing accumulation (URL, headers, body) and xIOBuffer for read/write buffering with scatter-gather I/O.
- llhttp — External dependency. Provides incremental HTTP/1.1 request parsing via callbacks, isolated behind the xHttpProto vtable in proto_h1.c.
- nghttp2 — External dependency. Provides HTTP/2 frame processing, HPACK header compression, and stream management, isolated behind the xHttpProto vtable in proto_h2.c.
- OpenSSL / Mbed TLS — External dependency (TLS backend, compile-time selection via XK_TLS_BACKEND). Provides TLS handshake, encryption, certificate verification, and ALPN negotiation for xHttpServerListenTls().
ws.h — WebSocket Server
Introduction
ws.h provides a callback-driven WebSocket interface integrated with the xhttp server. For pure WebSocket services, call xWsServe() to create a server in one line. For mixed HTTP + WebSocket endpoints, call xWsUpgrade() inside a regular HTTP handler to perform the RFC 6455 upgrade handshake. The library handles frame codec, ping/pong, fragment reassembly, and close negotiation automatically.
All callbacks are dispatched on the event loop thread — no locks or thread pools required.
Design Philosophy
- Handler-Initiated Upgrade — WebSocket connections start as regular HTTP requests. The user calls xWsUpgrade() inside an xHttpHandlerFunc to perform the upgrade. This keeps routing unified: WebSocket endpoints are just HTTP routes.
- Callback-Driven I/O — Three optional callbacks (on_open, on_message, on_close) cover the full connection lifecycle. The library handles all framing, masking, and control frames internally.
- Automatic Protocol Handling — Ping/pong is answered automatically. Fragmented messages are reassembled before delivery. The close handshake follows RFC 6455 §5.5.1 with a 5-second timeout for the peer's response.
- Connection Hijacking — On successful upgrade, the HTTP connection's socket and transport layer are transferred to a new xWsConn object. The HTTP connection is destroyed; the WebSocket connection takes full ownership of the file descriptor.
- Pluggable Crypto Backend — The handshake requires SHA-1 and Base64 for the Sec-WebSocket-Accept computation. The crypto backend is selected at compile time: OpenSSL, Mbed TLS, or a built-in implementation.
Architecture
graph TD
subgraph "Application"
APP["User Code"]
HANDLER["HTTP Handler"]
WS_CBS["xWsCallbacks"]
end
subgraph "xhttp WebSocket"
UPGRADE["xWsUpgrade()"]
HANDSHAKE["Handshake<br/>(RFC 6455 §4)"]
CRYPTO["SHA-1 + Base64<br/>(pluggable backend)"]
WSCONN["xWsConn"]
PARSER["Frame Parser<br/>(incremental)"]
ENCODER["Frame Encoder"]
FRAG["Fragment<br/>Reassembly"]
CTRL["Control Frames<br/>(Ping/Pong/Close)"]
end
subgraph "xbase"
LOOP["xEventLoop"]
SOCK["xSocket"]
TIMER["Idle Timer"]
end
APP -->|"xHttpServerRoute"| HANDLER
HANDLER -->|"xWsUpgrade(w, req, cbs)"| UPGRADE
UPGRADE --> HANDSHAKE
HANDSHAKE --> CRYPTO
HANDSHAKE -->|"101 Switching Protocols"| WSCONN
WSCONN --> PARSER
WSCONN --> ENCODER
PARSER --> FRAG
PARSER --> CTRL
FRAG -->|"on_message"| WS_CBS
CTRL -->|"auto pong"| ENCODER
WSCONN --> SOCK
SOCK --> LOOP
TIMER --> LOOP
style WSCONN fill:#4a90d9,color:#fff
style LOOP fill:#50b86c,color:#fff
style PARSER fill:#9b59b6,color:#fff
style HANDSHAKE fill:#f5a623,color:#fff
Implementation Details
Upgrade Handshake Flow
sequenceDiagram
participant Client as Browser
participant Handler as HTTP Handler
participant Upgrade as xWsUpgrade()
participant Conn as xHttpConn_
participant WS as xWsConn
Client->>Handler: GET /ws (Upgrade: websocket)
Handler->>Upgrade: xWsUpgrade(w, req, &cbs, arg)
Upgrade->>Upgrade: Validate headers
Note over Upgrade: Method=GET<br/>Upgrade: websocket<br/>Connection: Upgrade<br/>Sec-WebSocket-Version: 13<br/>Sec-WebSocket-Key: ...
Upgrade->>Upgrade: SHA1(Key + GUID) → Base64
Upgrade->>Client: 101 Switching Protocols
Upgrade->>Conn: Hijack socket + transport
Upgrade->>WS: xWsConnCreate()
WS->>Client: on_open callback fires
Connection Lifecycle
stateDiagram-v2
[*] --> Open: xWsUpgrade() succeeds
Open --> Open: Data frames (text/binary)
Open --> Open: Ping → auto Pong
Open --> CloseSent: xWsClose() called
Open --> CloseReceived: Peer sends Close
CloseSent --> Closed: Peer Close received
CloseSent --> Closed: 5s timeout
CloseReceived --> Closed: Echo Close flushed
Open --> Closed: I/O error
Open --> CloseSent: Idle timeout (1001)
Closed --> [*]: on_close + destroy
Frame Processing
When data arrives on the socket, the incremental frame parser (xWsFrameParser) extracts complete frames from the xIOBuffer. Each frame is processed based on its opcode:
| Opcode | Handling |
|---|---|
| Text (0x1) | Deliver via on_message |
| Binary (0x2) | Deliver via on_message |
| Continuation (0x0) | Append to fragment buffer |
| Ping (0x9) | Auto-reply with Pong |
| Pong (0xA) | Ignored |
| Close (0x8) | Close handshake |
Fragment Reassembly
Fragmented messages are reassembled transparently:
- First fragment (FIN=0, opcode=Text/Binary) starts accumulation in frag_buf.
- Continuation frames (opcode=0x0) append to frag_buf.
- Final fragment (FIN=1, opcode=0x0) triggers reassembly and delivers the complete message via on_message.
Protocol violations (e.g., new message mid-fragment) result in a Close frame with status 1002.
Close State Machine
XDEF_ENUM(xWsCloseState){
xWsCloseState_Open, // Normal operating state
xWsCloseState_CloseSent, // We sent Close, waiting for peer
xWsCloseState_CloseReceived, // Peer sent Close, we replied
xWsCloseState_Closed, // Connection fully closed
};
- Server-initiated close: xWsClose() sends a Close frame and transitions to CLOSE_SENT. A 5-second timer waits for the peer's Close response.
- Peer-initiated close: The peer's Close frame is echoed back, transitioning to CLOSE_RECEIVED. After the echo is flushed, on_close fires and the connection is destroyed.
- Idle timeout: After the configured idle period with no data, a Close frame with code 1001 (Going Away) is sent.
Internal File Structure
| File | Role |
|---|---|
ws.h | Public API (types, callbacks, functions) |
ws.c | Connection lifecycle, I/O, frame dispatch |
ws_handshake_server.c | Server upgrade handshake (RFC 6455 §4.2) |
ws_frame.h/c | Frame codec (parse + encode) |
ws_crypto.h | SHA-1 + Base64 interface |
ws_crypto_openssl.c | OpenSSL backend |
ws_crypto_mbedtls.c | Mbed TLS backend |
ws_crypto_builtin.c | Built-in (no TLS dep) |
ws_serve.c | xWsServe() convenience wrapper |
ws_private.h | Internal data structures |
API Reference
Types
| Type | Description |
|---|---|
xWsConn | Opaque WebSocket connection handle |
xWsOpcode | Message type: Text (0x1), Binary (0x2) |
xWsCallbacks | Struct of 3 optional callback pointers |
Callback Signatures
xWsOnOpenFunc
typedef void (*xWsOnOpenFunc)(xWsConn conn, void *arg);
Called when the WebSocket connection is established. conn is valid until on_close returns.
xWsOnMessageFunc
typedef void (*xWsOnMessageFunc)(
xWsConn conn, xWsOpcode opcode,
const void *payload, size_t len,
void *arg);
Called when a complete message is received. Fragmented messages are reassembled before delivery. payload is valid only during the callback.
xWsOnCloseFunc
typedef void (*xWsOnCloseFunc)(
xWsConn conn, uint16_t code,
const char *reason, size_t len,
void *arg);
Called when the connection is closed (clean or abnormal). After this callback returns, conn is invalid.
xWsCallbacks
typedef struct {
xWsOnOpenFunc on_open; // optional
xWsOnMessageFunc on_message; // optional
xWsOnCloseFunc on_close; // optional
} xWsCallbacks;
Functions
| Function | Description |
|---|---|
xWsServe | One-call WebSocket-only server |
xWsUpgrade | Upgrade HTTP → WebSocket |
xWsSend | Send a text or binary message |
xWsClose | Initiate graceful close |
xWsServe
xHttpServer xWsServe(
xEventLoop loop,
const char *host,
uint16_t port,
const xWsCallbacks *callbacks,
void *arg);
Convenience function that creates an HTTP server, registers a catch-all route that upgrades every incoming request to WebSocket, and starts listening. Returns the server handle for later cleanup via xHttpServerDestroy(), or NULL on failure.
Parameters:
- loop — Event loop (must not be NULL).
- host — Bind address (e.g. "0.0.0.0"), or NULL.
- port — Port number to listen on.
- callbacks — WebSocket event callbacks (must not be NULL).
- arg — User argument forwarded to all callbacks.
Returns: Server handle, or NULL on failure.
xWsUpgrade
xErrno xWsUpgrade(
xHttpResponseWriter writer,
const xHttpRequest *req,
const xWsCallbacks *callbacks,
void *arg);
Call inside an xHttpHandlerFunc to upgrade the HTTP connection to WebSocket. On success, the handler must return immediately — the HTTP connection has been hijacked.
On failure (bad headers, wrong method), an HTTP error response (400/405) is sent automatically and a non-Ok error code is returned.
Parameters:
- writer — Response writer from the handler.
- req — HTTP request from the handler.
- callbacks — WebSocket event callbacks (must not be NULL).
- arg — User argument forwarded to all callbacks.
Returns: xErrno_Ok on success.
xWsSend
xErrno xWsSend(
xWsConn conn, xWsOpcode opcode,
const void *payload, size_t len);
Send a message over the WebSocket connection. The payload is framed and queued for asynchronous transmission.
Parameters:
- conn — WebSocket connection handle.
- opcode — xWsOpcode_Text or xWsOpcode_Binary.
- payload — Message data.
- len — Payload length in bytes.
Returns: xErrno_Ok on success, xErrno_InvalidState if the connection is closing.
xWsClose
xErrno xWsClose(xWsConn conn, uint16_t code);
Initiate a graceful close. Sends a Close frame with the given status code. The connection remains open until the peer responds or a 5-second timeout expires.
Parameters:
- conn — WebSocket connection handle.
- code — Close status code (e.g., 1000 for normal).
Returns: xErrno_Ok on success.
Close Status Codes
| Code | Constant | Meaning |
|---|---|---|
| 1000 | XWS_CLOSE_NORMAL | Normal closure |
| 1001 | XWS_CLOSE_GOING_AWAY | Server shutting down |
| 1002 | XWS_CLOSE_PROTOCOL_ERR | Protocol error |
| 1003 | XWS_CLOSE_UNSUPPORTED | Unsupported data |
| 1005 | XWS_CLOSE_NO_STATUS | No status received |
| 1006 | XWS_CLOSE_ABNORMAL | Abnormal closure |
Usage Examples
Echo Server (with xWsServe)
#include <xbase/event.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>
static void on_open(xWsConn conn, void *arg) {
(void)arg;
const char *hi = "Welcome!";
xWsSend(conn, xWsOpcode_Text, hi, strlen(hi));
}
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
(void)arg;
xWsSend(conn, op, data, len);
}
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
(void)conn; (void)reason; (void)len; (void)arg;
printf("closed: %u\n", code);
}
static const xWsCallbacks ws_cbs = {
.on_open = on_open,
.on_message = on_message,
.on_close = on_close,
};
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer srv = xWsServe(loop, "0.0.0.0", 8080, &ws_cbs, NULL);
if (!srv) return 1;
printf("ws://localhost:8080/\n");
xEventLoopRun(loop);
xHttpServerDestroy(srv);
xEventLoopDestroy(loop);
return 0;
}
Echo Server (with xWsUpgrade)
#include <xbase/event.h>
#include <xhttp/server.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>
static const xWsCallbacks ws_cbs = { ... };
static void ws_handler(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)arg;
xWsUpgrade(w, req, &ws_cbs, NULL);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer srv = xHttpServerCreate(loop);
xHttpServerRoute(srv, "GET /ws", ws_handler, NULL);
xHttpServerListen(srv, "0.0.0.0", 8080);
printf("ws://localhost:8080/ws\n");
xEventLoopRun(loop);
xHttpServerDestroy(srv);
xEventLoopDestroy(loop);
return 0;
}
Per-Connection User Data
#include <stdio.h>
#include <stdlib.h>
#include <xhttp/ws.h>
typedef struct {
    char username[64];
    int msg_count;
} Session;
static void on_open(xWsConn conn, void *arg) {
    Session *s = (Session *)arg;
    snprintf(s->username, sizeof(s->username), "user_%p", (void *)conn);
    s->msg_count = 0;
}
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    Session *s = (Session *)arg;
    s->msg_count++;
    printf("[%s] msg #%d: %.*s\n", s->username, s->msg_count, (int)len, (const char *)data);
    xWsSend(conn, op, data, len);
}
static void on_close_free_session(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
    (void)conn; (void)code; (void)reason; (void)len;
    free(arg); /* release the per-connection Session */
}
static void ws_handler(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)arg;
    Session *s = calloc(1, sizeof(Session));
    if (!s) return; /* allocation failure: default response is sent */
    xWsCallbacks cbs = {
        .on_open = on_open,
        .on_message = on_message,
        .on_close = on_close_free_session,
    };
    if (xWsUpgrade(w, req, &cbs, s) != xErrno_Ok)
        free(s); /* upgrade failed; the connection stays HTTP */
}
Graceful Server-Initiated Close
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    (void)arg;
if (len == 4 && memcmp(data, "quit", 4) == 0) {
xWsClose(conn, 1000); // normal close
return;
}
xWsSend(conn, op, data, len);
}
JavaScript Client
<script>
const ws = new WebSocket('ws://localhost:8080/ws');
ws.onopen = () => {
  console.log('connected');
  ws.send('Hello, server!'); // send only after the connection is open
};
ws.onmessage = (e) => console.log('< ' + e.data);
ws.onclose = (e) => console.log('closed: ' + e.code);
</script>
Best Practices
- Return immediately after xWsUpgrade(). On success, the HTTP connection is hijacked. Do not call any xHttpResponse* functions afterward.
- Don't block in callbacks. All callbacks run on the event loop thread. Blocking delays all other I/O.
- Copy payload if needed. The payload pointer in on_message is valid only during the callback. Copy the data if you need it later.
- Use xWsClose() for graceful shutdown. Avoid dropping connections without a Close handshake.
- Handle on_close for cleanup. Free per-connection resources in on_close, as the xWsConn handle becomes invalid after the callback returns.
- Idle timeout is inherited. The WebSocket connection inherits the HTTP server's idle_timeout_ms setting. Adjust it via xHttpServerSetIdleTimeout() if needed.
Comparison with Other Libraries
| Feature | xhttp WS | libwebsockets | uWebSockets |
|---|---|---|---|
| Integration | xEventLoop | Own loop | Own loop |
| Upgrade | In HTTP handler | Separate | Separate |
| Fragment reassembly | Automatic | Automatic | Automatic |
| Ping/Pong | Automatic | Automatic | Automatic |
| Close handshake | RFC 6455 | RFC 6455 | RFC 6455 |
| TLS | Via xhttp | Built-in | Built-in |
| Language | C99 | C | C++ |
| Dependencies | xbase only | OpenSSL | None |
Key Differentiator: xhttp's WebSocket server is unique in its handler-initiated upgrade pattern. Instead of a separate WebSocket server, you register a normal HTTP route and call xWsUpgrade() inside the handler. This keeps routing, middleware, and mixed HTTP+WS endpoints unified under a single server instance.
ws.h — WebSocket Client
Introduction
ws.h provides xWsConnect(), an asynchronous WebSocket client that integrates with xbase's event loop. The entire connection process — DNS resolution, TCP connect, optional TLS handshake, and HTTP Upgrade — runs fully asynchronously. Once connected, the same callback-driven model (on_open, on_message, on_close) and the same xWsConn handle are used for both client and server connections.
Design Philosophy
- Fully Asynchronous Connection — xWsConnect() returns immediately. The multi-phase connection process (DNS → TCP → TLS → HTTP Upgrade) is driven entirely by the event loop. No threads or blocking calls.
- Shared Connection Model — Once the handshake completes, a client xWsConn is identical to a server xWsConn. The same xWsSend(), xWsClose(), and callback interfaces apply. Code that operates on xWsConn doesn't need to know which side initiated the connection.
- Failure via on_close — If the connection fails at any stage (DNS, TCP, TLS, or HTTP Upgrade), on_close is invoked with an error code. on_open is never called for failed connections. This simplifies error handling: cleanup always happens in one place.
- Client-Side Masking — Per RFC 6455, client-to-server frames must be masked. The library handles this automatically when the connection is created in client mode.
Architecture
graph TD
subgraph "Application"
APP["User Code"]
CBS["xWsCallbacks"]
CONF["xWsConnectConf"]
end
subgraph "xWsConnect State Machine"
CONNECT["xWsConnect()"]
DNS["DNS Resolution"]
TCP["TCP Connect"]
TLS["TLS Handshake<br/>(wss:// only)"]
UPGRADE["HTTP Upgrade<br/>Request/Response"]
VALIDATE["Validate 101<br/>+ Sec-WebSocket-Accept"]
end
subgraph "Established Connection"
WSCONN["xWsConn<br/>(client mode)"]
SEND["xWsSend()"]
CLOSE["xWsClose()"]
end
subgraph "xbase"
LOOP["xEventLoop"]
SOCK["xSocket"]
TIMER["Timeout Timer"]
end
APP --> CONF
APP --> CBS
CONF --> CONNECT
CBS --> CONNECT
CONNECT --> DNS
DNS --> TCP
TCP --> TLS
TLS --> UPGRADE
UPGRADE --> VALIDATE
VALIDATE -->|"Success"| WSCONN
VALIDATE -->|"Failure"| CBS
WSCONN --> SEND
WSCONN --> CLOSE
WSCONN --> SOCK
SOCK --> LOOP
TIMER --> LOOP
style WSCONN fill:#4a90d9,color:#fff
style LOOP fill:#50b86c,color:#fff
style CONNECT fill:#f5a623,color:#fff
style VALIDATE fill:#9b59b6,color:#fff
Implementation Details
Connection State Machine
The xWsConnector drives the connection through five phases, all on the event loop thread:
stateDiagram-v2
[*] --> DNS: xWsConnect() called
DNS --> TCP_CONNECT: Address resolved
TCP_CONNECT --> TLS_HANDSHAKE: Connected [wss]
TCP_CONNECT --> HTTP_UPGRADE_WRITE: Connected [ws]
TLS_HANDSHAKE --> HTTP_UPGRADE_WRITE: Handshake complete
HTTP_UPGRADE_WRITE --> HTTP_UPGRADE_READ: Request sent
HTTP_UPGRADE_READ --> DONE: 101 validated
DONE --> [*]: on_open fires
DNS --> [*]: Failure → on_close
TCP_CONNECT --> [*]: Failure → on_close
TLS_HANDSHAKE --> [*]: Failure → on_close
HTTP_UPGRADE_READ --> [*]: Bad response → on_close
DNS --> [*]: Timeout → on_close
TCP_CONNECT --> [*]: Timeout → on_close
Phase Details
| Phase | What Happens |
|---|---|
| DNS | xDnsResolve() resolves the hostname asynchronously. On success, proceeds to TCP. |
| TCP Connect | Creates an xSocket, calls connect(). Waits for the writable event (EINPROGRESS). |
| TLS Handshake | For wss:// URLs only. Initializes the TLS transport and drives the handshake via read/write events. |
| HTTP Upgrade Write | Builds the Upgrade request (with random Sec-WebSocket-Key) and flushes it to the server. |
| HTTP Upgrade Read | Reads the server's response, validates HTTP/1.1 101, Upgrade: websocket, Connection: Upgrade, and Sec-WebSocket-Accept. |
Handshake Flow
sequenceDiagram
participant App as Application
participant Conn as xWsConnector
participant DNS as xDnsResolve
participant Server as Remote Server
App->>Conn: xWsConnect(loop, conf, cbs, arg)
Conn->>DNS: Resolve hostname
DNS-->>Conn: Address resolved
Conn->>Server: TCP connect()
Server-->>Conn: Connected
Note over Conn,Server: (wss:// only) TLS handshake
Conn->>Server: GET /path HTTP/1.1<br/>Upgrade: websocket<br/>Sec-WebSocket-Key: ...
Server-->>Conn: HTTP/1.1 101 Switching Protocols<br/>Sec-WebSocket-Accept: ...
Conn->>Conn: Validate response
Conn->>App: on_open(conn, arg)
Timeout Handling
A configurable timeout (default 10 seconds) covers the entire connection process. If any phase takes too long, the timer fires, the connector is destroyed, and on_close is invoked with code 1006 (Abnormal Closure).
Internal File Structure
| File | Role |
|---|---|
| ws.h | Public API (xWsConnect, xWsConnectConf) |
| ws_connect.c | Async connection state machine |
| ws_handshake_client.h/c | Build Upgrade request, validate 101 response |
| ws_crypto.h | SHA-1 + Base64 for Sec-WebSocket-Accept |
| transport_tls_client.h | TLS client transport init (shared xTlsCtx → per-connection SSL) |
| transport_tls_client_openssl.c | OpenSSL client transport implementation |
| transport_tls_client_mbedtls.c | mbedTLS client transport implementation |
API Reference
Types
| Type | Description |
|---|---|
| xWsConn | Opaque WebSocket connection handle (shared with server) |
| xWsOpcode | Message type: Text (0x1), Binary (0x2) |
| xWsCallbacks | Struct of 3 optional callback pointers (shared with server) |
| xWsConnectConf | Configuration for xWsConnect() |
xWsConnectConf
struct xWsConnectConf {
const char *url; // ws:// or wss:// URL (required)
const xTlsConf *tls; // TLS config for wss:// (NULL = defaults)
xTlsCtx tls_ctx; // Pre-created shared TLS context (priority over tls)
const char *headers; // Extra HTTP headers (NULL = none)
int timeout_ms; // Connect timeout (0 = 10000 ms)
};
| Field | Description |
|---|---|
| url | WebSocket URL. Must start with ws:// or wss://. Required. |
| tls | TLS configuration for wss:// connections. NULL uses the system CA with verification enabled. Ignored for ws:// and when tls_ctx is set. |
| tls_ctx | Pre-created shared TLS context from xTlsCtxCreate(). Takes priority over tls. The caller retains ownership and must keep it alive for the lifetime of the connection. NULL = create from tls (or use defaults). |
| headers | Extra HTTP headers appended to the Upgrade request. Format: "Key: Value\r\nKey2: Value2\r\n". NULL for none. |
| timeout_ms | Timeout for the entire connection process in milliseconds. 0 uses the default (10000 ms). |
Callbacks
The same xWsCallbacks struct is used for both client and server connections. See WebSocket Server for callback signature details.
Client-specific behavior:
- `on_open` — Called when the connection is fully established (101 validated). Not called on failure.
- `on_close` — Called on connection failure (DNS, TCP, TLS, or Upgrade error) or after a normal close. For failed connections, `conn` is `NULL`.
Functions
xWsConnect
xErrno xWsConnect(
xEventLoop loop,
const xWsConnectConf *conf,
const xWsCallbacks *callbacks,
void *arg);
Initiate an asynchronous WebSocket client connection. Returns immediately; the connection process runs on the event loop.
Parameters:
- `loop` — Event loop (must not be NULL).
- `conf` — Connection configuration (must not be NULL; `conf->url` required).
- `callbacks` — WebSocket event callbacks (must not be NULL).
- `arg` — User argument forwarded to all callbacks.
Returns: xErrno_Ok if the async connection started, xErrno_InvalidArg for bad parameters (NULL pointers, invalid URL scheme).
xWsSend
xErrno xWsSend(
xWsConn conn, xWsOpcode opcode,
const void *payload, size_t len);
Send a message. Identical to the server-side API. Client frames are automatically masked per RFC 6455.
xWsClose
xErrno xWsClose(xWsConn conn, uint16_t code);
Initiate a graceful close. Identical to the server-side API.
Usage Examples
Connect and Echo
#include <xbase/event.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>
static void on_open(xWsConn conn, void *arg) {
(void)arg;
const char *msg = "Hello, server!";
xWsSend(conn, xWsOpcode_Text, msg, strlen(msg));
}
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
(void)op; (void)arg;
printf("Received: %.*s\n", (int)len, (const char *)data);
xWsClose(conn, 1000);
}
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
(void)conn; (void)reason; (void)len; (void)arg;
printf("Closed: %u\n", code);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xWsConnectConf conf = {0};
conf.url = "ws://localhost:8080/ws";
xWsCallbacks cbs = {
.on_open = on_open,
.on_message = on_message,
.on_close = on_close,
};
xWsConnect(loop, &conf, &cbs, NULL);
xEventLoopRun(loop);
xEventLoopDestroy(loop);
return 0;
}
Secure Connection (wss://)
#include <xbase/event.h>
#include <xhttp/ws.h>
#include <xnet/tls.h>
static void on_open(xWsConn conn, void *arg) { /* ... */ }
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) { /* ... */ }
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) { /* ... */ }
int main(void) {
xEventLoop loop = xEventLoopCreate();
// Skip certificate verification (dev only)
xTlsConf tls = {0};
tls.skip_verify = 1;
xWsConnectConf conf = {0};
conf.url = "wss://echo.example.com/ws";
conf.tls = &tls;
conf.timeout_ms = 5000;
xWsCallbacks cbs = {
.on_open = on_open,
.on_message = on_message,
.on_close = on_close,
};
xWsConnect(loop, &conf, &cbs, NULL);
xEventLoopRun(loop);
xEventLoopDestroy(loop);
return 0;
}
Shared TLS Context (Multiple Connections)
When creating many wss:// connections (e.g. reconnect loops or connection pools), use a shared xTlsCtx to avoid reloading certificates on every connection:
#include <xbase/event.h>
#include <xhttp/ws.h>
#include <xnet/tls.h>
static void on_open(xWsConn conn, void *arg) { /* ... */ }
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) { /* ... */ }
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) { /* ... */ }
int main(void) {
xEventLoop loop = xEventLoopCreate();
// Create a shared TLS context once
xTlsConf tls = {0};
tls.ca = "ca.pem";
xTlsCtx ctx = xTlsCtxCreate(&tls);
// All connections share the same ctx
xWsConnectConf conf = {0};
conf.url = "wss://echo.example.com/ws";
conf.tls_ctx = ctx; // shared, not copied
xWsCallbacks cbs = {
.on_open = on_open,
.on_message = on_message,
.on_close = on_close,
};
xWsConnect(loop, &conf, &cbs, NULL);
xEventLoopRun(loop);
// Destroy ctx after all connections are closed
xTlsCtxDestroy(ctx);
xEventLoopDestroy(loop);
return 0;
}
Custom Headers (Authentication)
xWsConnectConf conf = {0};
conf.url = "ws://api.example.com/stream";
conf.headers = "Authorization: Bearer token123\r\n"
"X-Client-Version: 1.0\r\n";
xWsConnect(loop, &conf, &cbs, NULL);
Connection Failure Handling
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
if (conn == NULL) {
// Connection failed before establishing WebSocket
printf("Connection failed (code %u)\n", code);
// Optionally retry after a delay
return;
}
// Normal close after successful connection
printf("Disconnected: %u\n", code);
}
Binary Data
static void on_open(xWsConn conn, void *arg) {
uint8_t data[] = {0x00, 0x01, 0x02, 0xFF, 0xFE};
xWsSend(conn, xWsOpcode_Binary, data, sizeof(data));
}
Best Practices
- Check the return value of `xWsConnect()`. It returns `xErrno_InvalidArg` for obviously bad parameters (NULL pointers, unsupported URL scheme). Network errors are reported asynchronously via `on_close`.
- Handle `conn == NULL` in `on_close`. This indicates a connection failure before the WebSocket was established. Use this to implement retry logic.
- Don't block in callbacks. All callbacks run on the event loop thread.
- Copy the payload if needed. The `payload` pointer in `on_message` is valid only during the callback.
- Use `xWsClose()` for graceful shutdown. The client sends a Close frame and waits for the server's response.
- Set a reasonable timeout. The default 10-second timeout covers DNS + TCP + TLS + Upgrade. Adjust via `conf.timeout_ms` for high-latency networks.
- Never use `skip_verify` in production. It disables all certificate validation. Use a proper CA path or system CA bundle instead.
Comparison with Other Libraries
| Feature | xhttp WS Client | libwebsockets | wslay | civetweb |
|---|---|---|---|---|
| I/O Model | Async (event loop) | Async (own loop) | Sync (user drives) | Threaded |
| Event Loop | xEventLoop | Own loop | None | pthreads |
| DNS | Async (xDnsResolve) | Async (built-in) | Manual | Blocking |
| TLS | Via xnet | Built-in | Manual | Built-in |
| Client Masking | Automatic | Automatic | Automatic | Automatic |
| Connection Timeout | Configurable | Configurable | Manual | Configurable |
| Language | C99 | C | C | C |
| Dependencies | xbase + xnet | OpenSSL | None | None |
Key Differentiator: xhttp's WebSocket client runs entirely on the xbase event loop with zero blocking calls. The multi-phase connection (DNS → TCP → TLS → Upgrade) is a single async state machine. Combined with the shared xWsConn model, client and server code use identical APIs for sending, receiving, and closing — making bidirectional WebSocket applications straightforward.
TLS Context Sharing: For wss:// connections, the client supports a shared xTlsCtx (via conf.tls_ctx) that avoids reloading certificates and re-creating the SSL context on every connection. This is the same pattern used by xTcpConnect and xTcpListener, providing consistent TLS context management across all xKit networking APIs.
sse.c — SSE Stream Client
Introduction
sse.c implements Server-Sent Events (SSE) support for xHttpClient. It provides xHttpClientGetSse() and xHttpClientDoSse() which subscribe to SSE endpoints and parse the event stream according to the W3C SSE specification. Each parsed event is delivered to a callback as it arrives, enabling real-time streaming — ideal for LLM API integration.
Design Philosophy
- W3C Spec Compliance — The parser follows the W3C Server-Sent Events specification: field parsing (`event`, `data`, `id`, `retry`), comment handling, multi-line data joining with `\n`, and the default event type `"message"`.
- Streaming Parse — Data is parsed incrementally as it arrives from libcurl's write callback. Complete lines are processed immediately; incomplete lines are buffered until more data arrives.
- Shared Infrastructure — SSE requests reuse the same `curl_multi` handle and event loop integration as regular HTTP requests. The `xHttpReqVtable` mechanism allows SSE to plug in its own write callback and completion handler.
- User-Controlled Cancellation — The `xSseEventFunc` callback returns an `int`: 0 to continue, non-zero to close the connection. This gives the user fine-grained control over when to stop streaming.
Architecture
graph TD
subgraph "SSE Request Flow"
SUBMIT["xHttpClientDoSse()"]
EASY["curl_easy + SSE headers"]
WRITE["sse_write_callback"]
PARSER["xSseParser_"]
EVENT["on_event(ev)"]
DONE["on_done(curl_code)"]
end
subgraph "Shared with Oneshot"
MULTI["curl_multi"]
LOOP["xEventLoop"]
CHECK["check_multi_info()"]
end
SUBMIT --> EASY
EASY --> MULTI
MULTI --> LOOP
LOOP -->|"fd ready"| WRITE
WRITE --> PARSER
PARSER -->|"event boundary"| EVENT
CHECK -->|"transfer done"| DONE
style PARSER fill:#4a90d9,color:#fff
style EVENT fill:#50b86c,color:#fff
Implementation Details
SSE Parser State Machine
stateDiagram-v2
[*] --> Buffering: Data arrives from curl
Buffering --> ParseLine: Complete line found (\\n or \\r\\n)
ParseLine --> FieldParse: Non-empty line
ParseLine --> DispatchEvent: Empty line (event boundary)
FieldParse --> Buffering: Continue parsing
DispatchEvent --> CallUser: data field exists
DispatchEvent --> Buffering: No data (skip)
CallUser --> Buffering: User returns 0 (continue)
CallUser --> [*]: User returns non-zero (close)
SSE Field Parsing
Each non-empty line is parsed as a field:
| Line Format | Field | Value |
|---|---|---|
| :comment | (ignored) | — |
| event:type | event_type | "type" |
| data:payload | data | "payload" (accumulated with \n) |
| id:123 | id | "123" (persists across events) |
| retry:5000 | retry | 5000 (ms, must be all digits) |
| unknown:foo | (ignored) | — |
Multi-line data: Multiple data: lines are joined with \n:
data:line1
data:line2
data:line3
→ ev.data = "line1\nline2\nline3"
Parser Internal Structure
struct xSseParser_ {
xBuffer buf; // Raw incoming data buffer
size_t pos; // Parse position within buf
int error; // Allocation failure flag
char *event_type; // Current event type (NULL = "message")
char *data; // Accumulated data lines
char *id; // Last event ID (persists across events)
int retry; // Retry delay in ms (-1 = not set)
};
Data Flow
sequenceDiagram
participant Server as SSE Server
participant Curl as libcurl
participant Writer as sse_write_callback
participant Parser as xSseParser_
participant User as User Callback
Server->>Curl: HTTP 200 text/event-stream
loop For each chunk
Curl->>Writer: sse_write_callback(chunk)
Writer->>Parser: sse_parser_feed(chunk)
Parser->>Parser: Buffer + parse lines
alt Empty line (event boundary)
Parser->>User: on_event(ev)
alt User returns 0
User->>Parser: Continue
else User returns non-zero
User->>Writer: Close connection
Writer->>Curl: Return 0 (abort)
end
end
end
Curl->>User: on_done(curl_code)
SSE Request Structure
struct xSseReq_ {
struct xHttpReq_ base; // Base request (shared with oneshot)
xSseEventFunc on_event; // Per-event callback
xSseDoneFunc on_done; // Stream-end callback
struct xSseParser_ parser; // SSE parser state
struct curl_slist *sse_headers; // Accept: text/event-stream + user headers
};
The SSE request uses a dedicated vtable:
- `sse_on_done` — Invokes the user's `on_done` callback.
- `sse_on_cleanup` — Frees SSE-specific resources (parser, headers).
Automatic Headers
xHttpClientDoSse() automatically adds:
- `Accept: text/event-stream`
- `Cache-Control: no-cache`
User-provided headers are merged after these defaults.
API Reference
Types
| Type | Description |
|---|---|
| xSseEvent | SSE event: event (type), data, id, retry |
| xSseEventFunc | int (*)(const xSseEvent *ev, void *arg) — return 0 to continue, non-zero to close |
| xSseDoneFunc | void (*)(int curl_code, void *arg) — called when stream ends |
xSseEvent Fields
| Field | Type | Description |
|---|---|---|
| event | const char * | Event type. "message" if omitted by server. |
| data | const char * | Event data. Multi-line data joined by \n. |
| id | const char * | Last event ID, or NULL. |
| retry | int | Retry delay in ms, or -1 if not set. |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientGetSse | xErrno xHttpClientGetSse(xHttpClient client, const char *url, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Subscribe to SSE endpoint (GET). | Not thread-safe |
| xHttpClientDoSse | xErrno xHttpClientDoSse(xHttpClient client, const xHttpRequestConf *config, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Fully-configured SSE request. | Not thread-safe |
Usage Examples
Simple SSE Subscription
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>
static int on_event(const xSseEvent *ev, void *arg) {
(void)arg;
printf("[%s] %s\n", ev->event, ev->data);
return 0; // Continue receiving
}
static void on_done(int curl_code, void *arg) {
(void)arg;
printf("Stream ended (code=%d)\n", curl_code);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpClient client = xHttpClientCreate(loop, NULL);
xHttpClientGetSse(client, "https://example.com/events",
on_event, on_done, NULL);
xEventLoopRun(loop);
xHttpClientDestroy(client);
xEventLoopDestroy(loop);
return 0;
}
LLM API Streaming (OpenAI-Compatible)
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/client.h>
static int on_event(const xSseEvent *ev, void *arg) {
(void)arg;
// OpenAI sends "[DONE]" as the final data
if (strcmp(ev->data, "[DONE]") == 0) {
printf("\n--- Stream complete ---\n");
return 1; // Close connection
}
// Parse JSON and extract content delta...
printf("%s", ev->data);
fflush(stdout);
return 0;
}
static void on_done(int curl_code, void *arg) {
(void)arg;
if (curl_code != 0)
printf("\nStream error (code=%d)\n", curl_code);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpClient client = xHttpClientCreate(loop, NULL);
const char *body =
"{"
" \"model\": \"gpt-4\","
" \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}],"
" \"stream\": true"
"}";
const char *headers[] = {
"Content-Type: application/json",
"Authorization: Bearer sk-your-api-key",
NULL
};
xHttpRequestConf config = {
.url = "https://api.openai.com/v1/chat/completions",
.method = xHttpMethod_POST,
.body = body,
.body_len = strlen(body),
.headers = headers,
.timeout_ms = 60000, // 60s timeout for streaming
};
xHttpClientDoSse(client, &config, on_event, on_done, NULL);
xEventLoopRun(loop);
xHttpClientDestroy(client);
xEventLoopDestroy(loop);
return 0;
}
Early Cancellation
static int on_event(const xSseEvent *ev, void *arg) {
int *count = (int *)arg;
(*count)++;
printf("Event #%d: %s\n", *count, ev->data);
// Stop after 10 events
if (*count >= 10) {
printf("Received enough events, closing.\n");
return 1; // Non-zero = close connection
}
return 0;
}
Use Cases
- LLM API Integration — Stream responses from OpenAI, Anthropic, Google Gemini, or any OpenAI-compatible API. Use `xHttpClientDoSse()` with a POST method and JSON body.
- Real-Time Notifications — Subscribe to server push notifications (chat messages, stock prices, IoT sensor data) via SSE endpoints.
- Log Streaming — Tail remote log streams delivered as SSE events.
Best Practices
- Use `xHttpClientDoSse()` for LLM APIs. Most LLM APIs require POST with a JSON body and custom headers. `GetSse` is only for simple GET endpoints.
- Handle `[DONE]` signals. Many LLM APIs send a special `[DONE]` data payload to signal the end of the stream. Return non-zero from `on_event` to close cleanly.
- Set appropriate timeouts. Streaming responses can take a long time. Set `timeout_ms` high enough (e.g., 60000 ms) to avoid premature timeouts.
- Don't block in `on_event`. The callback runs on the event loop thread; blocking delays all other I/O.
- Copy event data if needed. `xSseEvent` pointers are valid only during the callback.
Comparison with Other Libraries
| Feature | xhttp SSE | eventsource (JS) | sseclient-py | libcurl (manual) |
|---|---|---|---|---|
| Spec Compliance | W3C SSE | W3C SSE | W3C SSE | Manual parsing |
| Integration | xEventLoop (async) | Browser event loop | Blocking iterator | Manual |
| POST Support | Yes (DoSse) | No (GET only) | No (GET only) | Manual |
| Cancellation | Callback return value | close() | Break loop | curl_easy_pause |
| Multi-line Data | Auto-joined with \n | Auto-joined | Auto-joined | Manual |
| Language | C99 | JavaScript | Python | C |
Key Differentiator: xhttp's SSE implementation is unique in supporting POST-based SSE (via xHttpClientDoSse), which is essential for LLM API integration. Most SSE libraries only support GET. The incremental parser integrates seamlessly with the event loop, delivering events as they arrive without buffering the entire stream.
TLS Deployment Guide
This guide covers end-to-end TLS deployment for xhttp, including certificate generation, server and client configuration, and mutual TLS (mTLS). For API reference, see server.md and client.md.
Prerequisites
- OpenSSL CLI — Used for certificate generation (the `openssl` command).
- TLS backend compiled — xKit must be built with `XK_TLS_BACKEND=openssl` (or `mbedtls`). Without a TLS backend, `xHttpServerListenTls()` returns `xErrno_NotSupported`.
Check your build:
# If XK_HAS_OPENSSL is defined, TLS is available
grep -r "XK_HAS_OPENSSL" xhttp/
Certificate Generation
Self-Signed Certificate (Development)
For quick local development and testing:
openssl req -x509 -newkey rsa:2048 \
-keyout server-key.pem \
-out server.pem \
-days 365 -nodes \
-subj '/CN=localhost'
This produces:
- server.pem — Self-signed certificate
- server-key.pem — Unencrypted private key

Note: Self-signed certificates are not trusted by default. Clients must either set `skip_verify = 1` or provide the certificate as a CA via `ca`.
CA-Signed Certificates (Production / mTLS)
For mutual TLS or production-like setups, create a private CA and sign both server and client certificates.
Step 1: Create a CA
# Generate CA private key and self-signed certificate
openssl req -x509 -newkey rsa:2048 \
-keyout ca-key.pem \
-out ca.pem \
-days 365 -nodes \
-subj '/CN=MyCA'
Step 2: Generate Server Certificate
# Generate server key + CSR
openssl req -newkey rsa:2048 \
-keyout server-key.pem \
-out server.csr \
-nodes \
-subj '/CN=localhost'
# Sign with CA
openssl x509 -req \
-in server.csr \
-CA ca.pem -CAkey ca-key.pem -CAcreateserial \
-out server.pem \
-days 365
# Clean up CSR
rm server.csr
Step 3: Generate Client Certificate (for mTLS)
# Generate client key + CSR
openssl req -newkey rsa:2048 \
-keyout client-key.pem \
-out client.csr \
-nodes \
-subj '/CN=MyClient'
# Sign with the same CA
openssl x509 -req \
-in client.csr \
-CA ca.pem -CAkey ca-key.pem -CAcreateserial \
-out client.pem \
-days 365
# Clean up CSR
rm client.csr
After these steps you have:
| File | Description |
|---|---|
| ca.pem | CA certificate (trusted by both sides) |
| ca-key.pem | CA private key (keep secure, not deployed) |
| server.pem | Server certificate (signed by CA) |
| server-key.pem | Server private key |
| client.pem | Client certificate (signed by CA) |
| client-key.pem | Client private key |
Deployment Scenarios
1. One-Way TLS (Server Authentication Only)
The most common setup: the client verifies the server's identity, but the server does not verify the client.
sequenceDiagram
participant Client
participant Server
Client->>Server: TLS ClientHello
Server->>Client: Certificate (server.pem)
Client->>Client: Verify server cert against CA
Client->>Server: Finished
Server->>Client: Finished
Note over Client,Server: Encrypted HTTP traffic
Server:
xTlsConf tls = {
.cert = "server.pem",
.key = "server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
Client (with CA verification):
xTlsConf tls = {0};
tls.ca = "ca.pem";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
xHttpClientCreate(loop, &conf);
xHttpClientGet(
client,
"https://localhost:8443/hello",
on_response, NULL);
Client (skip verification — development only):
xTlsConf tls = {0};
tls.skip_verify = 1;
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
xHttpClientCreate(loop, &conf);
2. Mutual TLS (mTLS)
Both sides authenticate each other. The server requires a valid client certificate signed by a trusted CA.
sequenceDiagram
participant Client
participant Server
Client->>Server: TLS ClientHello
Server->>Client: Certificate (server.pem) + CertificateRequest
Client->>Client: Verify server cert against CA
Client->>Server: Certificate (client.pem)
Server->>Server: Verify client cert against CA
Client->>Server: Finished
Server->>Client: Finished
Note over Client,Server: Mutually authenticated encrypted traffic
Server:
xTlsConf tls = {
.cert = "server.pem",
.key = "server-key.pem",
.ca = "ca.pem", // CA to verify client certs
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
Client:
xTlsConf tls = {0};
tls.ca = "ca.pem";
tls.cert = "client.pem";
tls.key = "client-key.pem";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
xHttpClientCreate(loop, &conf);
xHttpClientGet(
client,
"https://localhost:8443/secure",
on_response, NULL);
3. HTTP + HTTPS on Different Ports
A single xHttpServer can serve both cleartext HTTP and HTTPS simultaneously:
// HTTP on port 8080
xHttpServerListen(server, "0.0.0.0", 8080);
// HTTPS on port 8443
xTlsConf tls = {
.cert = "server.pem",
.key = "server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
Routes are shared — the same handlers serve both HTTP and HTTPS traffic.
Complete End-to-End Example
A full working example: CA-signed mTLS with server and client.
Generate Certificates
#!/bin/bash
set -e
# CA
openssl req -x509 -newkey rsa:2048 \
-keyout ca-key.pem -out ca.pem \
-days 365 -nodes -subj '/CN=TestCA'
# Server
openssl req -newkey rsa:2048 \
-keyout server-key.pem -out server.csr \
-nodes -subj '/CN=localhost'
openssl x509 -req -in server.csr \
-CA ca.pem -CAkey ca-key.pem -CAcreateserial \
-out server.pem -days 365
rm server.csr
# Client
openssl req -newkey rsa:2048 \
-keyout client-key.pem -out client.csr \
-nodes -subj '/CN=MyClient'
openssl x509 -req -in client.csr \
-CA ca.pem -CAkey ca-key.pem -CAcreateserial \
-out client.pem -days 365
rm client.csr
echo "Generated: ca.pem, server.pem, server-key.pem, client.pem, client-key.pem"
Server Code
#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>
static void on_secure(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
(void)req; (void)arg;
xHttpResponseSetHeader(w, "Content-Type", "text/plain");
xHttpResponseSend(w, "mTLS OK!\n", 9);
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xHttpServer server = xHttpServerCreate(loop);
xHttpServerRoute(server, "GET /secure", on_secure, NULL);
xTlsConf tls = {
.cert = "server.pem",
.key = "server-key.pem",
.ca = "ca.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);
printf("mTLS server listening on :8443\n");
xEventLoopRun(loop);
xHttpServerDestroy(server);
xEventLoopDestroy(loop);
return 0;
}
Client Code
#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>
static void on_response(const xHttpResponse *resp, void *arg) {
(void)arg;
if (resp->curl_code == 0) {
printf("HTTP %ld: %.*s\n", resp->status_code,
(int)resp->body_len, resp->body);
} else {
printf("TLS error: %s\n", resp->curl_error);
}
}
int main(void) {
xEventLoop loop = xEventLoopCreate();
xTlsConf tls = {0};
tls.ca = "ca.pem";
tls.cert = "client.pem";
tls.key = "client-key.pem";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
xHttpClientCreate(loop, &conf);
xHttpClientGet(client, "https://localhost:8443/secure",
on_response, NULL);
xEventLoopRun(loop);
xHttpClientDestroy(client);
xEventLoopDestroy(loop);
return 0;
}
Verify with curl
# One-way TLS (skip verify)
curl -k https://localhost:8443/secure
# One-way TLS (with CA)
curl --cacert ca.pem https://localhost:8443/secure
# mTLS
curl --cacert ca.pem \
--cert client.pem \
--key client-key.pem \
https://localhost:8443/secure
skip_verify Behavior
| Value | Behavior |
|---|---|
| 0 (default) | Peer verification enabled. Server verifies client cert (if ca is set); client verifies server cert. |
| non-zero | All peer verification disabled. Development only. |
ALPN and HTTP/2 over TLS
When TLS is enabled, ALPN (Application-Layer Protocol Negotiation) automatically selects the HTTP protocol:
- If the client supports HTTP/2, ALPN negotiates `h2` and the connection uses HTTP/2 framing.
- Otherwise, ALPN falls back to `http/1.1`.
This is transparent to application code — the same routes and handlers work regardless of the negotiated protocol.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| xErrno_NotSupported from ListenTls | No TLS backend compiled | Rebuild with XK_TLS_BACKEND=openssl |
| Client gets curl_code != 0, status_code == 0 | TLS handshake failed | Check cert paths, CA trust, and skip_verify settings |
| Self-signed cert rejected | Client verifies against system CA bundle | Set ca to the self-signed cert, or use skip_verify = 1 for dev |
| mTLS handshake fails | Client didn't provide cert, or cert not signed by server's ca | Ensure client cert is signed by the same CA specified in server's ca |
| "wrong CA path" error | ca points to non-existent file | Verify the file path exists and is readable |
| Connection works with skip_verify but not without | Server cert CN doesn't match hostname, or CA not trusted | Use ca pointing to the signing CA; ensure CN matches the hostname |
Security Best Practices
- Never use `skip_verify` in production. It disables all certificate validation, making the connection vulnerable to MITM attacks.
- Keep private keys secure. `ca-key.pem`, `server-key.pem`, and `client-key.pem` should have restricted file permissions (`chmod 600`).
- Use short-lived certificates. Set a reasonable expiry (`-days`) and rotate certificates before they expire.
- For mTLS, set `ca` on the server side. Verification is enabled by default (`skip_verify = 0`), so the server requires a valid client certificate when `ca` is set.
- Don't deploy the CA private key. Only `ca.pem` (the public certificate) needs to be distributed. Keep `ca-key.pem` offline or in a secure vault.
- Match CN/SAN to hostname. The server certificate's Common Name (or Subject Alternative Name) must match the hostname clients use to connect.
API Quick Reference
Server Side
| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xHttpServerListenTls() | Start HTTPS listener with TLS config |
Client Side
| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xHttpClientConf | Struct: tls (pointer to xTlsConf), http_version |
| xHttpClientCreate() | Create client with TLS config via xHttpClientConf |
WebSocket Client Side
| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xTlsCtx | Opaque shared TLS context from xTlsCtxCreate() |
| xWsConnectConf | Struct: tls (pointer to xTlsConf), tls_ctx (shared context, priority over tls) |
| xWsConnect() | Initiate async WebSocket connection with optional TLS |
For full API details, see server.md and client.md.
xlog — Async Logging
Introduction
xlog is xKit's high-performance asynchronous logging module. It formats log entries on the calling thread and flushes them to a file (or stderr) on the event loop thread, decoupling I/O latency from application logic. Three operating modes — Timer, Notify, and Mixed — offer different trade-offs between flush latency and overhead.
Design Philosophy
- Async by Default — Log messages are formatted on the calling thread and enqueued via a lock-free MPSC queue. The event loop thread drains the queue and writes to disk, ensuring that logging never blocks the caller (except at the Fatal level).
- Three Modes for Different Needs — Timer mode batches writes for throughput; Notify mode uses a pipe for low-latency delivery; Mixed mode combines both, using the timer for normal messages and the pipe for high-severity entries.
- Event Loop Integration — The logger is bound to an `xEventLoop` and uses its timer and I/O facilities. There is no dedicated logging thread — the event loop thread handles both I/O and log flushing.
- Thread-Local Context — `xLoggerEnter()` sets the current thread's logger, enabling the `XLOG_*()` macros and bridging xbase's internal `xLog()` calls to the async pipeline.
Architecture
graph TD
subgraph "Application Threads"
T1["Thread 1<br/>xLoggerLog()"]
T2["Thread 2<br/>XLOG_INFO()"]
T3["Thread 3<br/>xLog() (xbase internal)"]
end
subgraph "Lock-Free Queue"
MPSC["MPSC Queue<br/>(xbase/mpsc.h)"]
end
subgraph "Event Loop Thread"
TIMER["Timer Callback<br/>(periodic flush)"]
PIPE["Pipe Callback<br/>(immediate flush)"]
FLUSH["logger_flush_entries()"]
WRITE["fwrite() + fflush()"]
ROTATE["File Rotation"]
end
subgraph "Output"
FILE["Log File"]
STDERR["stderr"]
end
T1 -->|"format + enqueue"| MPSC
T2 -->|"format + enqueue"| MPSC
T3 -->|"bridge_callback"| MPSC
MPSC --> FLUSH
TIMER --> FLUSH
PIPE --> FLUSH
FLUSH --> WRITE
WRITE --> FILE
WRITE --> STDERR
WRITE -->|"max_size exceeded"| ROTATE
style MPSC fill:#f5a623,color:#fff
style FLUSH fill:#50b86c,color:#fff
Sub-Module Overview
| File | Description | Doc |
|---|---|---|
| logger.h | Async logger API, macros, and configuration | logger.md |
Quick Start
#include <xbase/event.h>
#include <xlog/logger.h>
int main(void) {
xEventLoop loop = xEventLoopCreate();
xLoggerConf conf = {
.loop = loop,
.path = "app.log",
.mode = xLogMode_Mixed,
.level = xLogLevel_Info,
.max_size = 10 * 1024 * 1024, // 10MB
.max_files = 5,
.flush_interval_ms = 100,
};
xLogger logger = xLoggerCreate(conf);
xLoggerEnter(logger); // Set as thread-local logger
XLOG_INFO("Application started, version %d.%d", 1, 0);
XLOG_WARN("Low memory: %zu bytes remaining", (size_t)1024);
// Run event loop (processes log flushes)
xEventLoopRun(loop);
xLoggerLeave();
xLoggerDestroy(logger);
xEventLoopDestroy(loop);
return 0;
}
Relationship with Other Modules
- xbase/event.h — The logger is bound to an `xEventLoop` for timer-driven and pipe-driven flush.
- xbase/mpsc.h — Uses the lock-free MPSC queue to pass log entries from producer threads to the event loop thread.
- xbase/log.h — `xLoggerEnter()` bridges xbase's internal `xLog()` calls to the async logger via the thread-local callback mechanism.
- xbase/atomic.h — Uses atomic operations for the lock-free entry freelist.
logger.h — High-Performance Async Logger
Introduction
logger.h provides xLogger, a high-performance asynchronous logger that formats log entries on the calling thread and flushes them to a file (or stderr) on the event loop thread. It supports three operating modes (Timer, Notify, Mixed), five severity levels, file rotation, synchronous flush, and seamless bridging with xbase's internal xLog() mechanism.
Design Philosophy
- **Format on Caller, Write on Loop** — Log messages are formatted (`snprintf`) on the calling thread into a pre-allocated entry buffer, then enqueued via the lock-free MPSC queue. The event loop thread dequeues and writes to disk. This decouples I/O latency from application logic.
- **Three Operating Modes** — Different applications have different latency/throughput requirements:
  - **Timer** — Periodic flush (default 100ms). Best throughput, highest latency.
  - **Notify** — Pipe-based immediate notification. Lowest latency, highest overhead.
  - **Mixed** — Timer for normal messages, pipe for Error/Fatal. Best balance.
- **Lock-Free Entry Pool** — A global Treiber-stack freelist recycles log entry structs across all threads, avoiding `malloc`/`free` on the hot path.
- **Fatal = Synchronous + Abort** — Fatal-level messages bypass the async queue entirely: they are written directly to the file and followed by `abort()`. This ensures the fatal message is never lost.
- **xbase Bridge** — `xLoggerEnter()` registers a callback with xbase's `xLogSetCallback()`, routing all internal xKit error messages through the async logger.
Architecture
graph TD
subgraph "xLogger Internal"
MPSC["MPSC Queue<br/>(head, tail)"]
TIMER["xEventLoopTimer<br/>(periodic flush)"]
PIPE["Pipe<br/>(notify flush)"]
FLUSH_PIPE["Flush Request Pipe<br/>(sync flush)"]
FREELIST["Entry Freelist<br/>(Treiber stack)"]
FP["FILE *fp<br/>(log file or stderr)"]
end
subgraph "xbase Dependencies"
EVENT["xEventLoop"]
MPSC_LIB["xbase/mpsc.h"]
ATOMIC_LIB["xbase/atomic.h"]
LOG_LIB["xbase/log.h"]
end
TIMER --> EVENT
PIPE --> EVENT
FLUSH_PIPE --> EVENT
MPSC --> MPSC_LIB
FREELIST --> ATOMIC_LIB
style MPSC fill:#f5a623,color:#fff
style FREELIST fill:#4a90d9,color:#fff
Implementation Details
Three Operating Modes
graph LR
subgraph "Timer Mode"
T_ENQUEUE["Enqueue"] --> T_TIMER["Timer fires<br/>(every 100ms)"]
T_TIMER --> T_FLUSH["Flush all entries"]
end
subgraph "Notify Mode"
N_ENQUEUE["Enqueue"] --> N_PIPE["Write 1 byte to pipe"]
N_PIPE --> N_LOOP["Pipe readable event"]
N_LOOP --> N_FLUSH["Flush all entries"]
end
subgraph "Mixed Mode"
M_ENQUEUE["Enqueue"]
M_ENQUEUE -->|"Debug/Info/Warn"| M_TIMER["Timer fires"]
M_ENQUEUE -->|"Error/Fatal"| M_PIPE["Write to pipe"]
M_TIMER --> M_FLUSH["Flush all entries"]
M_PIPE --> M_FLUSH
end
style T_FLUSH fill:#50b86c,color:#fff
style N_FLUSH fill:#50b86c,color:#fff
style M_FLUSH fill:#50b86c,color:#fff
| Mode | Flush Trigger | Latency | Overhead | Best For |
|---|---|---|---|---|
| Timer | Periodic timer (default 100ms) | Up to flush_interval_ms | Lowest (no per-message syscall) | High-throughput logging |
| Notify | Pipe write per message | ~Immediate | Highest (1 write() per message) | Low-latency debugging |
| Mixed | Timer + pipe for Error/Fatal | Low for errors, batched for info | Moderate | Production applications |
Log Entry Lifecycle
sequenceDiagram
participant App as Application Thread
participant Pool as Entry Freelist
participant Queue as MPSC Queue
participant L as Event Loop Thread
participant File as Log File
App->>Pool: entry_alloc()
Pool-->>App: "xLogEntry_ (recycled or malloc'd)"
App->>App: "snprintf(entry->buf, timestamp + level + message)"
App->>Queue: xMpscPush(entry)
Note over App: "Optional: write(pipe_wfd, 1) for Notify/Mixed"
L->>Queue: "xMpscPop() (timer or pipe callback)"
Queue-->>L: xLogEntry_
L->>File: "fwrite(entry->buf)"
L->>Pool: entry_free(entry)
L->>File: fflush()
Log Entry Structure
struct xLogEntry_ {
xMpsc node; // MPSC queue node
xLogLevel level; // Severity level
int len; // Formatted message length
char buf[XLOG_ENTRY_BUF_SIZE]; // Formatted message (512 bytes)
struct xLogEntry_ *free_next; // Freelist link
};
Lock-Free Entry Freelist
The freelist uses a Treiber stack with atomic CAS:
- Alloc: Pop from the freelist head (CAS loop). Fall back to `malloc()` if empty.
- Free: Push to the freelist head (CAS loop). If the count exceeds `XLOG_FREELIST_SIZE`, call `free()` instead.
The count check is intentionally racy (soft cap) to keep the fast path lean.
File Rotation
When bytes written >= `max_size` and `max_files` > 1:
1. Delete `path.{max_files-1}` (oldest)
2. Cascade rename: `path.{i-1}` → `path.{i}` for i = max_files−1 down to 2
3. Rename `path` → `path.1`
4. Reopen `path` in append mode
app.log → app.log.1
app.log.1 → app.log.2
app.log.2 → app.log.3
app.log.3 → (deleted if max_files=4)
Synchronous Flush
`xLoggerFlush()` writes a byte to a dedicated flush-request pipe, triggering `logger_flush_req_cb` on the event loop thread. The caller then busy-waits (polling `xMpscEmpty()` every 1ms, up to 1 second) until the queue is drained.
Log Format
2025-04-04 16:30:00.123 INFO Application started
2025-04-04 16:30:00.456 WARN Low memory: 1024 bytes remaining
2025-04-04 16:30:01.789 ERROR Connection refused
Format: YYYY-MM-DD HH:MM:SS.mmm LEVEL message\n
API Reference
Types
| Type | Description |
|---|---|
| `xLogger` | Opaque handle to an async logger |
| `xLogLevel` | Enum: Debug, Info, Warn, Error, Fatal |
| `xLogMode` | Enum: Timer, Notify, Mixed |
| `xLoggerConf` | Configuration struct for creating a logger |
xLoggerConf Fields
| Field | Type | Default | Description |
|---|---|---|---|
| `loop` | `xEventLoop` | (required) | Event loop for timer/pipe callbacks |
| `path` | `const char *` | `NULL` (stderr) | Log file path |
| `mode` | `xLogMode` | Timer | Operating mode |
| `level` | `xLogLevel` | Info | Minimum log level |
| `max_size` | `size_t` | 0 (no rotation) | Max file size before rotation |
| `max_files` | `int` | 0 (no rotation) | Total files to keep (including current) |
| `flush_interval_ms` | `uint64_t` | 100 | Timer/Mixed flush interval |
Functions
| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| `xLoggerCreate` | `xLogger xLoggerCreate(xLoggerConf conf)` | Create a logger. | Not thread-safe |
| `xLoggerDestroy` | `void xLoggerDestroy(xLogger logger)` | Flush remaining entries and destroy. | Not thread-safe |
| `xLoggerLog` | `void xLoggerLog(xLogger logger, xLogLevel level, const char *fmt, ...)` | Write a log entry. Fatal is synchronous + abort. | Thread-safe |
| `xLoggerFlush` | `void xLoggerFlush(xLogger logger)` | Synchronously flush all pending entries. | Thread-safe |
| `xLoggerEnter` | `void xLoggerEnter(xLogger logger)` | Set as thread-local logger + bridge xbase log. | Thread-local |
| `xLoggerLeave` | `void xLoggerLeave(void)` | Clear thread-local logger. | Thread-local |
| `xLoggerCurrent` | `xLogger xLoggerCurrent(void)` | Get current thread's logger. | Thread-local |
Convenience Macros
Using the thread-local logger (set via `xLoggerEnter()`):
| Macro | Expands To |
|---|---|
| `XLOG_DEBUG(fmt, ...)` | `xLoggerLog(xLoggerCurrent(), xLogLevel_Debug, fmt, ...)` |
| `XLOG_INFO(fmt, ...)` | `xLoggerLog(xLoggerCurrent(), xLogLevel_Info, fmt, ...)` |
| `XLOG_WARN(fmt, ...)` | `xLoggerLog(xLoggerCurrent(), xLogLevel_Warn, fmt, ...)` |
| `XLOG_ERROR(fmt, ...)` | `xLoggerLog(xLoggerCurrent(), xLogLevel_Error, fmt, ...)` |
| `XLOG_FATAL(fmt, ...)` | `xLoggerLog(xLoggerCurrent(), xLogLevel_Fatal, fmt, ...)` |
Explicit-logger variants: `XLOG_DEBUG_L(logger, fmt, ...)`, etc.
Usage Examples
Basic File Logging
#include <xbase/event.h>
#include <xlog/logger.h>
int main(void) {
xEventLoop loop = xEventLoopCreate();
xLoggerConf conf = {
.loop = loop,
.path = "app.log",
.mode = xLogMode_Timer,
.level = xLogLevel_Info,
};
xLogger logger = xLoggerCreate(conf);
xLoggerEnter(logger);
XLOG_INFO("Server started on port %d", 8080);
XLOG_DEBUG("This is filtered out (level < Info)");
XLOG_WARN("Connection pool at %d%% capacity", 85);
xEventLoopRun(loop);
xLoggerLeave();
xLoggerDestroy(logger);
xEventLoopDestroy(loop);
return 0;
}
File Rotation Example
xLoggerConf conf = {
.loop = loop,
.path = "/var/log/myapp.log",
.mode = xLogMode_Mixed,
.level = xLogLevel_Info,
.max_size = 50 * 1024 * 1024, // 50MB per file
.max_files = 10, // Keep 10 files (500MB total)
};
Multi-Threaded Logging
#include <pthread.h>
#include <xlog/logger.h>
static xLogger g_logger;
static void *worker(void *arg) {
int id = *(int *)arg;
xLoggerEnter(g_logger); // Each thread must enter
for (int i = 0; i < 1000; i++) {
XLOG_INFO("Worker %d: iteration %d", id, i);
}
xLoggerLeave();
return NULL;
}
// In main():
// g_logger = xLoggerCreate(conf);
// pthread_create(&threads[i], NULL, worker, &ids[i]);
Synchronous Flush Before Exit
void graceful_shutdown(xLogger logger) {
XLOG_INFO("Shutting down...");
xLoggerFlush(logger); // Block until all entries are written
xLoggerDestroy(logger);
}
Use Cases
- **Application Logging** — Primary use case: structured, async logging for server applications with file rotation and level filtering.
- **xKit Internal Error Capture** — Via `xLoggerEnter()`, all xKit internal errors (from `xLog()`) are automatically routed through the async logger.
- **Debug Logging** — Use `xLogMode_Notify` during development for immediate log output without timer delay.
Best Practices
- Call `xLoggerEnter()` on every thread that uses the `XLOG_*()` macros. Each thread needs its own thread-local context.
- Use Mixed mode for production. It provides the best balance: batched writes for normal messages, immediate notification for errors.
- Set appropriate rotation limits. Without rotation (`max_size = 0`), log files grow unbounded.
- Call `xLoggerFlush()` before shutdown to ensure all pending messages are written.
- Don't log in tight loops at Debug level without checking the level first. While the level filter is cheap, formatting still costs CPU.
- Fatal messages are synchronous. `XLOG_FATAL()` writes directly and calls `abort()`. Don't rely on async delivery for fatal messages.
Comparison with Other Libraries
| Feature | xlog logger.h | spdlog | zlog | log4c |
|---|---|---|---|---|
| Language | C99 | C++11 | C | C |
| Async Model | MPSC queue + event loop | Dedicated thread + queue | Dedicated thread | Synchronous |
| Modes | Timer / Notify / Mixed | Async (thread pool) | Async (thread) | Sync only |
| Lock-Free | Yes (MPSC + Treiber stack) | Yes (MPMC queue) | No (mutex) | No (mutex) |
| Event Loop | Integrated (xEventLoop) | None (own thread) | None (own thread) | None |
| File Rotation | Size-based (cascade rename) | Size-based | Size/time-based | Size-based |
| Format | printf-style | fmt-style / printf | printf-style | printf-style |
| Thread-Local Context | Yes (xLoggerEnter) | No | Yes (MDC) | Yes (NDC) |
| Fatal Handling | Sync write + abort | Flush + abort | Configurable | Configurable |
Key Differentiator: xlog is unique in integrating with an event loop rather than spawning a dedicated logging thread. This means the same thread that handles network I/O also handles log flushing, reducing context switches and thread count. The three-mode design (Timer/Notify/Mixed) gives fine-grained control over the latency/throughput trade-off that most logging libraries don't offer.
Benchmark
End-to-end benchmarks for xKit, measuring real-world performance across complete scenarios.
All benchmarks run on Apple M3 Pro (12 cores, 36 GB), macOS 26.4, Clang 17, Release (-O2).
For micro-benchmark results, see the Benchmark section at the bottom of each module's documentation page.
Available Benchmarks
| Benchmark | Description |
|---|---|
| HTTP Server | xKit single-threaded HTTP/1.1 server vs Go net/http — 152 K req/s, +15–60% faster across all scenarios |
| HTTP/2 Server | xKit single-threaded h2c server vs Go net/http + x/net/http2 — 576 K req/s, +15–405% faster across all scenarios |
| HTTPS Server | xKit single-threaded HTTPS server vs Go net/http + crypto/tls — 512 K req/s (HTTPS/2), TLS-bound parity on HTTPS/1.1 |
HTTP Server Benchmark
End-to-end HTTP/1.1 server benchmark comparing xKit (single-threaded event-loop) against Go net/http (goroutine-per-connection).
Test Environment
| Item | Value |
|---|---|
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| Load Generator | wrk — 4 threads, 10s duration |
Server Implementations
xKit (bench/http_bench_server.cpp)
Single-threaded event-loop HTTP/1.1 server built on xbase/event.h + xhttp/server.h. Uses kqueue on macOS, epoll on Linux. All I/O is handled in one thread — no thread pool, no goroutines.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
./build/bench/http_bench_server 8080
Go (bench/http_bench_server.go)
Standard net/http server with default settings. Go's runtime spawns one goroutine per connection and uses its own epoll/kqueue poller internally.
go build -o build/bench/go_http_bench bench/http_bench_server.go
./build/bench/go_http_bench 8081
Routes
Both servers implement identical routes:
| Route | Method | Description |
|---|---|---|
/ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
/echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
/echo | POST | Echoes request body — request body throughput test |
Benchmark Methodology
All benchmarks use wrk with the following defaults unless noted:
- 4 threads (`-t4`)
- 100 connections (`-c100`)
- 10 seconds (`-d10s`)
POST benchmarks use Lua scripts to set the request body:
wrk.method = "POST"
wrk.headers["Content-Type"] = "application/octet-stream"
wrk.body = string.rep("x", BODY_SIZE)
Results
GET /ping — Minimal Response Latency
Tests raw request/response overhead with a 4-byte "pong" response. Varies connection count to measure scalability.
| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 151,935 | 128,639 | 315 μs | 365 μs | xKit +18% |
| 100 | 152,316 | 128,915 | 658 μs | 761 μs | xKit +18% |
| 200 | 151,007 | 128,162 | 1.33 ms | 1.55 ms | xKit +18% |
| 500 | 155,486 | 125,471 | 3.20 ms | 3.96 ms | xKit +24% |
Analysis:
- xKit maintains ~152K req/s regardless of connection count, showing excellent scalability of the single-threaded event loop.
- Go's throughput slightly degrades at 500 connections due to goroutine scheduling overhead.
- xKit's advantage grows from +18% to +24% as connection count increases — the event loop's O(1) dispatch scales better than goroutine context switching.
GET /echo — Variable Response Size
Tests response serialization throughput with different payload sizes. Fixed at 100 connections.
| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 64 B | 150,592 | 127,432 | 666 μs | 771 μs | xKit +18% |
| 256 B | 146,487 | 126,907 | 682 μs | 774 μs | xKit +15% |
| 1 KiB | 144,831 | 125,729 | 689 μs | 785 μs | xKit +15% |
| 4 KiB | 141,511 | 91,886 | 707 μs | 1.08 ms | xKit +54% |
Analysis:
- xKit throughput degrades gracefully from 151K to 142K req/s as response size grows from 64B to 4KB — only a 6% drop.
- Go drops sharply at 4KB (92K req/s, −27% from 64B), likely due to `bytes.Repeat` allocation pressure and GC overhead.
- xKit's largest advantage (+54%) appears at 4KB, where Go's per-request heap allocation becomes the bottleneck.
POST /echo — Request Body Throughput
Tests request body parsing and echo throughput. Fixed at 100 connections.
| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
|---|---|---|---|---|---|
| 1 KiB | 141,495 | 122,584 | 152.35 MB/s | 133.51 MB/s | xKit +15% |
| 4 KiB | 133,935 | 83,512 | 536.60 MB/s | 337.13 MB/s | xKit +60% |
| 16 KiB | 82,231 | 53,828 | 1.26 GB/s | 848.10 MB/s | xKit +53% |
| 64 KiB | 35,908 | 31,124 | 2.20 GB/s | 1.90 GB/s | xKit +15% |
Analysis:
- xKit achieves 2.20 GB/s transfer rate at 64KB body size — impressive for a single-threaded server.
- The largest advantage (+60%) appears at 4KB, consistent with the GET /echo pattern — Go's allocation overhead dominates at medium payload sizes.
- At 64KB, the gap narrows to +15% as both servers become I/O bound (kernel socket buffer management dominates).
Summary
xKit vs Go net/http (Release build)
====================================
GET /ping: xKit +18% ~ +24% (consistent across all concurrency levels)
GET /echo: xKit +15% ~ +54% (advantage grows with response size)
POST /echo: xKit +15% ~ +60% (advantage peaks at medium body sizes)
Peak throughput: xKit 155K req/s (GET /ping, 500 connections)
Peak transfer: xKit 2.20 GB/s (POST /echo, 64KB body)
Key Takeaways:
- xKit wins every scenario. A single-threaded C event loop outperforms Go's multi-goroutine runtime across all request types and payload sizes.
- Scalability. xKit's throughput is nearly flat from 50 to 500 connections. Go degrades under high connection counts due to goroutine scheduling overhead.
- Payload efficiency. xKit's advantage is most pronounced at medium payloads (1–4 KiB) where Go's per-request heap allocation and GC pressure become significant.
- Architecture matters. xKit's single-threaded design eliminates all synchronization overhead. Go pays for goroutine creation, scheduling, and garbage collection on every request.
Reproducing
# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
# Build Go server
go build -o build/bench/go_http_bench bench/http_bench_server.go
# Run xKit benchmark
./build/bench/http_bench_server 8080 &
wrk -t4 -c100 -d10s http://127.0.0.1:8080/ping
wrk -t4 -c100 -d10s "http://127.0.0.1:8080/echo?size=64"
wrk -t4 -c100 -d10s "http://127.0.0.1:8080/echo?size=4096"
# POST with lua script
cat > /tmp/post.lua << 'EOF'
wrk.method = "POST"
wrk.headers["Content-Type"] = "application/octet-stream"
wrk.body = string.rep("x", 4096)
EOF
wrk -t4 -c100 -d10s -s /tmp/post.lua http://127.0.0.1:8080/echo
# Run Go benchmark (same wrk commands, different port)
./build/bench/go_http_bench 8081 &
wrk -t4 -c100 -d10s http://127.0.0.1:8081/ping
HTTP/2 Server Benchmark
End-to-end HTTP/2 (h2c, cleartext) server benchmark comparing xKit (single-threaded event-loop) against Go net/http + x/net/http2/h2c (goroutine-per-connection).
Test Environment
| Item | Value |
|---|---|
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| Load Generator | h2load (nghttp2 1.68.1) — 4 threads, 10s duration, 10 max concurrent streams per connection |
Server Implementations
xKit (bench/http_bench_server.cpp)
Single-threaded event-loop HTTP/2 server built on xbase/event.h + xhttp/server.h. Supports h2c (cleartext HTTP/2) via Prior Knowledge — the same binary as the HTTP/1.1 benchmark, since xKit auto-detects the protocol on the first bytes of each connection.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
./build/bench/http_bench_server 8080
Go (bench/h2c_bench_server.go)
Standard net/http server wrapped with golang.org/x/net/http2/h2c.NewHandler() to support cleartext HTTP/2 via Prior Knowledge. Go's runtime spawns one goroutine per connection and uses its own epoll/kqueue poller internally.
cd bench && go build -o ../build/bench/go_h2c_bench h2c_bench_server.go
./build/bench/go_h2c_bench 8081
Routes
Both servers implement identical routes:
| Route | Method | Description |
|---|---|---|
/ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
/echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
/echo | POST | Echoes request body — request body throughput test |
Benchmark Methodology
All benchmarks use h2load with the following defaults unless noted:
- 4 threads (`-t4`)
- 100 connections (`-c100`)
- 10 max concurrent streams per connection (`-m10`)
- 10 seconds (`-D 10`)

POST benchmarks use `-d <file>` to specify the request body.

Why h2load? Unlike wrk (HTTP/1.1 only), h2load is purpose-built for HTTP/2 benchmarking. It supports stream multiplexing (`-m`), h2c Prior Knowledge, and reports per-stream latency.
Results
GET /ping — Minimal Response Latency
Tests raw request/response overhead with a 4-byte "pong" response. Varies connection count to measure scalability under HTTP/2 multiplexing.
| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 576,249 | 141,655 | 863 μs | 3.51 ms | xKit +307% |
| 100 | 561,825 | 120,732 | 1.78 ms | 8.27 ms | xKit +365% |
| 200 | 555,800 | 110,143 | 3.59 ms | 18.10 ms | xKit +405% |
| 500 | 538,905 | 136,719 | 9.22 ms | 36.21 ms | xKit +294% |
Analysis:
- xKit sustains ~560K req/s across all connection counts — a massive improvement over its HTTP/1.1 numbers (~152K) thanks to HTTP/2 stream multiplexing on fewer TCP connections.
- Go's h2c throughput (~110–142K) is comparable to its HTTP/1.1 numbers, suggesting Go's HTTP/2 implementation doesn't benefit as much from multiplexing.
- xKit's advantage ranges from +294% to +405% — far larger than the +18–24% gap seen in HTTP/1.1. The single-threaded event loop excels at handling multiplexed streams without context-switching overhead.
- At 200 connections, xKit's advantage peaks at +405%. Go's throughput degrades more steeply under high connection counts due to goroutine scheduling and HTTP/2 flow control overhead.
GET /echo — Variable Response Size
Tests response serialization throughput with different payload sizes under HTTP/2 framing. Fixed at 100 connections.
| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 64 B | 518,176 | 123,386 | 1.92 ms | 8.08 ms | xKit +320% |
| 256 B | 511,276 | 116,267 | 1.95 ms | 8.60 ms | xKit +340% |
| 1 KiB | 493,405 | 115,267 | 2.03 ms | 8.64 ms | xKit +328% |
| 4 KiB | 383,507 | 107,457 | 2.59 ms | 9.23 ms | xKit +257% |
Analysis:
- xKit throughput degrades gracefully from 518K to 384K req/s as response size grows from 64B to 4KB — a 26% drop, mostly due to HTTP/2 DATA frame serialization overhead.
- Go stays relatively flat (~107–123K) but at a much lower baseline. The `bytes.Repeat` allocation + GC pressure is compounded by HTTP/2 framing overhead.
- xKit's advantage is consistently +257% to +340% — HTTP/2's HPACK header compression and binary framing amplify xKit's architectural advantage over Go.
POST /echo — Request Body Throughput
Tests request body parsing and echo throughput under HTTP/2. Fixed at 100 connections.
| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
|---|---|---|---|---|---|
| 1 KiB | 401,047 | 119,739 | 399.45 MB/s | 119.82 MB/s | xKit +235% |
| 4 KiB | 195,221 | 90,585 | 766.61 MB/s | 356.84 MB/s | xKit +115% |
| 16 KiB | 57,304 | 41,313 | 896.83 MB/s | 648.24 MB/s | xKit +39% |
| 64 KiB | 19,040 | 16,557 | 1.16 GB/s | 1.01 GB/s | xKit +15% |
Analysis:
- xKit achieves 1.16 GB/s transfer rate at 64KB body size — comparable to its HTTP/1.1 performance (2.20 GB/s), with the difference attributable to HTTP/2 flow control and framing overhead.
- The advantage narrows from +235% (1KB) to +15% (64KB) as both servers become I/O bound. HTTP/2 flow control (default 64KB window) becomes the bottleneck at large payloads.
- At small payloads (1KB), xKit's +235% advantage shows the efficiency of its nghttp2-based H2 implementation vs Go's `x/net/http2`.
HTTP/2 vs HTTP/1.1 Comparison
How does HTTP/2 compare to HTTP/1.1 for each server? (GET /ping, 100 connections)
| Server | HTTP/1.1 Req/s | HTTP/2 Req/s | Δ |
|---|---|---|---|
| xKit | 152,316 | 561,825 | +269% |
| Go | 128,915 | 120,732 | −6% |
Key Insight: xKit's single-threaded event loop benefits enormously from HTTP/2 multiplexing — handling multiple streams on fewer connections eliminates per-connection overhead. Go's goroutine-per-connection model doesn't gain from multiplexing because it already handles concurrency at the goroutine level; the added HTTP/2 framing overhead actually causes a slight regression.
Summary
xKit vs Go h2c (Release build, h2load -m10)
=============================================
GET /ping: xKit +294% ~ +405% (massive advantage across all concurrency)
GET /echo: xKit +257% ~ +340% (consistent across all response sizes)
POST /echo: xKit +15% ~ +235% (advantage narrows as payloads grow)
Peak throughput: xKit 576K req/s (GET /ping, 50 connections)
Peak transfer: xKit 1.16 GB/s (POST /echo, 64KB body)
Key Takeaways:
- HTTP/2 amplifies xKit's advantage. The gap widens from +18–24% (HTTP/1.1) to +294–405% (HTTP/2) on GET /ping. Stream multiplexing plays to the strengths of a single-threaded event loop.
- xKit scales with multiplexing. xKit's throughput jumps from 152K (HTTP/1.1) to 576K (HTTP/2) req/s — a 3.8× improvement. Go's throughput stays flat or slightly regresses.
- Payload efficiency. At small-to-medium payloads, xKit's nghttp2-based H2 implementation is dramatically faster. At large payloads (64KB), both servers converge as I/O and flow control dominate.
- Architecture matters even more for H2. HTTP/2's stream multiplexing, HPACK compression, and flow control add complexity that a lean C event loop handles more efficiently than Go's runtime.
Reproducing
# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
# Build Go h2c server
cd bench && go build -o ../build/bench/go_h2c_bench h2c_bench_server.go && cd ..
# Install h2load (macOS)
brew install nghttp2
# Start servers
./build/bench/http_bench_server 8080 &
./build/bench/go_h2c_bench 8081 &
# GET /ping benchmark
h2load -t4 -c100 -m10 -D 10 http://127.0.0.1:8080/ping
h2load -t4 -c100 -m10 -D 10 http://127.0.0.1:8081/ping
# GET /echo benchmark
h2load -t4 -c100 -m10 -D 10 "http://127.0.0.1:8080/echo?size=1024"
h2load -t4 -c100 -m10 -D 10 "http://127.0.0.1:8081/echo?size=1024"
# POST /echo benchmark (create body file first)
dd if=/dev/zero bs=4096 count=1 | tr '\0' 'x' > /tmp/body_4k.bin
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin http://127.0.0.1:8080/echo
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin http://127.0.0.1:8081/echo
# Cleanup
pkill -f http_bench_server
pkill -f go_h2c_bench
HTTPS Server Benchmark
End-to-end HTTPS server benchmark comparing xKit (single-threaded event-loop, OpenSSL) against Go net/http + crypto/tls (goroutine-per-connection). Tests both HTTPS/1.1 (wrk) and HTTPS/2 (h2load with ALPN).
Test Environment
| Item | Value |
|---|---|
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| TLS Backend | OpenSSL 3.6.1 (xKit), Go crypto/tls (Go) |
| Certificate | RSA 2048-bit self-signed, TLS 1.3 |
| Load Generator | wrk (HTTP/1.1 over TLS), h2load (HTTP/2 over TLS with ALPN) |
Server Implementations
xKit (bench/https_bench_server.cpp)
Single-threaded event-loop HTTPS server built on xbase/event.h + xhttp/server.h + OpenSSL. Uses xHttpServerListenTls() which automatically sets ALPN to {"h2", "http/1.1"}, so the same server handles both HTTPS/1.1 and HTTPS/2 depending on client negotiation.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
openssl req -x509 -newkey rsa:2048 -keyout bench_key.pem -out bench_cert.pem \
-days 365 -nodes -subj '/CN=localhost'
./build/bench/https_bench_server 8443 bench_cert.pem bench_key.pem
Go (bench/https_bench_server.go)
Standard net/http server with crypto/tls and x/net/http2.ConfigureServer(). Go's TLS implementation is in pure Go (crypto/tls), while xKit uses OpenSSL's C implementation. Both servers configure ALPN for h2 and http/1.1.
cd bench && go build -o ../build/bench/go_https_bench https_bench_server.go
./build/bench/go_https_bench 8444 bench_cert.pem bench_key.pem
Routes
Both servers implement identical routes:
| Route | Method | Description |
|---|---|---|
/ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
/echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
/echo | POST | Echoes request body — request body throughput test |
Results
HTTPS/1.1 — GET /ping (wrk, varying connections)
Tests HTTPS/1.1 performance where each connection maintains its own TLS session. wrk reuses connections (no per-request handshake), so this measures encrypted request/response throughput.
| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 125,147 | 125,076 | 395 μs | 372 μs | ≈ 0% |
| 100 | 124,593 | 128,277 | 0.86 ms | 764 μs | Go +3% |
| 200 | 122,837 | 127,075 | 1.88 ms | 1.57 ms | Go +3% |
| 500 | 111,397 | 122,498 | 5.25 ms | 4.06 ms | Go +10% |
Analysis:
- Under HTTPS/1.1, xKit and Go are nearly identical at low connection counts (~125K req/s each). This is a dramatic contrast to plaintext HTTP/1.1 where xKit was +18–24% faster.
- TLS encryption is the bottleneck, not the HTTP layer. OpenSSL's AES-GCM encryption on a single thread saturates at ~125K req/s regardless of the HTTP framework above it.
- At 500 connections, Go pulls ahead by ~10% because Go's multi-threaded runtime can parallelize TLS encryption across all CPU cores, while xKit's single-threaded event loop is limited to one core for both TLS and HTTP processing.
- xKit's latency is slightly higher at high connection counts (5.25 ms vs 4.06 ms at 500 connections) — the single thread must serialize all TLS encrypt/decrypt operations.
HTTPS/2 — GET /ping (h2load, varying connections)
Tests HTTPS/2 performance with TLS + ALPN negotiation. HTTP/2 multiplexing reduces the number of TLS sessions needed, which should benefit the single-threaded xKit.
| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 511,586 | 165,341 | 975 μs | 2.99 ms | xKit +209% |
| 100 | 508,685 | 144,024 | 1.96 ms | 6.88 ms | xKit +253% |
| 200 | 497,775 | 131,749 | 4.01 ms | 15.00 ms | xKit +278% |
Analysis:
- With HTTPS/2, xKit regains its massive advantage: +209% to +278% over Go. HTTP/2 multiplexing means fewer TLS sessions are needed — multiple streams share one encrypted connection, so the TLS overhead is amortized.
- xKit achieves ~510K req/s over HTTPS/2 — only ~10% less than its h2c (cleartext HTTP/2) performance of 562K. The TLS overhead is minimal when amortized across multiplexed streams.
- Go's HTTPS/2 throughput (~131–165K) is comparable to its h2c numbers (~121–142K), suggesting Go's TLS overhead is also well-amortized but the HTTP/2 processing itself is the bottleneck.
HTTPS/2 — GET /echo (h2load, varying response size)
Tests response serialization + TLS encryption throughput with different payload sizes. Fixed at 100 connections.
| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 64 B | 470,607 | 146,727 | 2.11 ms | 6.74 ms | xKit +221% |
| 1 KiB | 388,828 | 140,926 | 2.56 ms | 6.99 ms | xKit +176% |
| 4 KiB | 227,414 | 118,595 | 4.38 ms | 8.22 ms | xKit +92% |
Analysis:
- xKit's advantage narrows as response size grows (from +221% at 64B to +92% at 4KB) because TLS encryption of larger payloads becomes a bigger fraction of total work.
- At 4KB responses, xKit still achieves 893 MB/s encrypted throughput vs Go's 466 MB/s.
HTTPS/2 — POST /echo (h2load, varying body size)
Tests request body parsing + TLS decryption/encryption throughput. Fixed at 100 connections.
| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
|---|---|---|---|---|---|
| 1 KiB | 291,086 | 146,916 | 289.93 MB/s | 147.01 MB/s | xKit +98% |
| 4 KiB | 128,229 | 104,892 | 503.54 MB/s | 413.20 MB/s | xKit +22% |
| 16 KiB | 38,975 | 37,391 | 609.97 MB/s | 586.70 MB/s | xKit +4% |
| 64 KiB | 10,278 | 14,994 | 643.30 MB/s | 939.77 MB/s | Go +46% |
Analysis:
- At small payloads (1KB), xKit is +98% faster. At medium payloads (4KB), the gap narrows to +22%.
- At 16KB, the two are nearly tied (+4%). At 64KB, Go wins by +46% — this is the first scenario where Go decisively beats xKit.
- The 64KB crossover happens because: (1) TLS encryption of 64KB payloads is CPU-intensive and benefits from Go's multi-core parallelism, (2) HTTP/2 flow control window (default 64KB) creates back-pressure that the single-threaded event loop handles less efficiently than Go's goroutine scheduler.
Protocol Comparison
How does TLS affect performance for each protocol? (GET /ping, 100 connections)
| Server | HTTP/1.1 | HTTPS/1.1 | Δ (TLS cost) |
|---|---|---|---|
| xKit | 152,316 | 124,593 | −18% |
| Go | 128,915 | 128,277 | −0.5% |
| Server | h2c | HTTPS/2 | Δ (TLS cost) |
|---|---|---|---|
| xKit | 561,825 | 508,685 | −9% |
| Go | 120,732 | 144,024 | +19% |
Key Insights:
- TLS costs xKit 18% on HTTP/1.1 because every connection requires its own TLS session, and all encryption runs on a single thread. Go's multi-core TLS is essentially free (−0.5%).
- TLS costs xKit only 9% on HTTP/2 because multiplexed streams share TLS sessions. This is why HTTPS/2 is xKit's sweet spot.
- Go actually gets faster with HTTPS/2 than with h2c (+19%), likely because TLS session caching and ALPN negotiation provide a more optimized code path in Go's `crypto/tls` + `x/net/http2` stack.
Summary
xKit vs Go HTTPS (Release build, OpenSSL 3.6.1)
=================================================
HTTPS/1.1 (wrk):
GET /ping: Go ≈ xKit (−0% to +10% Go advantage at high connections)
GET /echo 1KB: Go +10%
HTTPS/2 (h2load -m10):
GET /ping: xKit +209% ~ +278%
GET /echo: xKit +92% ~ +221%
POST /echo: xKit +98% (1KB) → Go +46% (64KB)
Peak throughput: xKit 512K req/s (HTTPS/2 GET /ping, 50 connections)
Peak transfer: Go 940 MB/s (HTTPS/2 POST /echo, 64KB body)
Key Takeaways:
- HTTPS/1.1 is TLS-bound. Single-threaded OpenSSL encryption caps xKit at ~125K req/s — the same as Go. The HTTP framework advantage disappears when TLS dominates.
- HTTPS/2 restores xKit's advantage. Stream multiplexing amortizes TLS overhead across streams, letting xKit's efficient event loop shine again (+209–278% on GET /ping).
- Large payloads favor Go. At 64KB POST bodies, Go's multi-core TLS parallelism wins by +46%. This is the only scenario where Go decisively beats xKit.
- Choose your protocol wisely. For latency-sensitive APIs with small payloads, HTTPS/2 + xKit is optimal. For bulk data transfer, Go's multi-core TLS is more efficient.
Reproducing
# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
# Build Go HTTPS server
cd bench && go build -o ../build/bench/go_https_bench https_bench_server.go && cd ..
# Generate self-signed certificate
openssl req -x509 -newkey rsa:2048 -keyout /tmp/bench_key.pem \
-out /tmp/bench_cert.pem -days 365 -nodes -subj '/CN=localhost'
# Install tools (macOS)
brew install wrk nghttp2
# Start servers
./build/bench/https_bench_server 8443 /tmp/bench_cert.pem /tmp/bench_key.pem &
./build/bench/go_https_bench 8444 /tmp/bench_cert.pem /tmp/bench_key.pem &
# HTTPS/1.1 benchmark (wrk)
wrk -t4 -c100 -d10s https://127.0.0.1:8443/ping
wrk -t4 -c100 -d10s https://127.0.0.1:8444/ping
# HTTPS/2 benchmark (h2load)
h2load -t4 -c100 -m10 -D 10 https://127.0.0.1:8443/ping
h2load -t4 -c100 -m10 -D 10 https://127.0.0.1:8444/ping
# POST benchmark
dd if=/dev/zero bs=4096 count=1 | tr '\0' 'x' > /tmp/body_4k.bin
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin https://127.0.0.1:8443/echo
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin https://127.0.0.1:8444/echo
# Cleanup
pkill -f https_bench_server
pkill -f go_https_bench
TODO
Planning and feasibility analysis for future improvements.
- Remove the libcurl dependency: analyze the feasibility, benefits, and compromise options for dropping xhttp's dependency on libcurl
Removing the libcurl Dependency: Feasibility and Benefits
1. Current Scope of libcurl Usage
libcurl is used only by the HTTP Client; the following files are involved:
| File | Dependency Level | Notes |
|---|---|---|
| client.c | Core | The entire file is built around curl_multi / curl_easy |
| client.h | API layer | xHttpResponse exposes curl_code / curl_error |
| client_private.h | Core | CURL *easy, CURLM *multi, CURLcode, CURL_ERROR_SIZE |
| sse.c | Core | SSE streaming is built entirely on the curl write callback |
| xhttp/CMakeLists.txt | Build | Links Libcurl::Libcurl |
| CMakeLists.txt (top level) | Build | Compilation of the whole xhttp module is gated on Libcurl_FOUND |
Parts that do not depend on curl (the majority of the xhttp module):
- HTTP Server (server.c, proto_h1.c, proto_h2.c) → llhttp + nghttp2
- WebSocket Server (ws.c, ws_serve.c, ws_handshake_server.c)
- WebSocket Client (ws_connect.c, ws_handshake_client.c) → plain sockets + xEventLoop
- Transport layer (transport_*.c) → plain OpenSSL / mbedTLS
- WS Frame / Deflate / Crypto
2. What libcurl Provides
Within the xhttp client, libcurl is responsible for:
graph TD
A[Capabilities provided by libcurl] --> B[HTTP/1.1 protocol handling<br/>request serialization + response parsing]
A --> C[HTTP/2 support<br/>HPACK, stream multiplexing, frame handling]
A --> D[TLS handshake management<br/>certificate verification, ALPN negotiation]
A --> E[Multi-Socket API<br/>non-blocking I/O integration]
A --> F[Connection pool / Keep-Alive<br/>DNS caching]
A --> G[Chunked transfer<br/>Content-Encoding decompression]
A --> H[Redirect following<br/>cookie management]
A --> I[Proxy support<br/>SOCKS / HTTP proxy]
3. Replacement Analysis
Removing libcurl means building an HTTP client protocol stack in-house:
| Component to Build | Complexity | Notes |
|---|---|---|
| HTTP/1.1 request serialization | ⭐ Low | Hand-assemble GET /path HTTP/1.1\r\n... |
| HTTP/1.1 response parsing | ⭐⭐ Medium | Can reuse the existing llhttp (already used by the server) |
| Chunked transfer decoding | ⭐⭐ Medium | Handled by llhttp |
| TLS client handshake | ⭐⭐ Medium | WS Client already has transport_tls_client_openssl/mbedtls; reusable |
| HTTP/2 client | ⭐⭐⭐⭐ High | Needs nghttp2's client session API (the server already uses nghttp2, but client mode differs) |
| Connection pool / Keep-Alive | ⭐⭐⭐ High | Connection reuse and idle timeouts must be managed in-house |
| Multi-socket event integration | ⭐⭐ Medium | xEventLoop exists, but the connection state machine must be managed in-house |
| Async DNS resolution | ⭐⭐⭐ High | curl bundles c-ares integration; building it ourselves needs an extra dependency or blocks |
| Redirects / Cookies / Proxy | ⭐⭐ Medium | Implement on demand |
4. Benefit Analysis
✅ Benefits
1. Fewer external dependencies
   - The xhttp module currently requires libcurl (~600 KB shared library); removal drops one system-level dependency
   - Friendlier for embedded and cross-compilation scenarios (libcurl's cross-compile configuration is fiddly)
2. Unified TLS management
   - The HTTP Client's TLS is currently managed inside curl (CURLOPT_CAINFO and friends), disconnected from the xTlsCtx system used by the rest of xnet/xhttp
   - After removal, the shared xTlsCtx model can be used uniformly, matching TCP, WS Client, and HTTP Server
3. No more API leakage
   - curl_code / curl_error in xHttpResponse are curl-specific concepts; exposing them to users leaks the implementation
   - After removal, errors can be unified under xErrno
4. Smaller binaries
   - Deployments that only use the server or WebSocket no longer link curl
5. Finer-grained control
   - Connection-pool policy, timeout behavior, and buffer management become fully customizable
❌ Costs
1. Significant effort (estimated 2,000-3,000 new lines of code)
   - HTTP/1.1 client protocol stack: ~500 lines
   - HTTP/2 client (nghttp2 client session): ~800 lines
   - Connection pool + Keep-Alive management: ~500 lines
   - SSE re-integration: ~300 lines
   - DNS resolution: ~200 lines (or pulling in c-ares)
   - Test rewrite: ~500 lines
2. The HTTP/2 client is the hardest part
   - nghttp2's client API differs substantially from its server API; SETTINGS, WINDOW_UPDATE, and stream priorities all need handling
   - curl does a large amount of edge-case handling around its nghttp2 client internally
3. Losing curl's maturity
   - libcurl has been hardened over 25+ years and handles countless HTTP edge cases (malformed responses, exotic Transfer-Encoding variants, proxy authentication, and more)
   - A home-grown implementation is unlikely to match that robustness in the short term
4. Higher maintenance burden
   - HTTP has many protocol edge cases; building in-house means carrying the maintenance cost long-term
5. Compromise Options
If the goal is reducing the dependency without a full rewrite, there are several incremental paths:
graph LR
A[Current state<br/>curl required] --> B[Option 1: curl optional<br/>use curl if present<br/>built-in H1 otherwise]
A --> C[Option 2: built-in H1 client<br/>H2 still via curl]
A --> D[Option 3: full removal<br/>built-in H1 + H2 client]
B --> E[Effort: ~800 lines<br/>Risk: low]
C --> F[Effort: ~600 lines<br/>Risk: low]
D --> G[Effort: ~2500 lines<br/>Risk: high]
Recommended: Option 1, making curl an optional dependency
- Add a lightweight built-in HTTP/1.1 client (based on the existing llhttp + transport_tls_client + xEventLoop)
- When curl is present, use curl (H2, connection pooling, and other advanced features)
- Without curl, fall back to the built-in H1 client (covers ~80% of use cases)
- HTTP Server and WS Server/Client are entirely unaffected (they never depended on curl)
This achieves:
- The xhttp module builds in curl-free environments (server + ws + basic client)
- curl remains as an enhancement option (H2 client, connection pool, proxy, and so on)
- Unified TLS management (the built-in client uses xTlsCtx)
- Incremental migration with controlled risk
6. Conclusion
| Dimension | Full Removal | Optional Dependency (recommended) |
|---|---|---|
| Effort | ~2,500 lines + test rewrite | ~800 lines |
| Risk | High (H2 client is complex) | Low (H1 only, reuses existing components) |
| Benefit | Zero external dependencies | Works without curl, stronger with it |
| API changes | Response needs a redesign | Can be abstracted behind a layer and migrated incrementally |
| Time | 2-3 weeks | 3-5 days |
Recommendation: start with Option 1 (curl optional) and decouple the HTTP Server / WS builds from the curl dependency (in code they are already decoupled; only the CMake layer gates the entire xhttp module on curl). Then decide, based on actual demand, whether to remove curl entirely.