xKit

Welcome to the xKit documentation. xKit is a collection of low-level C building blocks for event-driven, asynchronous programming on macOS and Linux. (Windows is on the roadmap but not a near-term priority).

  • Designed and reviewed by Leo X.
  • Coded by Codebuddy with claude-4.6-opus

Architecture Overview

graph TD
    subgraph "Application Layer"
        APP["User Application"]
    end

    subgraph "High-Level Modules"
        XHTTP["xhttp<br/>HTTP Client &amp; Server &amp; WebSocket"]
        XLOG["xlog<br/>Async Logging"]
    end

    subgraph "Networking Layer"
        XNET["xnet<br/>URL / DNS / TLS Config / TCP"]
    end

    subgraph "Buffer Layer"
        XBUF["xbuf<br/>Buffer Primitives"]
    end

    subgraph "Core Layer"
        XBASE["xbase<br/>Core Primitives"]
    end

    APP --> XHTTP
    APP --> XLOG
    APP --> XNET
    APP --> XBUF
    APP --> XBASE
    XHTTP --> XNET
    XHTTP --> XBASE
    XHTTP --> XBUF
    XNET --> XBASE
    XLOG --> XBASE
    XBUF -->|"atomic.h"| XBASE

    style XBASE fill:#50b86c,color:#fff
    style XBUF fill:#4a90d9,color:#fff
    style XNET fill:#e74c3c,color:#fff
    style XHTTP fill:#f5a623,color:#fff
    style XLOG fill:#9b59b6,color:#fff

Module Index

xbase — Core Primitives

The foundation of xKit. Provides event loop, timers, tasks, async sockets, memory management, and lock-free data structures.

  • event.h — Cross-platform event loop — kqueue (macOS) / epoll (Linux) / poll (fallback)
  • timer.h — Monotonic timer with Push (thread-pool) and Poll (lock-free MPSC) fire modes
  • task.h — N:M task model — lightweight tasks multiplexed onto a thread pool
  • socket.h — Async socket abstraction with idle-timeout support
  • memory.h — Reference-counted allocation with vtable-driven lifecycle
  • error.h — Unified error codes and human-readable messages
  • heap.h — Min-heap with index tracking (used by timer subsystem)
  • mpsc.h — Lock-free multi-producer / single-consumer queue
  • atomic.h — Compiler-portable atomic operations (GCC/Clang builtins)
  • log.h — Per-thread callback-based logging with optional backtrace
  • backtrace.h — Platform-adaptive stack trace (libunwind > execinfo > stub)
  • time.h — Time utilities: xMonoMs() (monotonic) and xWallMs() (wall-clock)

xbuf — Buffer Primitives

Three buffer types for different I/O patterns — linear, ring, and block-chain.

  • buf.h — Linear auto-growing byte buffer with 2× expansion
  • ring.h — Fixed-size ring buffer with power-of-2 mask indexing
  • io.h — Reference-counted block-chain I/O buffer with zero-copy split/cut

xnet — Networking Primitives

Shared networking utilities: URL parser, async DNS resolver, and TLS configuration types used by higher-level modules.

  • url.h — Lightweight URL parser with zero-copy component extraction
  • dns.h — Async DNS resolution via thread-pool offload
  • tls.h — Shared TLS configuration types (client & server)
  • tcp.h — Async TCP connection, connector & listener with optional TLS

xhttp — Async HTTP Client & Server & WebSocket

Full-featured async HTTP framework: libcurl-powered client with SSE streaming, event-driven server with HTTP/1.1 & HTTP/2 (h2c), TLS support (OpenSSL / mbedTLS), and RFC 6455 WebSocket (server & client).

  • client.h — Async HTTP client (GET / POST / PUT / DELETE / PATCH / HEAD)
  • sse.c — SSE streaming client with W3C-compliant event parsing
  • server.h — Event-driven HTTP server with HTTP/1.1 and HTTP/2 (h2c)
  • ws.h — RFC 6455 WebSocket server with handler-initiated upgrade
  • ws.h — RFC 6455 WebSocket client with async connect
  • transport.h — Pluggable TLS transport layer (OpenSSL / mbedTLS / plain)

xlog — Async Logging

High-performance async logger with MPSC queue, three flush modes, and file rotation.

  • logger.h — Async logger with Timer / Notify / Mixed modes and XLOG_* macros

bench — End-to-End Benchmarks

End-to-end benchmark results comparing xKit against other frameworks in real-world scenarios.

  • HTTP/1.1 Server — xKit single-threaded HTTP/1.1 server vs Go net/http — GET/POST throughput and latency
  • HTTP/2 Server — xKit single-threaded HTTP/2 (h2c) server vs Go net/http h2c — GET/POST throughput and latency
  • HTTPS Server — xKit single-threaded HTTPS (TLS 1.3) server vs Go net/http — GET/POST throughput and latency

Quick Navigation Guide

By Use Case

  • Build an event-driven server → xbase/event.h, xbase/socket.h
  • Schedule timers → xbase/timer.h
  • Run tasks on a thread pool → xbase/task.h
  • Make async HTTP requests → xhttp/client.h
  • Stream LLM API responses (SSE) → xhttp/sse.c
  • Build an HTTP server → xhttp/server.h
  • Add WebSocket server → xhttp/ws.h
  • Connect as WebSocket client → xhttp/ws.h
  • Parse a URL → xnet/url.h
  • Resolve DNS asynchronously → xnet/dns.h
  • Make async TCP connections → xnet/tcp.h
  • Build a TCP server → xnet/tcp.h
  • Configure TLS → xnet/tls.h
  • Enable TLS (HTTPS) → xhttp/transport.h
  • Add async logging → xlog/logger.h
  • Manage object lifecycles → xbase/memory.h
  • Choose the right buffer type → xbuf overview
  • Build a lock-free producer/consumer pipeline → xbase/mpsc.h
  • See micro-benchmark results → each module doc has a Benchmark section (e.g. mpsc.h)
  • See HTTP server benchmarks → HTTP/1.1 · HTTP/2 · HTTPS

By Dependency Level

Level 0 (no deps)     : atomic.h, error.h, time.h
Level 1 (atomic only) : heap.h, mpsc.h
Level 2 (Level 0-1)   : memory.h, log.h, backtrace.h, buf.h, ring.h
Level 3 (Level 0-2)   : event.h, io.h, url.h, tls.h
Level 4 (event loop)  : timer.h, task.h, socket.h, dns.h, tcp.h, logger.h, client.h, server.h, ws.h

Module Dependency Graph

graph BT
    subgraph "Level 0"
        ATOMIC["atomic.h"]
        ERROR["error.h"]
        TIME["time.h"]
    end

    subgraph "Level 1"
        HEAP["heap.h"]
        MPSC["mpsc.h"]
    end

    subgraph "Level 2"
        MEMORY["memory.h"]
        LOG["log.h"]
        BT_["backtrace.h"]
        BUF["buf.h"]
        RING["ring.h"]
    end

    subgraph "Level 3"
        EVENT["event.h"]
        IO["io.h"]
        URL["url.h"]
        TLS_CONF["tls.h"]
    end

    subgraph "Level 4"
        TIMER["timer.h"]
        TASK["task.h"]
        SOCKET["socket.h"]
        DNS["dns.h"]
        TCP["tcp.h"]
        LOGGER["logger.h"]
        CLIENT["client.h"]
        SERVER["server.h"]
        WS["ws.h"]
    end

    HEAP --> ATOMIC
    MPSC --> ATOMIC
    MEMORY --> ERROR
    LOG --> BT_
    IO --> ATOMIC
    IO --> BUF
    EVENT --> HEAP
    EVENT --> MPSC
    EVENT --> TIME
    TIMER --> EVENT
    TASK --> EVENT
    SOCKET --> EVENT
    DNS --> EVENT
    TCP --> EVENT
    TCP --> DNS
    TCP --> SOCKET
    TCP --> TLS_CONF
    LOGGER --> EVENT
    LOGGER --> MPSC
    LOGGER --> LOG
    CLIENT --> EVENT
    CLIENT --> BUF
    CLIENT --> URL
    CLIENT --> DNS
    CLIENT --> TLS_CONF
    SERVER --> SOCKET
    SERVER --> BUF
    SERVER --> TLS_CONF
    WS --> SERVER
    WS --> URL

    style EVENT fill:#50b86c,color:#fff
    style URL fill:#e74c3c,color:#fff
    style DNS fill:#e74c3c,color:#fff
    style TCP fill:#e74c3c,color:#fff
    style TLS_CONF fill:#e74c3c,color:#fff
    style CLIENT fill:#f5a623,color:#fff
    style SERVER fill:#f5a623,color:#fff
    style WS fill:#f5a623,color:#fff
    style LOGGER fill:#9b59b6,color:#fff

Build & Test

# Build
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build --parallel

# Test
ctest --test-dir build --output-on-failure --parallel 4

See the project README for full build instructions, prerequisites, and container-based Linux testing.

Benchmark

Micro-benchmark results are included in each module's documentation page (see the Benchmark section at the bottom of each page, e.g. mpsc.h, buf.h).

End-to-end benchmarks:

  • HTTP/1.1 Server — xKit vs Go net/http — 152K req/s single-threaded, +15~60% faster across all scenarios
  • HTTP/2 Server — xKit vs Go h2c — single-threaded HTTP/2 (h2c) throughput comparison
  • HTTPS Server — xKit vs Go HTTPS — single-threaded TLS 1.3 throughput comparison

License

MIT © 2025-present Leo X. and xKit contributors

Modules

xKit is organized into five modules, layered from low-level core primitives up to high-level async networking.

┌─────────────────────────────────────────────┐
│              Application Layer              │
├──────────────────────┬──────────────────────┤
│   xhttp              │   xlog               │
│   HTTP Client/Server │   Async Logging      │
│   WebSocket          │                      │
├──────────────────────┴──────────────────────┤
│   xnet — URL / DNS / TCP / TLS Config       │
├─────────────────────────────────────────────┤
│   xbuf — Linear / Ring / Block-Chain Buffer │
├─────────────────────────────────────────────┤
│   xbase — Event Loop / Timer / Task /       │
│           Memory / Atomic / MPSC Queue      │
└─────────────────────────────────────────────┘

Overview

  • xbase — Core primitives — event loop, timers, tasks, async sockets, memory, lock-free data structures
  • xbuf — Buffer primitives — linear, ring, and block-chain I/O buffers
  • xnet — Networking primitives — URL parser, async DNS resolver, TCP, shared TLS configuration types
  • xhttp — Async HTTP client & server — libcurl multi-socket client with SSE streaming, HTTP/1.1 & HTTP/2 async server with TLS, WebSocket server & client
  • xlog — Async logging — MPSC queue, timer/pipe flush, log rotation

Dependency Order

Level 0 (no deps)     : atomic.h, error.h, time.h
Level 1 (atomic only) : heap.h, mpsc.h
Level 2 (Level 0-1)   : memory.h, log.h, backtrace.h, buf.h, ring.h
Level 3 (Level 0-2)   : event.h, io.h, url.h, tls.h
Level 4 (event loop)  : timer.h, task.h, socket.h, dns.h, tcp.h, logger.h, client.h, server.h, ws.h

xbase — Event-Driven Async Foundation

Introduction

xbase is the foundational module of xKit, providing the core primitives for building event-driven, asynchronous C applications on macOS and Linux. It delivers a cross-platform event loop, monotonic timers, an N:M task model (thread pool), async sockets, reference-counted memory management, lock-free data structures, and essential utilities — all in a minimal, zero-dependency C99 package.

xbase is designed to be the "kernel" that higher-level xKit modules (xbuf, xhttp, xlog) build upon. Every I/O-bound or timer-driven feature in xKit ultimately relies on xbase's event loop and concurrency primitives.

Design Philosophy

  1. Edge-Triggered by Default — The event loop operates in edge-triggered mode across all backends (kqueue, epoll, poll), encouraging callers to drain file descriptors completely. This yields higher throughput and fewer spurious wakeups compared to level-triggered designs.

  2. Layered Abstraction — Low-level primitives (atomic, mpsc, heap) are composed into mid-level services (timer, task) which are then integrated into the high-level event loop. Each layer is independently usable.

  3. Zero Allocation in the Hot Path — Data structures like the MPSC queue and min-heap are designed to avoid dynamic allocation during normal operation. Memory is pre-allocated or embedded in user structs.

  4. Thread-Safety Where It Matters — APIs that are expected to be called cross-thread (e.g., xEventWake, xTimerSubmitAfter, xMpscPush) are explicitly designed to be thread-safe. Single-threaded APIs are documented as such.

  5. vtable-Driven Lifecycle — The memory module uses a virtual table pattern (ctor/dtor/retain/release) to provide reference-counted object management in pure C, inspired by Objective-C's retain/release model.

  6. Platform Adaptation at Build Time — Platform-specific code (kqueue vs. epoll, libunwind vs. execinfo) is selected via compile-time macros, keeping runtime overhead at zero.

Architecture

graph TD
    subgraph "High-Level Services"
        EVENT["event.h<br/>Event Loop"]
        TIMER["timer.h<br/>Monotonic Timer"]
        TASK["task.h<br/>N:M Task Model"]
        SOCKET["socket.h<br/>Async Socket"]
    end

    subgraph "Infrastructure"
        MEMORY["memory.h<br/>Ref-Counted Memory"]
        LOG["log.h<br/>Thread-Local Log"]
        BACKTRACE["backtrace.h<br/>Stack Backtrace"]
        ERROR["error.h<br/>Error Codes"]
        TIME["time.h<br/>Time Utilities"]
    end

    subgraph "Data Structures & Concurrency"
        HEAP["heap.h<br/>Min-Heap"]
        MPSC["mpsc.h<br/>Lock-Free MPSC Queue"]
        ATOMIC["atomic.h<br/>Atomic Operations"]
    end

    EVENT -->|"registers timers"| TIMER
    EVENT -->|"offloads work"| TASK
    EVENT -->|"wraps fd"| SOCKET
    SOCKET -->|"monitors I/O"| EVENT
    SOCKET -->|"idle timeout"| EVENT

    TIMER -->|"schedules entries"| HEAP
    TIMER -->|"poll-mode queue"| MPSC
    TIMER -->|"push-mode dispatch"| TASK
    TIMER -->|"reads clock"| TIME

    MPSC -->|"CAS operations"| ATOMIC
    MEMORY -->|"atomic refcount"| ATOMIC

    LOG -->|"fatal backtrace"| BACKTRACE
    LOG -->|"error formatting"| ERROR

    EVENT -->|"reads clock"| TIME

    style EVENT fill:#4a90d9,color:#fff
    style TIMER fill:#4a90d9,color:#fff
    style TASK fill:#4a90d9,color:#fff
    style SOCKET fill:#4a90d9,color:#fff
    style MEMORY fill:#50b86c,color:#fff
    style LOG fill:#50b86c,color:#fff
    style BACKTRACE fill:#50b86c,color:#fff
    style ERROR fill:#50b86c,color:#fff
    style TIME fill:#50b86c,color:#fff
    style HEAP fill:#f5a623,color:#fff
    style MPSC fill:#f5a623,color:#fff
    style ATOMIC fill:#f5a623,color:#fff

Sub-Module Overview

  • event.h (event.md) — Cross-platform event loop (edge-triggered) — kqueue / epoll / poll backends with built-in timer and thread-pool integration
  • timer.h (timer.md) — Monotonic timer with push (thread-pool) and poll (lock-free MPSC) fire modes
  • task.h (task.md) — N:M task model — lightweight tasks multiplexed onto a configurable thread pool
  • socket.h (socket.md) — Async socket abstraction with idle-timeout support over xEventLoop
  • memory.h (memory.md) — Reference-counted allocation with vtable-driven lifecycle (ctor/dtor/retain/release)
  • log.h (log.md) — Per-thread callback-based logging with optional backtrace on fatal
  • backtrace.h (backtrace.md) — Platform-adaptive stack trace capture (libunwind > execinfo > stub)
  • error.h (error.md) — Unified error codes (xErrno) and human-readable messages
  • heap.h (heap.md) — Generic min-heap with O(log n) insert/remove, used internally by the timer subsystem
  • mpsc.h (mpsc.md) — Lock-free multi-producer / single-consumer intrusive queue
  • atomic.h (atomic.md) — Compiler-portable atomic operations (GCC/Clang __atomic builtins)
  • io.h (io.md) — Abstract I/O interfaces (Reader, Writer, Seeker, Closer) with convenience helpers (xReadFull, xReadAll, xWritev, etc.)
  • time.h — Time utilities: xMonoMs() (monotonic) and xWallMs() (wall-clock) in milliseconds

How to Choose

  • React to I/O readiness on file descriptors → event.h — register fds and get edge-triggered callbacks
  • Schedule delayed or periodic work → timer.h — standalone timer, or use xEventLoopTimerAfter() for event-loop-integrated timers
  • Run CPU-bound work off the main thread → task.h — submit to a thread pool, optionally collect results
  • Manage non-blocking TCP/UDP connections → socket.h — wraps socket + event loop + idle timeout
  • Allocate objects with automatic cleanup → memory.h — XMALLOC(T) + xRetain/xRelease
  • Report errors from library internals → log.h — thread-local callback, or stderr fallback
  • Capture a stack trace for debugging → backtrace.h — xBacktrace() fills a buffer
  • Handle error codes uniformly → error.h — xErrno enum + xstrerror()
  • Build a priority queue → heap.h — generic min-heap with index tracking
  • Pass messages between threads lock-free → mpsc.h — intrusive MPSC queue
  • Perform atomic read-modify-write → atomic.h — macro wrappers over compiler builtins
  • Get current time in milliseconds → time.h — xMonoMs() for elapsed time, xWallMs() for wall-clock
  • Read/write through abstract I/O interfaces → io.h — xReader / xWriter + helpers like xReadFull, xReadAll

Quick Start

A minimal example that creates an event loop, schedules a one-shot timer, and runs until the timer fires:

#include <stdio.h>
#include <xbase/event.h>

static void on_timer(void *arg) {
    printf("Timer fired!\n");
    xEventLoopStop((xEventLoop)arg);
}

int main(void) {
    // Create an event loop
    xEventLoop loop = xEventLoopCreate();
    if (!loop) return 1;

    // Schedule a timer to fire after 1 second
    xEventLoopTimerAfter(loop, on_timer, loop, 1000);

    // Run the event loop (blocks until xEventLoopStop is called)
    xEventLoopRun(loop);

    // Clean up
    xEventLoopDestroy(loop);
    return 0;
}

Compile with:

gcc -o example example.c -I/path/to/xkit -lxbase -lpthread

Relationship with Other Modules

graph LR
    XBASE["xbase"]
    XBUF["xbuf"]
    XHTTP["xhttp"]
    XLOG["xlog"]

    XHTTP -->|"event loop + timer"| XBASE
    XHTTP -->|"I/O buffers"| XBUF
    XLOG -->|"event loop + MPSC queue"| XBASE
    XBUF -->|"atomic.h"| XBASE
    XNET["xnet"]
    XNET -->|"event loop + thread pool + atomic"| XBASE
    XHTTP -->|"URL + DNS + TLS config"| XNET

    style XBASE fill:#4a90d9,color:#fff
    style XBUF fill:#50b86c,color:#fff
    style XHTTP fill:#f5a623,color:#fff
    style XLOG fill:#e74c3c,color:#fff
    style XNET fill:#e74c3c,color:#fff
  • xbuf — Buffer module. xIOBuffer uses xbase's atomic.h for lock-free block pool management. xhttp uses both xbase and xbuf together.
  • xhttp — The async HTTP client is built on top of xbase's event loop (xEventLoop) and timer infrastructure, and uses xbuf for response buffering.
  • xnet — The networking primitives module. The async DNS resolver uses xbase's event loop for thread-pool offload (xEventLoopSubmit) and atomic.h for the cancellation flag.
  • xlog — The async logger uses xbase's event loop for timer-based flushing and the MPSC queue for lock-free log message passing from application threads to the logger thread.

event.h — Cross-Platform Event Loop

Introduction

event.h provides a cross-platform, edge-triggered event loop abstraction for I/O multiplexing. It unifies three OS-specific backends — kqueue (macOS/BSD), epoll (Linux), and poll (POSIX fallback) — behind a single API. The event loop is the central coordination point in xbase: it monitors file descriptors for readiness, dispatches timer callbacks, offloads CPU-bound work to thread pools, and watches for POSIX signals — all from a single thread.

Design Philosophy

  1. Edge-Triggered Everywhere — All three backends operate in edge-triggered mode. kqueue uses EV_CLEAR, epoll uses EPOLLET, and poll emulates edge-triggered behavior by clearing the event mask after each notification (requiring the caller to re-arm via xEventMod()). This design encourages callers to drain fds completely, reducing spurious wakeups.

  2. Backend Selection at Compile Time — The backend is chosen via preprocessor macros (XK_HAS_KQUEUE, XK_HAS_EPOLL), with poll as the universal fallback. This means zero runtime dispatch overhead.

  3. Integrated Timer Heap — Rather than requiring a separate timer facility, the event loop embeds a min-heap of timer entries. xEventWait() automatically adjusts its timeout to fire the earliest timer, providing sub-millisecond timer resolution without a dedicated timer thread.

  4. Thread-Pool OffloadxEventLoopSubmit() bridges the event loop and the task system: CPU-bound work runs on a worker thread, and the completion callback is dispatched on the event loop thread via a lock-free MPSC queue + wake pipe, ensuring single-threaded callback semantics.

  5. Self-Pipe Trick for Signals — On epoll and poll backends, signal delivery uses the self-pipe trick (a sigaction handler writes to a pipe) rather than signalfd, avoiding the fragile requirement of blocking signals in every thread. On kqueue, EVFILT_SIGNAL is used natively.

Architecture

graph TD
    subgraph "Event Loop (single thread)"
        WAIT["xEventWait()"]
        DISPATCH["Dispatch I/O callbacks"]
        TIMERS["Fire expired timers"]
        DONE["Drain done-queue"]
        SWEEP["Sweep deleted sources"]
    end

    subgraph "Backend (compile-time)"
        KQ["kqueue"]
        EP["epoll"]
        PO["poll"]
    end

    subgraph "Cross-Thread"
        WAKE["Wake Pipe"]
        MPSC_Q["MPSC Done Queue"]
        WORKER["Worker Thread Pool"]
    end

    WAIT --> KQ
    WAIT --> EP
    WAIT --> PO
    KQ --> DISPATCH
    EP --> DISPATCH
    PO --> DISPATCH
    DISPATCH --> TIMERS
    TIMERS --> DONE
    DONE --> SWEEP

    WORKER -->|"push result"| MPSC_Q
    MPSC_Q -->|"wake"| WAKE
    WAKE -->|"drain"| DONE

    style WAIT fill:#4a90d9,color:#fff
    style DISPATCH fill:#4a90d9,color:#fff
    style TIMERS fill:#f5a623,color:#fff
    style DONE fill:#50b86c,color:#fff

Event Loop Lifecycle

sequenceDiagram
    participant App
    participant EL as xEventLoop
    participant Backend as kqueue / epoll / poll
    participant Timer as Timer Heap

    App->>EL: xEventLoopCreate()
    App->>EL: xEventAdd(fd, mask, callback)
    App->>EL: xEventLoopTimerAfter(fn, 1000ms)
    App->>EL: xEventLoopRun()

    loop Main Loop
        EL->>Timer: Check earliest deadline
        Timer-->>EL: timeout = min(user_timeout, timer_deadline)
        EL->>Backend: wait(timeout)
        Backend-->>EL: ready events
        EL->>App: callback(fd, mask)
        EL->>Timer: Pop & fire expired timers
        EL->>EL: Sweep deleted sources
    end

    App->>EL: xEventLoopStop()
    App->>EL: xEventLoopDestroy()

Implementation Details

Backend Architecture

Each backend is implemented in a separate .c file that provides the full public API:

  • event_kqueue.c — kqueue — EV_CLEAR (native edge) — selected via #ifdef XK_HAS_KQUEUE
  • event_epoll.c — epoll — EPOLLET (native edge) — selected via #ifdef XK_HAS_EPOLL
  • event_poll.c — poll(2) — emulated edge (mask cleared after dispatch) — fallback

All backends share a common base structure (struct xEventLoop_) defined in event_private.h, which contains:

  • A dynamic source array with deferred deletion (sweep after dispatch)
  • A wake pipe (non-blocking) for cross-thread wakeup
  • A min-heap for builtin timers (protected by timer_mu mutex)
  • A lock-free MPSC done-queue for offload completion callbacks
  • Signal watch slots (up to XK_SIGNAL_MAX = 64)

Deferred Source Deletion

When xEventDel() is called during a callback dispatch, the source is marked deleted = 1 rather than freed immediately. After the dispatch batch completes, source_array_sweep() frees all deleted sources. This prevents use-after-free when multiple events reference the same source in a single xEventWait() call.

Wake Pipe

A non-blocking pipe (wake_rfd / wake_wfd) is registered with the backend. xEventWake() writes a single byte to the write end; the event loop drains the read end and processes the done-queue. Multiple wakes before the next xEventWait() are coalesced (EAGAIN on a full pipe is treated as success).

Timer Integration

Builtin timers are stored in a min-heap inside the event loop. Before each xEventWait() call, the effective timeout is clamped to the earliest timer deadline. After I/O dispatch, expired timers are popped and fired. Timer operations (xEventLoopTimerAfter, xEventLoopTimerAt, xEventLoopTimerCancel) are thread-safe, protected by timer_mu.

Signal Handling

  • kqueue — EVFILT_SIGNAL with EV_CLEAR — native kernel support
  • epoll — self-pipe trick: sigaction handler writes to a per-signal pipe
  • poll — self-pipe trick: same as epoll

The self-pipe approach avoids signalfd's requirement to block signals in all threads, which is fragile in the presence of third-party libraries and test frameworks.

API Reference

Types

  • xEventMask — Bitmask enum: xEvent_Read (1), xEvent_Write (2), xEvent_Timeout (4)
  • xEventFunc — void (*)(int fd, xEventMask mask, void *arg) — I/O callback
  • xEventTimerFunc — void (*)(void *arg) — Timer callback
  • xEventSignalFunc — void (*)(int signo, void *arg) — Signal callback
  • xEventDoneFunc — void (*)(void *arg, void *result) — Offload completion callback
  • xEventLoop — Opaque handle to an event loop
  • xEventSource — Opaque handle to a registered event source
  • xEventTimer — Opaque handle to a builtin timer

Functions

Lifecycle

  • xEventLoop xEventLoopCreate(void) — not thread-safe
  • xEventLoop xEventLoopCreateWithGroup(xTaskGroup group) — not thread-safe
  • void xEventLoopDestroy(xEventLoop loop) — not thread-safe
  • void xEventLoopRun(xEventLoop loop) — not thread-safe (call from one thread)
  • void xEventLoopStop(xEventLoop loop) — thread-safe

I/O Sources

  • xEventSource xEventAdd(xEventLoop loop, int fd, xEventMask mask, xEventFunc fn, void *arg) — not thread-safe
  • xErrno xEventMod(xEventLoop loop, xEventSource src, xEventMask mask) — not thread-safe
  • xErrno xEventDel(xEventLoop loop, xEventSource src) — not thread-safe
  • int xEventWait(xEventLoop loop, int timeout_ms) — not thread-safe

Timers

  • xEventTimer xEventLoopTimerAfter(xEventLoop loop, xEventTimerFunc fn, void *arg, uint64_t delay_ms) — thread-safe
  • xEventTimer xEventLoopTimerAt(xEventLoop loop, xEventTimerFunc fn, void *arg, uint64_t abs_ms) — thread-safe
  • xErrno xEventLoopTimerCancel(xEventLoop loop, xEventTimer timer) — thread-safe

Cross-Thread

  • xErrno xEventWake(xEventLoop loop) — thread-safe (signal-handler-safe)
  • xErrno xEventLoopSubmit(xEventLoop loop, xTaskGroup group, xTaskFunc work_fn, xEventDoneFunc done_fn, void *arg) — thread-safe

Signal

  • xErrno xEventLoopSignalWatch(xEventLoop loop, int signo, xEventSignalFunc fn, void *arg) — not thread-safe

Deprecated

  • uint64_t xEventLoopNowMs(void) — replaced by xMonoMs() from <xbase/time.h>

Usage Examples

Basic Event Loop with Timer

#include <stdio.h>
#include <xbase/event.h>

static void on_timer(void *arg) {
    printf("Timer fired!\n");
    xEventLoopStop((xEventLoop)arg);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    if (!loop) return 1;

    // Fire after 500ms
    xEventLoopTimerAfter(loop, on_timer, loop, 500);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

Monitoring a File Descriptor

#include <stdio.h>
#include <unistd.h>
#include <xbase/event.h>

static void on_readable(int fd, xEventMask mask, void *arg) {
    char buf[1024];
    ssize_t n;
    // Edge-triggered: drain until read() reports EAGAIN
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, stdout);
    }
    // n == 0 means EOF; n < 0 with errno == EAGAIN means fully drained
    (void)mask;
    (void)arg;
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Monitor stdin for readability
    xEventAdd(loop, STDIN_FILENO, xEvent_Read, on_readable, NULL);

    // Run for up to 10 seconds
    xEventLoopTimerAfter(loop, (xEventTimerFunc)xEventLoopStop, loop, 10000);
    xEventLoopRun(loop);

    xEventLoopDestroy(loop);
    return 0;
}

Offloading Work to a Thread Pool

#include <stdio.h>
#include <xbase/event.h>

static void *heavy_work(void *arg) {
    // Runs on a worker thread
    int *val = (int *)arg;
    *val *= 2;
    return val;
}

static void on_done(void *arg, void *result) {
    // Runs on the event loop thread
    int *val = (int *)result;
    printf("Result: %d\n", *val);
    (void)arg;
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    int value = 21;

    xEventLoopSubmit(loop, NULL, heavy_work, on_done, &value);

    // Run briefly to process the completion
    xEventLoopTimerAfter(loop, (xEventTimerFunc)xEventLoopStop, loop, 1000);
    xEventLoopRun(loop);

    xEventLoopDestroy(loop);
    return 0;
}

Use Cases

  1. Network Servers — Register listening sockets and accepted connections with the event loop. Use edge-triggered callbacks to read/write data without blocking. Combine with xSocket for idle-timeout support.

  2. Timer-Driven State Machines — Use xEventLoopTimerAfter() to schedule state transitions, retries, or heartbeat checks. The timer is integrated into the event loop, so no separate timer thread is needed.

  3. Hybrid I/O + CPU Workloads — Use xEventLoopSubmit() to offload CPU-intensive parsing or compression to a thread pool, then process results on the event loop thread where I/O state is safely accessible.

Best Practices

  • Always drain fds in edge-triggered mode. Read/write until EAGAIN in every callback. Missing data means you won't be notified again until new data arrives.
  • Never block in callbacks. The event loop is single-threaded; a blocking call stalls all I/O and timer processing. Offload heavy work via xEventLoopSubmit().
  • Use xEventLoopRun() for the main loop. It handles timer dispatch and stop-flag checking automatically. Only use xEventWait() directly if you need custom loop logic.
  • Cancel timers you no longer need. Uncancelled timers hold memory until they fire. Use xEventLoopTimerCancel() to free them early.
  • Be aware of the poll backend's edge emulation. On systems without kqueue or epoll, the poll backend clears the event mask after dispatch. You must call xEventMod() to re-arm.

Comparison with Other Libraries

  • Trigger Mode — xbase event.h: edge-triggered only · libevent: level (default), edge optional · libev: level + edge · libuv: level-triggered
  • Backends — xbase event.h: kqueue, epoll, poll · libevent: kqueue, epoll, poll, select, devpoll, IOCP · libev: kqueue, epoll, poll, select, port · libuv: kqueue, epoll, poll, IOCP
  • Timer Integration — xbase event.h: built-in min-heap · libevent: separate timer API · libev: built-in · libuv: built-in
  • Thread Pool — xbase event.h: built-in (xEventLoopSubmit) · libevent: none (external) · libev: none (external) · libuv: built-in (uv_queue_work)
  • Signal Handling — xbase event.h: self-pipe / EVFILT_SIGNAL · libevent: evsignal · libev: ev_signal · libuv: uv_signal
  • API Style — xbase event.h: opaque handles, C99 · libevent: struct-based, C89 · libev: struct-based, C89 · libuv: handle-based, C99
  • Binary Size — xbase event.h: ~15 KB · libevent: ~200 KB · libev: ~50 KB · libuv: ~500 KB
  • Dependencies — none, for all four
  • Windows Support — xbase event.h: not yet · libevent: yes (IOCP) · libev: yes (select) · libuv: yes (IOCP)
  • Design Goal — xbase event.h: minimal building block · libevent: full-featured framework · libev: minimal + performant · libuv: cross-platform framework

Key Differentiator: xbase's event loop is intentionally minimal — it provides the essential primitives (I/O, timers, signals, thread-pool offload) without buffered I/O, DNS resolution, or HTTP parsing. This makes it ideal as a foundation layer for higher-level libraries (like xhttp) rather than a standalone application framework.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2), kqueue backend. Source: xbase/event_bench.cpp

  • BM_EventLoop_CreateDestroy — 2,663 ns/op (CPU 2,663 ns) — 264,113 iterations
  • BM_EventLoop_WakeLatency — 854 ns/op (CPU 854 ns) — 814,901 iterations
  • BM_EventLoop_PipeAddDel — 1,107 ns/op (CPU 1,107 ns) — 627,088 iterations

Key Observations:

  • Create/Destroy takes ~2.7µs, reflecting the cost of kqueue fd creation and internal structure allocation. Acceptable for long-lived event loops.
  • Wake latency is ~854ns per wake+wait cycle, demonstrating efficient cross-thread notification via the internal wake mechanism.
  • Add/Del cycle (register + unregister a pipe fd) takes ~1.1µs, showing low overhead for dynamic fd management — important for short-lived connections.

timer.h — Monotonic Timer

Introduction

timer.h provides a standalone monotonic timer that schedules callbacks to fire after a delay or at an absolute time. It supports two fire modes — Push mode (dispatch to a thread pool) and Poll mode (enqueue to a lock-free MPSC queue for caller-driven execution) — making it suitable for both multi-threaded and single-threaded architectures.

Note: For timers integrated directly into an event loop, see xEventLoopTimerAfter() / xEventLoopTimerAt() in event.h. The standalone timer.h is useful when you need timers without an event loop, or when you want explicit control over which thread executes the callbacks.

Design Philosophy

  1. Dual Fire Modes — Push mode hands expired callbacks to a thread pool for concurrent execution; Poll mode queues them for the caller to drain synchronously. This lets latency-sensitive code (e.g., an event loop) avoid thread-switch overhead by polling, while background services can use push mode for simplicity.

  2. Dedicated Timer Thread — Each xTimer instance spawns one background thread that sleeps on a condition variable, waking only when the earliest deadline arrives or a new entry is submitted. This avoids busy-waiting and keeps CPU usage near zero when idle.

  3. Min-Heap for O(log n) Scheduling — Timer entries are stored in a min-heap ordered by deadline. Insert, cancel, and fire-next are all O(log n). The heap is provided by heap.h.

  4. Lock-Free Poll Queue — In poll mode, expired entries are pushed onto an intrusive MPSC queue (mpsc.h) without holding the mutex, minimizing contention between the timer thread and the polling thread.
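
The O(log n) insert/cancel claim in point 3 can be illustrated with a small self-contained sketch. This is not the heap.h API; all names here (Entry, heap_push, heap_remove) are hypothetical. The key idea is index tracking: the heap writes each entry's position back into the entry, so a cancel can find and remove it without a linear scan.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_IDX ((size_t)-1)

/* A timer entry: the heap stores pointers and records each entry's
   current position, so cancellation locates it in O(1) and removes
   it in O(log n). */
typedef struct {
    uint64_t deadline;
    size_t   idx;
} Entry;

typedef struct {
    Entry *slots[64];   /* fixed capacity keeps the sketch short */
    size_t len;
} Heap;

static void put(Heap *h, size_t i, Entry *e) { h->slots[i] = e; e->idx = i; }

static void sift_up(Heap *h, size_t i) {
    Entry *e = h->slots[i];
    while (i > 0) {
        size_t p = (i - 1) / 2;
        if (h->slots[p]->deadline <= e->deadline) break;
        put(h, i, h->slots[p]);   /* pull the larger parent down */
        i = p;
    }
    put(h, i, e);
}

static void sift_down(Heap *h, size_t i) {
    Entry *e = h->slots[i];
    for (;;) {
        size_t c = 2 * i + 1;
        if (c >= h->len) break;
        if (c + 1 < h->len && h->slots[c + 1]->deadline < h->slots[c]->deadline) c++;
        if (e->deadline <= h->slots[c]->deadline) break;
        put(h, i, h->slots[c]);   /* pull the smaller child up */
        i = c;
    }
    put(h, i, e);
}

static void heap_push(Heap *h, Entry *e) {
    put(h, h->len, e);
    h->len++;
    sift_up(h, h->len - 1);
}

/* Remove an entry at an arbitrary position: swap in the last element
   and restore the heap property in both directions. */
static void heap_remove(Heap *h, Entry *e) {
    size_t i = e->idx;
    e->idx = INVALID_IDX;
    Entry *last = h->slots[--h->len];
    if (last == e) return;
    put(h, i, last);
    sift_down(h, i);
    sift_up(h, i);
}
```

Fire-next is the same swap-and-sift applied to slot 0, which is why insert, cancel, and fire all stay logarithmic.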

Architecture

sequenceDiagram
    participant App
    participant Timer as xTimer
    participant Thread as Timer Thread
    participant Heap as Min-Heap
    participant Queue as MPSC Queue

    App->>Timer: xTimerCreate(group)
    Timer->>Thread: spawn

    App->>Timer: xTimerSubmitAfter(fn, 1000ms)
    Timer->>Heap: push(entry)
    Timer->>Thread: signal(cond)

    Thread->>Heap: peek → deadline
    Note over Thread: sleep until deadline

    Thread->>Heap: pop(entry)
    alt Push Mode
        Thread->>App: xTaskSubmit(fn)
    else Poll Mode
        Thread->>Queue: xMpscPush(entry)
        App->>Queue: xTimerPoll()
        Queue-->>App: callback(arg)
    end

Implementation Details

Internal Structure

struct xTimerTask_ {
    xMpsc        node;       // Intrusive MPSC node (poll mode)
    uint64_t     deadline;   // Absolute expiry time (CLOCK_MONOTONIC, ms)
    xTimerFunc   fn;         // User callback
    void        *arg;        // User argument
    size_t       heap_idx;   // Position in min-heap (TIMER_INVALID_IDX when not in heap)
    int          cancelled;  // Set to 1 under mutex before removal
};

struct xTimer_ {
    xHeap            heap;      // Min-heap ordered by deadline
    xTaskGroup       group;     // Non-NULL → push mode; NULL → poll mode
    xMpsc           *mq_head;   // Poll-mode MPSC queue head
    xMpsc           *mq_tail;   // Poll-mode MPSC queue tail
    pthread_t        thread;    // Background timer thread
    pthread_mutex_t  mu;        // Protects heap and stopped flag
    pthread_cond_t   cond;      // Wakes timer thread on new entry or stop
    int              stopped;   // Shutdown flag
};

Timer Thread Loop

The background thread follows this algorithm:

  1. Wait — If the heap is empty, block on pthread_cond_wait().
  2. Check top — Peek at the minimum-deadline entry.
  3. Fire or sleep — If deadline ≤ now, pop and fire. Otherwise, pthread_cond_timedwait() until the deadline or a new signal.
  4. Repeat until stopped is set.

When a new entry is submitted, pthread_cond_signal() wakes the thread so it can re-evaluate whether the new entry has an earlier deadline.

Push vs. Poll Mode

graph LR
    subgraph "Push Mode (group != NULL)"
        HEAP_P["Min-Heap"] -->|"pop expired"| FIRE_P["fire()"]
        FIRE_P -->|"xTaskSubmit"| POOL["Thread Pool"]
        POOL -->|"execute"| CB_P["callback(arg)"]
    end

    subgraph "Poll Mode (group == NULL)"
        HEAP_Q["Min-Heap"] -->|"pop expired"| FIRE_Q["fire()"]
        FIRE_Q -->|"xMpscPush"| MPSC["MPSC Queue"]
        MPSC -->|"xTimerPoll()"| CB_Q["callback(arg)"]
    end

    style POOL fill:#4a90d9,color:#fff
    style MPSC fill:#f5a623,color:#fff
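
The lock-free hand-off on the poll path can be sketched with C11 atomics. This sketch is a Treiber-style intrusive stack (LIFO) with a detach-all consumer, which is one common way to build such a structure; the real mpsc.h may differ (for example, by preserving FIFO order), and all names here are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Intrusive node, embedded in each entry the way mpsc.h's xMpsc is. */
typedef struct Node {
    struct Node *_Atomic next;
} Node;

typedef struct {
    Node *_Atomic head;
} Stack;

/* Any number of producers may push concurrently without a lock. */
static void stack_push(Stack *s, Node *n) {
    Node *old = atomic_load_explicit(&s->head, memory_order_relaxed);
    do {
        atomic_store_explicit(&n->next, old, memory_order_relaxed);
    } while (!atomic_compare_exchange_weak_explicit(
                 &s->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* The single consumer detaches the whole list in one atomic exchange,
   then walks it privately: this is the xTimerPoll() side. */
static Node *stack_drain(Stack *s) {
    return atomic_exchange_explicit(&s->head, NULL, memory_order_acquire);
}
```

The release/acquire pairing ensures the consumer sees each entry fully initialized before its callback runs, without the timer thread ever holding the mutex during the hand-off.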

Cancellation

xTimerCancel() acquires the mutex, checks if the entry is still in the heap (not already fired or cancelled), removes it via xHeapRemove(), marks it cancelled, and frees the memory. If the entry has already fired, xErrno_Cancelled is returned.

Memory Ownership

  • Push mode: The timer thread transfers ownership of the xTimerTask_ to the worker thread via xTaskSubmit(). The worker frees it after executing the callback.
  • Poll mode: The timer thread pushes the entry to the MPSC queue. xTimerPoll() pops and frees each entry after executing its callback.
  • Cancellation: The caller frees the entry immediately.
  • Destroy: Remaining heap entries and poll-queue entries are freed without firing.

API Reference

Types

| Type | Description |
|---|---|
| xTimerFunc | void (*)(void *arg) — Timer callback signature |
| xTimer | Opaque handle to a timer instance |
| xTimerTask | Opaque handle to a submitted timer entry |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xTimerCreate | xTimer xTimerCreate(xTaskGroup g) | Create a timer. g != NULL → push mode, g == NULL → poll mode. | Not thread-safe |
| xTimerDestroy | void xTimerDestroy(xTimer t) | Stop the timer thread and free all resources. Pending entries are discarded. | Not thread-safe |
| xTimerSubmitAfter | xTimerTask xTimerSubmitAfter(xTimer t, xTimerFunc fn, void *arg, uint64_t delay_ms) | Schedule a callback after a relative delay. | Thread-safe |
| xTimerSubmitAt | xTimerTask xTimerSubmitAt(xTimer t, xTimerFunc fn, void *arg, uint64_t abs_ms) | Schedule a callback at an absolute monotonic time. | Thread-safe |
| xTimerCancel | xErrno xTimerCancel(xTimer t, xTimerTask task) | Cancel a pending entry. Returns xErrno_Ok if cancelled, xErrno_Cancelled if already fired. | Thread-safe |
| xTimerPoll | int xTimerPoll(xTimer t) | Execute all due callbacks (poll mode only). Returns count. No-op in push mode. | Not thread-safe |
| xTimerNowMs | uint64_t xTimerNowMs(void) | Deprecated. Use xMonoMs() from <xbase/time.h>. | Thread-safe |

Usage Examples

Push Mode (Thread Pool Dispatch)

#include <stdio.h>
#include <xbase/timer.h>
#include <xbase/task.h>
#include <unistd.h>

static void on_timeout(void *arg) {
    printf("Timer fired on worker thread! arg=%p\n", arg);
}

int main(void) {
    xTaskGroup group = xTaskGroupCreate(NULL);
    xTimer timer = xTimerCreate(group);

    // Fire after 500ms on a worker thread
    xTimerSubmitAfter(timer, on_timeout, NULL, 500);

    sleep(1); // Wait for timer to fire

    xTimerDestroy(timer);
    xTaskGroupDestroy(group);
    return 0;
}

Poll Mode (Event Loop Integration)

#include <stdio.h>
#include <xbase/timer.h>
#include <xbase/time.h>

static void on_timeout(void *arg) {
    int *count = (int *)arg;
    printf("Timer #%d fired on caller thread\n", ++(*count));
}

int main(void) {
    xTimer timer = xTimerCreate(NULL); // Poll mode
    int count = 0;

    // Schedule 3 timers
    xTimerSubmitAfter(timer, on_timeout, &count, 100);
    xTimerSubmitAfter(timer, on_timeout, &count, 200);
    xTimerSubmitAfter(timer, on_timeout, &count, 300);

    // Poll loop
    uint64_t start = xMonoMs();
    while (xMonoMs() - start < 500) {
        int n = xTimerPoll(timer);
        if (n > 0) printf("  Polled %d timer(s)\n", n);
        usleep(10000); // 10ms
    }

    xTimerDestroy(timer);
    return 0;
}

Use Cases

  1. Event Loop Timer Backend — The event loop's built-in timers (xEventLoopTimerAfter) use the same min-heap approach internally. Use standalone xTimer when you need timers independent of an event loop.

  2. Retry / Backoff Logic — Schedule retries with exponential backoff using xTimerSubmitAfter(). Cancel pending retries with xTimerCancel() when a response arrives.

  3. Periodic Health Checks — In poll mode, integrate xTimerPoll() into your main loop to execute periodic health checks without spawning additional threads.
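
For the retry/backoff use case, the delay passed to xTimerSubmitAfter() is typically computed along these lines. The helper below is illustrative, not part of xKit:

```c
#include <stdint.h>

/* Exponential backoff with a ceiling: base, 2*base, 4*base, ... capped.
   The result would be the delay_ms argument to xTimerSubmitAfter(). */
static uint64_t backoff_ms(unsigned attempt, uint64_t base_ms, uint64_t cap_ms) {
    uint64_t d = base_ms;
    while (attempt-- > 0) {
        d *= 2;
        if (d >= cap_ms)
            return cap_ms;   /* clamp early; also stops runaway growth */
    }
    return d < cap_ms ? d : cap_ms;
}
```

With a 100 ms base and a 5 s cap, attempts 0 through 5 yield 100, 200, 400, 800, 1600, 3200 ms, and every later attempt waits 5000 ms; a successful response would cancel the pending retry with xTimerCancel().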

Best Practices

  • Choose the right mode. Use push mode when callbacks are independent and can run concurrently. Use poll mode when callbacks must run on a specific thread (e.g., the event loop thread) or when you want to avoid thread-switch latency.
  • Don't use the handle after fire or cancel. Once a timer entry fires or is cancelled, the memory is freed. Accessing the handle is undefined behavior.
  • Destroy before the task group. If using push mode, destroy the timer before destroying the task group to ensure all in-flight callbacks complete.
  • Prefer xEventLoopTimerAfter() when using an event loop. It avoids the overhead of a separate timer thread and integrates seamlessly with I/O dispatch.

Comparison with Other Libraries

| Feature | xbase timer.h | timerfd (Linux) | POSIX timer (timer_create) | libuv uv_timer |
|---|---|---|---|---|
| Platform | macOS + Linux | Linux only | POSIX (varies) | Cross-platform |
| Fire Mode | Push (thread pool) or Poll (MPSC) | fd-based (integrates with epoll) | Signal or thread | Event loop callback |
| Resolution | Millisecond (CLOCK_MONOTONIC) | Nanosecond | Nanosecond | Millisecond |
| Data Structure | Min-heap (O(log n)) | Kernel-managed | Kernel-managed | Min-heap |
| Thread Safety | Submit/Cancel are thread-safe | fd operations are thread-safe | Varies | Not thread-safe |
| Cancellation | O(log n) via heap index | timerfd_settime(0) | timer_delete() | uv_timer_stop() |
| Overhead | 1 background thread per xTimer | 1 fd per timer | 1 kernel timer per instance | Shared with event loop |
| Dependencies | heap.h, mpsc.h, task.h | Linux kernel | POSIX RT library | libuv |

Key Differentiator: xbase's timer provides a unique dual-mode design (push/poll) that lets you choose between concurrent execution and single-threaded polling without changing your callback code. The poll mode's lock-free MPSC queue makes it ideal for integration with custom event loops.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/timer_bench.cpp

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
| BM_Timer_SubmitCancel | — | 149 | 121 | — |
| BM_Timer_SubmitBatch | 10 | 1,811 | 1,687 | 5.9 M items/s |
| BM_Timer_SubmitBatch | 100 | 11,474 | 9,406 | 10.6 M items/s |
| BM_Timer_SubmitBatch | 1,000 | 110,112 | 86,699 | 11.5 M items/s |
| BM_Timer_FirePoll | 10 | 3,395 | 3,394 | 2.9 M items/s |
| BM_Timer_FirePoll | 100 | 16,897 | 15,534 | 6.4 M items/s |
| BM_Timer_FirePoll | 1,000 | 120,411 | 101,190 | 9.9 M items/s |

Key Observations:

  • Submit+Cancel cycle takes ~121ns CPU time, reflecting the cost of one heap push + one heap remove. Fast enough for high-frequency timer management.
  • Batch submit throughput improves with batch size (5.9M → 11.5M items/s), showing good amortization of per-operation overhead.
  • Fire+Poll is slower than submit alone because it includes the MPSC queue transfer and callback invocation. At N=1000, it still achieves ~10M timer fires/s.

task.h — N:M Task Model

Introduction

task.h provides a lightweight N:M concurrent task model where N user tasks are multiplexed onto M OS threads managed by a task group (thread pool). It supports lazy thread creation, configurable queue capacity, per-task result retrieval, and a global shared task group for convenience.

Design Philosophy

  1. Lazy Thread Spawning — Worker threads are created on-demand when tasks are submitted and no idle thread is available, up to the configured maximum. This avoids pre-allocating threads that may never be used, reducing resource consumption for bursty workloads.

  2. Simple Submit/Wait Model — Tasks are submitted with xTaskSubmit() and optionally awaited with xTaskWait(). This mirrors the future/promise pattern found in higher-level languages, but in pure C with minimal overhead.

  3. Configurable Capacity — The task group can be configured with a maximum thread count and queue capacity. When the queue is full, xTaskSubmit() returns NULL, giving the caller explicit backpressure.

  4. Global Shared Group — xTaskGroupGlobal() provides a lazily-initialized, process-wide task group with default settings (unlimited threads, no queue cap). It is registered for automatic destruction via atexit(), making it convenient for fire-and-forget usage.

Architecture

graph TD
    subgraph "Task Group"
        QUEUE["Task Queue (FIFO)"]
        W1["Worker Thread 1"]
        W2["Worker Thread 2"]
        WN["Worker Thread N"]
    end

    APP["Application"] -->|"xTaskSubmit()"| QUEUE
    QUEUE -->|"dequeue"| W1
    QUEUE -->|"dequeue"| W2
    QUEUE -->|"dequeue"| WN

    W1 -->|"done"| RESULT["xTaskWait() → result"]
    W2 -->|"done"| RESULT
    WN -->|"done"| RESULT

    style APP fill:#4a90d9,color:#fff
    style QUEUE fill:#f5a623,color:#fff
    style RESULT fill:#50b86c,color:#fff

Implementation Details

Internal Structure

struct xTask_ {
    xTaskFunc       fn;       // User function
    void           *arg;      // User argument
    pthread_mutex_t lock;     // Protects done/result
    pthread_cond_t  cond;     // Signals completion
    bool            done;     // Completion flag
    void           *result;   // Return value of fn
    struct xTask_  *next;     // Intrusive queue linkage
};

struct xTaskGroup_ {
    pthread_t      *workers;      // Dynamic array of worker threads
    size_t          max_threads;  // Upper bound (SIZE_MAX if unlimited)
    size_t          nthreads;     // Currently spawned threads
    pthread_mutex_t qlock;        // Protects the task queue
    pthread_cond_t  qcond;        // Wakes idle workers
    struct xTask_  *qhead, *qtail; // FIFO task queue
    size_t          qsize, qcap;  // Current size and capacity
    size_t          idle;         // Number of idle workers
    atomic_size_t   pending;      // Submitted - finished
    atomic_size_t   done_count;   // Tasks completed
    pthread_cond_t  wcond;        // Dedicated cond for xTaskGroupWait()
    bool            shutdown;     // Shutdown flag
};

Worker Loop

Each worker thread runs worker_loop():

  1. Acquire lock and increment idle count.
  2. Wait on qcond while the queue is empty and not shutting down.
  3. Dequeue one task, decrement idle.
  4. Execute task->fn(task->arg).
  5. Signal completion via pthread_cond_broadcast(&task->cond).
  6. Update counters — decrement pending, signal wcond if all tasks are done.
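
The loop above, together with the two-condvar completion path described below, can be condensed into a self-contained pthreads sketch. This is not the task.h implementation: lazy thread spawning, the queue cap, and the per-task done/result fields (step 5) are omitted, and all names (MiniGroup, pool_submit, ...) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct MiniTask {
    void (*fn)(void *);
    void *arg;
    struct MiniTask *next;      /* intrusive FIFO linkage */
} MiniTask;

typedef struct {
    pthread_mutex_t qlock;
    pthread_cond_t  qcond;      /* wakes idle workers */
    pthread_cond_t  wcond;      /* wakes group-wait callers */
    MiniTask *qhead, *qtail;
    size_t pending;             /* submitted - finished */
    bool shutdown;
} MiniGroup;

static void *worker_loop(void *p) {
    MiniGroup *g = p;
    pthread_mutex_lock(&g->qlock);
    for (;;) {
        while (!g->qhead && !g->shutdown)       /* steps 1-2: idle wait */
            pthread_cond_wait(&g->qcond, &g->qlock);
        if (!g->qhead)                          /* shutdown and queue drained */
            break;
        MiniTask *t = g->qhead;                 /* step 3: dequeue */
        g->qhead = t->next;
        if (!g->qhead) g->qtail = NULL;
        pthread_mutex_unlock(&g->qlock);
        t->fn(t->arg);                          /* step 4: run unlocked */
        pthread_mutex_lock(&g->qlock);
        if (--g->pending == 0)                  /* step 6: last task done, */
            pthread_cond_broadcast(&g->wcond);  /* wake group-wait callers */
    }
    pthread_mutex_unlock(&g->qlock);
    return NULL;
}

static void pool_submit(MiniGroup *g, MiniTask *t) {
    pthread_mutex_lock(&g->qlock);
    t->next = NULL;
    if (g->qtail) g->qtail->next = t; else g->qhead = t;
    g->qtail = t;
    g->pending++;
    pthread_cond_signal(&g->qcond);
    pthread_mutex_unlock(&g->qlock);
}

static void pool_group_wait(MiniGroup *g) {
    pthread_mutex_lock(&g->qlock);
    while (g->pending > 0)
        pthread_cond_wait(&g->wcond, &g->qlock);
    pthread_mutex_unlock(&g->qlock);
}

static _Atomic int pool_hits;
static void pool_task_fn(void *arg) { (void)arg; pool_hits++; }

/* Two workers, four tasks, a barrier wait, then shutdown. */
static int pool_demo(void) {
    MiniGroup g = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                    PTHREAD_COND_INITIALIZER, NULL, NULL, 0, false };
    pthread_t w[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&w[i], NULL, worker_loop, &g);
    MiniTask tasks[4];
    for (int i = 0; i < 4; i++) {
        tasks[i].fn = pool_task_fn;
        tasks[i].arg = NULL;
        pool_submit(&g, &tasks[i]);
    }
    pool_group_wait(&g);
    pthread_mutex_lock(&g.qlock);
    g.shutdown = true;
    pthread_cond_broadcast(&g.qcond);
    pthread_mutex_unlock(&g.qlock);
    for (int i = 0; i < 2; i++)
        pthread_join(w[i], NULL);
    return pool_hits;
}
```

Because pending is incremented under qlock before any worker can dequeue the task, a group-wait caller can never observe a spurious zero between submit and execution.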

Task Submission Flow

flowchart TD
    SUBMIT["xTaskSubmit(group, fn, arg)"]
    CHECK_CAP{"Queue full?"}
    ENQUEUE["Enqueue task"]
    CHECK_IDLE{"Idle workers > 0?"}
    SIGNAL["Signal qcond"]
    CHECK_MAX{"nthreads < max?"}
    SPAWN["Spawn new worker"]
    DONE["Return task handle"]
    FAIL["Return NULL"]

    SUBMIT --> CHECK_CAP
    CHECK_CAP -->|Yes| FAIL
    CHECK_CAP -->|No| ENQUEUE
    ENQUEUE --> CHECK_IDLE
    CHECK_IDLE -->|Yes| SIGNAL
    CHECK_IDLE -->|No| CHECK_MAX
    CHECK_MAX -->|Yes| SPAWN
    CHECK_MAX -->|No| DONE
    SPAWN --> SIGNAL
    SIGNAL --> DONE

    style SUBMIT fill:#4a90d9,color:#fff
    style FAIL fill:#e74c3c,color:#fff
    style DONE fill:#50b86c,color:#fff

Separate Wait Conditions

The implementation uses two separate condition variables:

  • qcond — Wakes idle workers when a new task arrives.
  • wcond — Wakes xTaskGroupWait() callers when all tasks complete.

Using a single condition variable caused lost wakeups: pthread_cond_signal() could wake an idle worker instead of the xTaskGroupWait() caller, leaving the waiter blocked forever.

Global Task Group

xTaskGroupGlobal() uses pthread_once for thread-safe lazy initialization. The group is registered with atexit() for automatic cleanup. It uses default configuration (unlimited threads, no queue cap).
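
The pthread_once + atexit() pattern described here looks roughly like this sketch. The names and the malloc stand-in are illustrative, not the actual implementation:

```c
#include <pthread.h>
#include <stdlib.h>

static void *global_group;            /* stands in for the xTaskGroup handle */
static pthread_once_t global_once = PTHREAD_ONCE_INIT;

static void global_cleanup(void) { free(global_group); }

static void global_init(void) {
    global_group = malloc(64);        /* real code: xTaskGroupCreate(NULL) */
    atexit(global_cleanup);           /* destroyed automatically at exit */
}

static void *task_group_global(void) {
    pthread_once(&global_once, global_init);  /* init runs exactly once */
    return global_group;
}
```

Every caller, from any thread, gets the same handle; pthread_once guarantees the initializer runs exactly once even under concurrent first calls.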

API Reference

Types

| Type | Description |
|---|---|
| xTaskFunc | void *(*)(void *arg) — Task function signature. Returns a result pointer. |
| xTask | Opaque handle to a submitted task |
| xTaskGroup | Opaque handle to a task group (thread pool) |
| xTaskGroupConf | Configuration struct: nthreads (0 = auto), queue_cap (0 = unbounded) |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xTaskGroupCreate | xTaskGroup xTaskGroupCreate(const xTaskGroupConf *conf) | Create a task group. NULL conf = defaults. | Not thread-safe |
| xTaskGroupDestroy | void xTaskGroupDestroy(xTaskGroup g) | Wait for pending tasks, then destroy. | Not thread-safe |
| xTaskSubmit | xTask xTaskSubmit(xTaskGroup g, xTaskFunc fn, void *arg) | Submit a task. Returns NULL if queue is full. | Thread-safe |
| xTaskWait | xErrno xTaskWait(xTask t, void **result) | Block until task completes. Frees the task handle. | Thread-safe |
| xTaskGroupWait | xErrno xTaskGroupWait(xTaskGroup g) | Block until all pending tasks complete. | Thread-safe |
| xTaskGroupThreads | size_t xTaskGroupThreads(xTaskGroup g) | Return number of spawned worker threads. | Thread-safe (atomic read) |
| xTaskGroupPending | size_t xTaskGroupPending(xTaskGroup g) | Return number of pending tasks. | Thread-safe (atomic read) |
| xTaskGroupGlobal | xTaskGroup xTaskGroupGlobal(void) | Get the global shared task group (lazy init). | Thread-safe |

Usage Examples

Basic Task Submission

#include <stdio.h>
#include <xbase/task.h>

static void *compute(void *arg) {
    int *val = (int *)arg;
    *val *= 2;
    return val;
}

int main(void) {
    xTaskGroup group = xTaskGroupCreate(NULL);

    int value = 21;
    xTask task = xTaskSubmit(group, compute, &value);

    void *result;
    xTaskWait(task, &result);
    printf("Result: %d\n", *(int *)result); // 42

    xTaskGroupDestroy(group);
    return 0;
}

Parallel Map

#include <stdio.h>
#include <xbase/task.h>

#define N 8

static void *square(void *arg) {
    int *val = (int *)arg;
    *val = (*val) * (*val);
    return val;
}

int main(void) {
    xTaskGroupConf conf = { .nthreads = 4, .queue_cap = 0 };
    xTaskGroup group = xTaskGroupCreate(&conf);

    int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    xTask tasks[N];

    for (int i = 0; i < N; i++)
        tasks[i] = xTaskSubmit(group, square, &data[i]);

    // Wait for all
    xTaskGroupWait(group);

    for (int i = 0; i < N; i++)
        printf("data[%d] = %d\n", i, data[i]);

    // Clean up task handles
    for (int i = 0; i < N; i++)
        xTaskWait(tasks[i], NULL);

    xTaskGroupDestroy(group);
    return 0;
}

Using the Global Task Group

#include <stdio.h>
#include <xbase/task.h>

static void *work(void *arg) {
    printf("Running on global pool: %s\n", (char *)arg);
    return NULL;
}

int main(void) {
    xTask t = xTaskSubmit(xTaskGroupGlobal(), work, "hello");
    xTaskWait(t, NULL);
    // No need to destroy the global group
    return 0;
}

Use Cases

  1. CPU-Bound Parallel Processing — Distribute computation across multiple cores. Use xTaskGroupWait() to synchronize at barriers.

  2. Event Loop Offload — The event loop's xEventLoopSubmit() uses xTaskGroup internally to run work functions on worker threads, then delivers results back to the loop thread.

  3. Background I/O — Offload blocking file I/O (e.g., fsync, large reads) to a thread pool to keep the main thread responsive.

Best Practices

  • Always call xTaskWait() or let xTaskGroupDestroy() clean up. Each xTaskSubmit() allocates a task struct with a mutex and condvar. xTaskWait() frees them. Leaking task handles leaks resources.
  • Set queue_cap for backpressure. Without a cap, unbounded submission can exhaust memory. A bounded queue lets you detect overload via NULL returns from xTaskSubmit().
  • Don't destroy the global group. xTaskGroupGlobal() is managed internally and destroyed at atexit(). Passing it to xTaskGroupDestroy() is undefined behavior.
  • Use xTaskGroupWait() for barriers, not busy-polling. It uses a dedicated condition variable and blocks efficiently.

Comparison with Other Libraries

| Feature | xbase task.h | pthread | C11 threads | GCD (libdispatch) |
|---|---|---|---|---|
| Abstraction | Task (submit/wait) | Thread (create/join) | Thread (create/join) | Block (dispatch_async) |
| Thread Management | Automatic (lazy spawn) | Manual | Manual | Automatic |
| Queue | Built-in FIFO with cap | N/A | N/A | Built-in (serial/concurrent) |
| Result Retrieval | xTaskWait(t, &result) | pthread_join(t, &result) | thrd_join(t, &result) | Completion handler |
| Group Wait | xTaskGroupWait() | Manual barrier | Manual barrier | dispatch_group_wait() |
| Backpressure | queue_cap → NULL on full | N/A | N/A | N/A (unbounded) |
| Global Pool | xTaskGroupGlobal() | N/A | N/A | dispatch_get_global_queue() |
| Platform | macOS + Linux | POSIX | C11 | macOS + Linux (via libdispatch) |
| Dependencies | pthread | OS | OS | OS / libdispatch |

Key Differentiator: xbase's task model provides a simple, portable thread pool with lazy spawning and explicit backpressure — features that require significant boilerplate with raw pthreads. Unlike GCD, it gives you direct control over thread count and queue capacity.

memory.h — Reference-Counted Memory Management

Introduction

memory.h provides a vtable-driven, reference-counted memory management system for C. It enables object lifecycle management (construction, destruction, retain, release, copy, move) through a virtual table pattern, bringing RAII-like semantics to pure C. The XMALLOC(T) macro allocates an object with an embedded header that tracks the reference count and vtable pointer.

Design Philosophy

  1. vtable-Driven Lifecycle — Each object type defines a static xVTable with optional function pointers for ctor, dtor, retain, release, copy, and move. This decouples lifecycle logic from the allocation mechanism, similar to C++ virtual destructors or Objective-C's class methods.

  2. Hidden Header Pattern — A Header struct is prepended to every allocation, storing the type name (for debugging), size, reference count, and vtable pointer. The user receives a pointer past the header, so the header is invisible to normal usage.

  3. Atomic Reference Counting — xRetain() and xRelease() use atomic operations (__ATOMIC_SEQ_CST) to safely manage reference counts across threads. When the count reaches zero, the destructor is called and memory is freed.

  4. Macro Convenience — XMALLOC(T) and XMALLOCEX(T, sz) generate the correct xAlloc() call with the type name string, size, and vtable pointer, reducing boilerplate.

Architecture

graph TD
    MACRO["XMALLOC(T) / XMALLOCEX(T, sz)"]
    ALLOC["xAlloc(name, size, count, vtab)"]
    HEADER["Header + Object"]
    RETAIN["xRetain(ptr)<br/>atomic refs++"]
    RELEASE["xRelease(ptr)<br/>atomic refs--"]
    FREE["xFree(ptr)<br/>dtor + free"]
    COPY["xCopy(ptr, other)"]
    MOVE["xMove(ptr, other)"]

    MACRO --> ALLOC
    ALLOC --> HEADER
    HEADER --> RETAIN
    HEADER --> RELEASE
    RELEASE -->|"refs == 0"| FREE
    HEADER --> COPY
    HEADER --> MOVE

    style MACRO fill:#4a90d9,color:#fff
    style RELEASE fill:#e74c3c,color:#fff
    style FREE fill:#e74c3c,color:#fff

Implementation Details

Memory Layout

graph LR
    subgraph "malloc'd block"
        HDR["Header<br/>name | size | refs | vtab"]
        OBJ["User Object<br/>(sizeof(T) bytes)"]
        EXTRA["Extra bytes<br/>(XMALLOCEX only)"]
    end

    PTR["xAlloc() returns →"] --> OBJ

    style HDR fill:#f5a623,color:#fff
    style OBJ fill:#4a90d9,color:#fff
    style EXTRA fill:#50b86c,color:#fff

The actual memory layout:

┌──────────────────────────────────────────────────────┐
│ Header (hidden)                                      │
│   const char *name   — type name string (e.g. "Foo") │
│   size_t      size   — sizeof(T)                     │
│   size_t      refs   — reference count (starts at 1) │
│   xVTable    *vtab   — pointer to static vtable      │
├──────────────────────────────────────────────────────┤
│ User Object (returned pointer)                       │
│   T fields...                                        │
│   [optional extra bytes from XMALLOCEX]              │
└──────────────────────────────────────────────────────┘

XMALLOC / XMALLOCEX Macro Expansion

// Given:
typedef struct Foo Foo;
struct Foo { int x; char buf[]; };

XDEF_VTABLE(Foo) { .ctor = FooCtor, .dtor = FooDtor };
XDEF_CTOR(Foo) { self->x = 0; }
XDEF_DTOR(Foo) { /* cleanup */ }

// XMALLOC(Foo) expands to:
(Foo *)xAlloc("Foo", sizeof(Foo), 1, &FooVTable)

// XMALLOCEX(Foo, 128) expands to:
(Foo *)xAlloc("Foo", sizeof(Foo) + 128, 1, &FooVTable)

Reference Count Lifecycle

sequenceDiagram
    participant App
    participant Alloc as xAlloc
    participant Header
    participant VTable

    App->>Alloc: XMALLOC(Foo)
    Alloc->>Header: malloc(sizeof(Header) + sizeof(Foo))
    Alloc->>Header: refs = 1
    Alloc->>VTable: vtab->ctor(ptr)
    Alloc-->>App: Foo *ptr

    App->>Header: xRetain(ptr) → refs = 2
    App->>Header: xRelease(ptr) → refs = 1
    App->>Header: xRelease(ptr) → refs = 0
    Header->>VTable: vtab->release(ptr)
    Header->>VTable: vtab->dtor(ptr)
    Header->>Header: free(hdr)

Thread Safety

  • xRetain() and xRelease() are thread-safe — they use xAtomicAdd / xAtomicSub with sequential consistency ordering.
  • xAlloc(), xFree(), xCopy(), and xMove() are not thread-safe — they should be called from a single owner or with external synchronization.
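
The retain/release path can be sketched with C11 atomics and a hidden header. Field names and exact layout here are illustrative, not the real memory.h header (which also carries a full vtable rather than a single destructor hook), though the four pointer-sized fields happen to total 32 bytes on LP64, matching the overhead noted in the comparison below.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hidden header placed before the user pointer. */
typedef struct {
    const char   *name;          /* type name, for debugging */
    size_t        size;          /* sizeof the user object */
    atomic_size_t refs;          /* atomic reference count */
    void        (*dtor)(void *); /* destructor hook (stand-in for the vtable) */
} Hdr;

static void *ref_alloc(const char *name, size_t size, void (*dtor)(void *)) {
    Hdr *h = malloc(sizeof(Hdr) + size);
    if (!h) return NULL;
    h->name = name;
    h->size = size;
    h->dtor = dtor;
    atomic_init(&h->refs, 1);    /* the caller owns one reference */
    void *obj = h + 1;           /* user pointer starts past the header */
    memset(obj, 0, size);        /* stand-in for the ctor */
    return obj;
}

static void ref_retain(void *obj) {
    Hdr *h = (Hdr *)obj - 1;     /* step back to the hidden header */
    atomic_fetch_add(&h->refs, 1);
}

static void ref_release(void *obj) {
    Hdr *h = (Hdr *)obj - 1;
    if (atomic_fetch_sub(&h->refs, 1) == 1) {  /* last reference dropped */
        if (h->dtor) h->dtor(obj);
        free(h);                 /* free the whole block, header included */
    }
}
```

This also makes the "don't mix with free()" rule below concrete: the pointer returned to the user is not the pointer that was malloc'd, so free(obj) would pass a mid-block address to the allocator.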

API Reference

Macros

| Macro | Expansion | Description |
|---|---|---|
| XDEF_VTABLE(T) | static xVTable TVTable = | Define a static vtable for type T |
| XDEF_CTOR(T) | static void TCtor(T *self) | Define a constructor for type T |
| XDEF_DTOR(T) | static void TDtor(T *self) | Define a destructor for type T |
| XMALLOC(T) | (T *)xAlloc("T", sizeof(T), 1, &TVTable) | Allocate one T with vtable |
| XMALLOCEX(T, sz) | (T *)xAlloc("T", sizeof(T) + sz, 1, &TVTable) | Allocate T + extra bytes |

Types

| Type | Description |
|---|---|
| xVTable | Struct with function pointers: ctor, dtor, retain, release, copy, move |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xAlloc | void *xAlloc(const char *name, size_t size, size_t count, xVTable *vtab) | Allocate object(s) with header and call ctor. | Not thread-safe |
| xFree | void xFree(void *ptr) | Call dtor and free. Ignores NULL. | Not thread-safe |
| xRetain | void xRetain(void *ptr) | Increment reference count atomically. Calls vtab->retain if set. | Thread-safe |
| xRelease | void xRelease(void *ptr) | Decrement reference count atomically. Calls vtab->release then xFree when refs reach 0. | Thread-safe |
| xCopy | void xCopy(void *ptr, void *other) | Call vtab->copy if set. | Not thread-safe |
| xMove | void xMove(void *ptr, void *other) | Call vtab->move if set. | Not thread-safe |

Usage Examples

Basic Object with Constructor/Destructor

#include <stdio.h>
#include <string.h>
#include <xbase/memory.h>

typedef struct Connection Connection;
struct Connection {
    int fd;
    char host[256];
};

XDEF_CTOR(Connection) {
    self->fd = -1;
    memset(self->host, 0, sizeof(self->host));
    printf("Connection created\n");
}

XDEF_DTOR(Connection) {
    if (self->fd >= 0) {
        // close(self->fd);
        printf("Connection closed (fd=%d)\n", self->fd);
    }
}

XDEF_VTABLE(Connection) {
    .ctor = ConnectionCtor,
    .dtor = ConnectionDtor,
};

int main(void) {
    Connection *conn = XMALLOC(Connection);
    conn->fd = 42;
    strcpy(conn->host, "example.com");

    xRetain(conn);   // refs = 2
    xRelease(conn);  // refs = 1
    xRelease(conn);  // refs = 0 → dtor called → freed

    return 0;
}

Flexible Array Member with XMALLOCEX

#include <stdio.h>
#include <string.h>
#include <xbase/memory.h>

typedef struct Buffer Buffer;
struct Buffer {
    size_t len;
    char   data[];  // flexible array member
};

XDEF_CTOR(Buffer) { self->len = 0; }
XDEF_DTOR(Buffer) { /* nothing to clean up */ }
XDEF_VTABLE(Buffer) { .ctor = BufferCtor, .dtor = BufferDtor };

int main(void) {
    // Allocate Buffer + 1024 extra bytes for data[]
    Buffer *buf = XMALLOCEX(Buffer, 1024);

    memcpy(buf->data, "Hello, xKit!", 12);
    buf->len = 12;

    printf("Buffer: %.*s\n", (int)buf->len, buf->data);

    xRelease(buf); // refs 1 → 0 → freed
    return 0;
}

Use Cases

  1. Shared Ownership — Multiple components hold references to the same object (e.g., a connection shared between a reader and a writer). xRetain/xRelease ensures the object is freed only when the last reference is dropped.

  2. Plugin/Extension Objects — Define vtables for different object types that share a common interface. The vtable pattern enables polymorphic behavior in C.

  3. Debug-Friendly Allocation — The name field in the header enables allocation tracking and leak detection by type name.

Best Practices

  • Always pair xRetain with xRelease. Every retain must have a corresponding release, or you'll leak memory.
  • Use XMALLOC instead of raw xAlloc. The macro handles type name, size, and vtable automatically.
  • Set unused vtable fields to NULL. The implementation checks for NULL before calling each vtable function.
  • Don't mix with free(). Objects allocated with xAlloc have a hidden header. Calling free() directly on the user pointer corrupts the heap.
  • Use XMALLOCEX for flexible array members. It adds extra bytes after the struct for variable-length data.

Comparison with Other Libraries

| Feature | xbase memory.h | C++ RAII | Objective-C ARC | GLib GObject |
|---|---|---|---|---|
| Mechanism | vtable + atomic refcount | Destructor + smart pointers | Compiler-inserted retain/release | GType + refcount |
| Automation | Manual retain/release | Automatic (scope-based) | Automatic (compiler) | Manual ref/unref |
| Thread Safety | Atomic refcount | shared_ptr is atomic | Atomic | Atomic |
| Polymorphism | vtable function pointers | Virtual functions | Method dispatch | Signal/slot + vtable |
| Overhead | 1 header per object (~32 bytes) | 0 (stack) or control block | 1 isa pointer + refcount | Large (GTypeInstance) |
| Flexible Arrays | XMALLOCEX(T, sz) | std::vector | NSMutableData | GArray |
| Debug Info | Type name in header | RTTI | Class name | GType name |
| Language | C99 | C++ | Objective-C | C (with macros) |

Key Differentiator: xbase's memory system brings reference-counted lifecycle management to C with minimal overhead — just a 32-byte header per object. The vtable pattern provides extensibility (custom ctor/dtor/copy/move) without requiring a complex type system like GObject.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/memory_bench.cpp

| Benchmark | Size (bytes) | Time (ns) | CPU (ns) | Iterations |
|---|---|---|---|---|
| BM_Memory_XAlloc | 16 | 23.3 | 23.3 | 29,809,940 |
| BM_Memory_XAlloc | 64 | 21.1 | 21.1 | 32,551,024 |
| BM_Memory_XAlloc | 256 | 22.4 | 22.4 | 31,207,508 |
| BM_Memory_XAlloc | 1,024 | 20.1 | 20.1 | 34,024,352 |
| BM_Memory_XAlloc | 4,096 | 24.2 | 24.2 | 29,002,681 |
| BM_Memory_Malloc | 16 | 17.5 | 17.5 | 39,883,995 |
| BM_Memory_Malloc | 64 | 18.7 | 18.7 | 37,576,831 |
| BM_Memory_Malloc | 256 | 19.0 | 19.0 | 34,505,536 |
| BM_Memory_Malloc | 1,024 | 23.0 | 23.0 | 30,557,144 |
| BM_Memory_Malloc | 4,096 | 17.7 | 17.7 | 39,849,483 |
| BM_Memory_RetainRelease | — | 3.90 | 3.90 | 183,068,277 |

Key Observations:

  • xAlloc vs malloc overhead is only ~3–5ns across all sizes. The extra cost covers header initialization, vtable setup, and constructor invocation — negligible for most workloads.
  • Retain/Release cycle takes ~3.9ns, dominated by the atomic increment/decrement. This is fast enough for hot-path reference counting.
  • Allocation time is nearly constant across sizes (16B–4KB), confirming that the overhead is in the header management, not the underlying malloc.

error.h — Unified Error Codes

Introduction

error.h defines a unified set of error codes (xErrno) used throughout xKit. Every function that can fail returns an xErrno value, providing a consistent error handling pattern across all modules. The companion function xstrerror() converts error codes to human-readable strings for logging and debugging.

Design Philosophy

  1. Single Error Enum — All xKit modules share one error code enum, avoiding the confusion of module-specific error types. This makes error handling uniform: check for xErrno_Ok everywhere.

  2. Descriptive Codes — Each error code maps to a specific failure category (invalid argument, out of memory, wrong state, etc.), giving callers enough information to decide how to handle the error without inspecting errno or platform-specific codes.

  3. Human-Readable Messages — xstrerror() returns a static string for each code, suitable for direct inclusion in log messages. It never returns NULL.

Architecture

graph LR
    MODULES["All xKit Modules"] -->|"return"| ERRNO["xErrno"]
    ERRNO -->|"xstrerror()"| MSG["Human-readable string"]
    MSG -->|"xLog()"| LOG["Log output"]

    style ERRNO fill:#4a90d9,color:#fff
    style MSG fill:#50b86c,color:#fff

Implementation Details

Error Code Values

The error codes are defined as an int-based enum (via XDEF_ENUM), starting from 0:

| Code | Value | Meaning |
|---|---|---|
| xErrno_Ok | 0 | Success |
| xErrno_Unknown | 1 | Unspecified error (legacy / catch-all) |
| xErrno_InvalidArg | 2 | NULL or invalid argument |
| xErrno_NoMemory | 3 | Memory allocation failed |
| xErrno_InvalidState | 4 | Object is in the wrong state for this call |
| xErrno_SysError | 5 | Underlying syscall / OS error |
| xErrno_NotFound | 6 | Requested item does not exist |
| xErrno_AlreadyExists | 7 | Item already registered / bound |
| xErrno_Cancelled | 8 | Operation was cancelled |
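
A strerror-style lookup over codes like these is typically a static string table indexed by the enum value. This sketch uses hypothetical names (Err, err_str), and the actual xstrerror() message strings may differ:

```c
#include <string.h>

typedef enum {
    Err_Ok = 0, Err_Unknown, Err_InvalidArg, Err_NoMemory,
    Err_InvalidState, Err_SysError, Err_NotFound,
    Err_AlreadyExists, Err_Cancelled
} Err;

static const char *err_str(Err e) {
    static const char *msgs[] = {
        "success", "unknown error", "NULL or invalid argument",
        "memory allocation failed", "invalid state",
        "underlying system error", "not found",
        "already exists", "operation cancelled",
    };
    if ((unsigned)e >= sizeof(msgs) / sizeof(msgs[0]))
        return "unrecognized error code";   /* never return NULL */
    return msgs[e];
}
```

Because the strings are static, the function is trivially thread-safe and the returned pointer never needs to be freed, which is the same contract xstrerror() documents below.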

Usage Pattern

The idiomatic xKit error handling pattern:

xErrno err = xSomeFunction(args);
if (err != xErrno_Ok) {
    xLog(false, "operation failed: %s", xstrerror(err));
    return err; // propagate
}

Internal Usage

xErrno is used by:

  • event.h — xEventMod(), xEventDel(), xEventWake(), xEventLoopTimerCancel(), xEventLoopSubmit(), xEventLoopSignalWatch()
  • timer.h — xTimerCancel()
  • task.h — xTaskWait(), xTaskGroupWait()
  • socket.h — xSocketSetMask(), xSocketSetTimeout()
  • heap.h — xHeapPush(), xHeapUpdate()

API Reference

Types

| Type | Description |
|---|---|
| xErrno | int-based enum of error codes |

Enum Values

| Value | Description |
|---|---|
| xErrno_Ok | Success |
| xErrno_Unknown | Unspecified error (legacy / catch-all) |
| xErrno_InvalidArg | NULL or invalid argument |
| xErrno_NoMemory | Memory allocation failed |
| xErrno_InvalidState | Object is in the wrong state for this call |
| xErrno_SysError | Underlying syscall / OS error |
| xErrno_NotFound | Requested item does not exist |
| xErrno_AlreadyExists | Item already registered / bound |
| xErrno_Cancelled | Operation was cancelled |

Functions

| Function | Signature | Description | Thread Safety |
| --- | --- | --- | --- |
| xstrerror | const char *xstrerror(xErrno err) | Return a human-readable error message. Never returns NULL. | Thread-safe (returns static strings) |

Usage Examples

Error Handling Pattern

#include <stdio.h>
#include <xbase/error.h>
#include <xbase/event.h>

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    if (!loop) {
        fprintf(stderr, "Failed to create event loop\n");
        return 1;
    }

    xErrno err = xEventMod(loop, NULL, xEvent_Read);
    if (err != xErrno_Ok) {
        fprintf(stderr, "xEventMod failed: %s\n", xstrerror(err));
        // Output: "xEventMod failed: NULL or invalid argument"
    }

    xEventLoopDestroy(loop);
    return 0;
}

Propagating Errors

#include <sys/socket.h>   // AF_INET, SOCK_STREAM
#include <xbase/error.h>
#include <xbase/socket.h>

xErrno setup_socket(xEventLoop loop, xSocket *out) {
    xSocket sock = xSocketCreate(loop, AF_INET, SOCK_STREAM, 0,
                                  xEvent_Read, my_callback, NULL);
    if (!sock) return xErrno_SysError;

    xErrno err = xSocketSetTimeout(sock, 5000, 0);
    if (err != xErrno_Ok) {
        xSocketDestroy(loop, sock);
        return err;
    }

    *out = sock;
    return xErrno_Ok;
}

Use Cases

  1. Uniform Error Propagation — Functions return xErrno and callers check against xErrno_Ok. This eliminates the need for module-specific error types.

  2. Logging and Diagnostics — xstrerror() provides instant human-readable messages for log output without maintaining separate message tables.

  3. Error Classification — Callers can switch on specific error codes to implement different recovery strategies (e.g., retry on xErrno_SysError, abort on xErrno_NoMemory).

Best Practices

  • Always check return values. Functions that return xErrno should be checked. Functions that return handles (pointers) should be checked for NULL.
  • Use xstrerror() in log messages. It's more informative than printing the raw integer.
  • Don't compare against raw integers. Always use the enum constants (xErrno_Ok, xErrno_InvalidArg, etc.) for readability and forward compatibility.
  • Prefer specific codes over xErrno_Unknown. When adding new error paths, choose the most specific applicable code.

Comparison with Other Libraries

| Feature | xbase error.h | POSIX errno | Windows HRESULT | GLib GError |
| --- | --- | --- | --- | --- |
| Type | int enum | int (thread-local) | LONG | Struct (domain + code + message) |
| Scope | Library-wide | System-wide | System-wide | Per-domain |
| String Conversion | xstrerror() | strerror() | FormatMessage() | g_error->message |
| Thread Safety | Return value (inherently safe) | Thread-local global | Return value | Heap-allocated |
| Extensibility | Add to enum | Platform-defined | Facility codes | Custom domains |
| Overhead | Zero (int return) | Zero (thread-local) | Zero (int return) | Heap allocation per error |

Key Differentiator: xbase's error system is intentionally simple — a single enum with descriptive codes and a string conversion function. It avoids the complexity of domain-based systems (GError) and the thread-local pitfalls of POSIX errno, while providing enough granularity for library-level error handling.

heap.h — Min-Heap

Introduction

heap.h provides a generic binary min-heap that stores opaque pointers and orders them via a user-supplied comparison function. Each element carries its heap index (maintained via a callback), enabling O(log n) removal and priority updates by index. It is the core data structure behind xbase's timer subsystem.

Design Philosophy

  1. Generic via Function Pointers — The heap stores void * elements and uses an xHeapCmpFunc for ordering. This makes it reusable for any element type without code generation or macros.

  2. Index Tracking — An xHeapSetIdxFunc callback notifies elements of their current position in the heap array. This enables O(1) lookup for xHeapRemove() and xHeapUpdate(), which would otherwise require O(n) search.

  3. Dynamic Array Backend — The heap uses a dynamically-growing array (2x expansion) starting from a default capacity of 16. This provides cache-friendly access patterns and amortized O(1) growth.

  4. No Element Ownership — The heap does not own the elements it stores. xHeapDestroy() frees the heap structure but NOT the elements. This gives the caller full control over element lifecycle.

Architecture

graph TD
    PUSH["xHeapPush(elem)"] --> APPEND["Append to data[size]"]
    APPEND --> SIFTUP["Sift Up"]
    SIFTUP --> NOTIFY["setidx(elem, new_idx)"]

    POP["xHeapPop()"] --> SWAP["Swap data[0] with data[size-1]"]
    SWAP --> SIFTDOWN["Sift Down from 0"]
    SIFTDOWN --> NOTIFY

    REMOVE["xHeapRemove(idx)"] --> SWAP2["Swap data[idx] with data[size-1]"]
    SWAP2 --> BOTH["Sift Up + Sift Down"]
    BOTH --> NOTIFY

    style PUSH fill:#4a90d9,color:#fff
    style POP fill:#f5a623,color:#fff
    style REMOVE fill:#e74c3c,color:#fff

Implementation Details

Data Structure

struct xHeap_ {
    void          **data;    // Dynamic array of element pointers
    size_t          size;    // Current number of elements
    size_t          cap;     // Allocated capacity
    xHeapCmpFunc    cmp;     // Comparison function
    xHeapSetIdxFunc setidx;  // Index notification callback
};

Array Layout

Index:  0     1     2     3     4     5     6
       [min] [  ] [  ] [  ] [  ] [  ] [  ]
        │     │    │
        │     ├────┤
        │     children of 0
        ├─────┤
        parent of 1,2

Parent of i:     (i - 1) / 2
Left child of i:  2 * i + 1
Right child of i: 2 * i + 2

Operations and Complexity

| Operation | Function | Time Complexity | Description |
| --- | --- | --- | --- |
| Insert | xHeapPush | O(log n) | Append to end, sift up |
| Peek min | xHeapPeek | O(1) | Return data[0] |
| Extract min | xHeapPop | O(log n) | Swap with last, sift down |
| Remove by index | xHeapRemove | O(log n) | Swap with last, sift up + down |
| Update priority | xHeapUpdate | O(log n) | Sift up + down at index |
| Size | xHeapSize | O(1) | Return size field |
| Grow | ensure_cap | Amortized O(1) | 2x realloc |

Sift Operations

  • Sift Up — Compare element with parent; swap if smaller. Repeat until heap property is restored or root is reached.
  • Sift Down — Compare element with children; swap with the smallest child if it's smaller. Repeat until heap property is restored or a leaf is reached.

Remove by Index

xHeapRemove(h, idx) replaces the element at idx with the last element, then applies both sift-up and sift-down. This handles both cases: the replacement may be smaller (needs to go up) or larger (needs to go down) than its new neighbors.

API Reference

Types

| Type | Description |
| --- | --- |
| xHeapCmpFunc | int (*)(const void *a, const void *b) — Returns negative if a < b, 0 if equal, positive if a > b |
| xHeapSetIdxFunc | void (*)(void *elem, size_t idx) — Called when an element's index changes |
| xHeap | Opaque handle to a min-heap |

Functions

| Function | Signature | Description | Thread Safety |
| --- | --- | --- | --- |
| xHeapCreate | xHeap xHeapCreate(xHeapCmpFunc cmp, xHeapSetIdxFunc setidx, size_t cap) | Create a heap. cap = 0 uses default (16). | Not thread-safe |
| xHeapDestroy | void xHeapDestroy(xHeap h) | Free the heap. Does NOT free elements. | Not thread-safe |
| xHeapPush | xErrno xHeapPush(xHeap h, void *elem) | Insert an element. O(log n). | Not thread-safe |
| xHeapPeek | void *xHeapPeek(xHeap h) | Return the minimum element without removing. O(1). | Not thread-safe |
| xHeapPop | void *xHeapPop(xHeap h) | Remove and return the minimum element. O(log n). | Not thread-safe |
| xHeapRemove | void *xHeapRemove(xHeap h, size_t idx) | Remove element at index. O(log n). | Not thread-safe |
| xHeapUpdate | xErrno xHeapUpdate(xHeap h, size_t idx) | Re-heapify after priority change. O(log n). | Not thread-safe |
| xHeapSize | size_t xHeapSize(xHeap h) | Return element count. O(1). | Not thread-safe |

Usage Examples

Timer-Style Priority Queue

#include <stdio.h>
#include <stdint.h>   // uint64_t
#include <xbase/heap.h>

typedef struct {
    uint64_t deadline;
    size_t   heap_idx;
    char     name[32];
} TimerEntry;

static int cmp_entry(const void *a, const void *b) {
    const TimerEntry *ea = (const TimerEntry *)a;
    const TimerEntry *eb = (const TimerEntry *)b;
    if (ea->deadline < eb->deadline) return -1;
    if (ea->deadline > eb->deadline) return  1;
    return 0;
}

static void set_idx(void *elem, size_t idx) {
    ((TimerEntry *)elem)->heap_idx = idx;
}

int main(void) {
    xHeap heap = xHeapCreate(cmp_entry, set_idx, 0);

    TimerEntry entries[] = {
        { .deadline = 300, .name = "C" },
        { .deadline = 100, .name = "A" },
        { .deadline = 200, .name = "B" },
    };

    for (int i = 0; i < 3; i++)
        xHeapPush(heap, &entries[i]);

    // Pop in order: A (100), B (200), C (300)
    while (xHeapSize(heap) > 0) {
        TimerEntry *e = (TimerEntry *)xHeapPop(heap);
        printf("%s (deadline=%llu)\n", e->name, (unsigned long long)e->deadline);
    }

    xHeapDestroy(heap);
    return 0;
}

Use Cases

  1. Timer Subsystem — timer.h uses the min-heap to order timer entries by deadline. The timer thread peeks at the minimum to determine how long to sleep, then pops expired entries.

  2. Event Loop Timers — The event loop's builtin timer heap (event.h) uses the same pattern to integrate timer dispatch with I/O polling.

  3. Custom Priority Queues — Any scenario requiring efficient insert/extract-min with O(log n) removal by index.

Best Practices

  • Always implement xHeapSetIdxFunc. Without index tracking, xHeapRemove() and xHeapUpdate() cannot locate elements efficiently.
  • Store the index in your element struct. The setidx callback should write the index into a field of your element (e.g., elem->heap_idx = idx).
  • Don't free elements while they're in the heap. Remove them first with xHeapRemove() or xHeapPop().
  • Use xHeapUpdate() after changing an element's priority. The heap doesn't detect priority changes automatically.

Comparison with Other Libraries

| Feature | xbase heap.h | C++ std::priority_queue | Linux kernel prio_heap | Go container/heap |
| --- | --- | --- | --- | --- |
| Element Type | void * (generic) | Template | Fixed struct | interface{} |
| Index Tracking | Built-in (setidx callback) | Not available | Not available | Manual (Fix method) |
| Remove by Index | O(log n) | Not supported | Not supported | O(log n) via Remove |
| Update Priority | O(log n) via xHeapUpdate | Not supported | Not supported | O(log n) via Fix |
| Ownership | No (caller owns elements) | Yes (copies/moves) | No | No |
| Thread Safety | Not thread-safe | Not thread-safe | Not thread-safe | Not thread-safe |

Key Differentiator: xbase's heap provides built-in index tracking via the setidx callback, enabling O(log n) removal and priority updates — features that std::priority_queue lacks entirely. This makes it ideal for timer implementations where cancellation is a common operation.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/heap_bench.cpp

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
| --- | --- | --- | --- | --- |
| BM_Heap_Push | 8 | 983 | 987 | 8.1 M items/s |
| BM_Heap_Push | 64 | 1,694 | 1,699 | 37.7 M items/s |
| BM_Heap_Push | 512 | 8,722 | 8,725 | 58.7 M items/s |
| BM_Heap_Push | 4,096 | 56,854 | 56,853 | 72.0 M items/s |
| BM_Heap_Pop | 8 | 1,020 | 1,024 | 7.8 M items/s |
| BM_Heap_Pop | 64 | 2,807 | 2,809 | 22.8 M items/s |
| BM_Heap_Pop | 512 | 26,334 | 26,337 | 19.4 M items/s |
| BM_Heap_Pop | 4,096 | 297,382 | 297,325 | 13.8 M items/s |
| BM_Heap_Remove | 8 | 1,015 | 1,020 | 7.8 M items/s |
| BM_Heap_Remove | 64 | 1,808 | 1,811 | 35.3 M items/s |
| BM_Heap_Remove | 512 | 8,914 | 8,903 | 57.5 M items/s |
| BM_Heap_Remove | 4,096 | 68,017 | 68,016 | 60.2 M items/s |

Key Observations:

  • Push throughput scales well with heap size — amortized cost per element decreases as batch size grows, reaching 72M items/s at N=4096.
  • Pop is more expensive than push at large N due to the sift-down operation traversing more levels. At N=4096, pop throughput drops to ~14M items/s.
  • Remove (random index removal) performs comparably to push, thanks to the O(log n) index-tracked removal. This validates the setidx callback design for timer cancellation workloads.

mpsc.h — Lock-Free MPSC Queue

Introduction

mpsc.h provides a lock-free, intrusive multi-producer single-consumer (MPSC) queue. Multiple threads can push nodes concurrently without locks, while a single consumer thread pops nodes. It is the backbone of xbase's poll-mode timer dispatch and the event loop's offload completion queue.

Design Philosophy

  1. Intrusive Design — Nodes embed an xMpsc struct directly, avoiding heap allocation per enqueue. This is critical for hot paths like timer expiry and offload completion where allocation overhead would be unacceptable.

  2. Lock-Free PushxMpscPush() uses a single atomic exchange (xAtomicXchg) on the tail pointer, making it wait-free for producers. No mutex, no CAS retry loop.

  3. Single-Consumer PopxMpscPop() is designed for exactly one consumer thread. It uses atomic loads and a single CAS for the edge case of popping the last element. This simplification avoids the ABA problem that plagues multi-consumer designs.

  4. Minimal Memory Ordering — The implementation uses xAtomicAcqRel for the exchange and xAtomicAcquire/xAtomicRelease for loads/stores, providing the minimum ordering needed for correctness without the overhead of sequential consistency.

Architecture

graph LR
    P1["Producer 1"] -->|"xMpscPush"| TAIL["tail"]
    P2["Producer 2"] -->|"xMpscPush"| TAIL
    P3["Producer 3"] -->|"xMpscPush"| TAIL

    HEAD["head"] -->|"xMpscPop"| C["Consumer"]

    subgraph "Queue"
        HEAD --> N1["Node 1"] --> N2["Node 2"] --> N3["Node 3"]
        N3 --- TAIL
    end

    style P1 fill:#4a90d9,color:#fff
    style P2 fill:#4a90d9,color:#fff
    style P3 fill:#4a90d9,color:#fff
    style C fill:#50b86c,color:#fff

Implementation Details

Data Structure

XDEF_STRUCT(xMpsc) {
    xMpsc *volatile next;  // Pointer to next node
};

The queue is represented by two external pointers:

  • head — Points to the oldest node (consumer reads from here)
  • tail — Points to the newest node (producers append here)

Push Algorithm

void xMpscPush(xMpsc **head, xMpsc **tail, xMpsc *node) {
    node->next = NULL;
    xMpsc *prev_tail = xAtomicXchg(tail, node, xAtomicAcqRel);
    if (prev_tail)
        prev_tail->next = node;  // Link to previous tail
    else
        xAtomicStore(head, node, xAtomicRelease);  // First node
}

The key insight: xAtomicXchg atomically replaces the tail and returns the old value. If the old tail was non-NULL, we link it to the new node. If it was NULL (empty queue), we also update the head.

Pop Algorithm

The pop operation handles three cases:

  1. Empty queue — head is NULL, return NULL.
  2. Multiple nodes — Advance head to head->next, return old head.
  3. Single node — CAS tail to NULL. If CAS succeeds, also CAS head to NULL. If CAS fails (concurrent push in progress), spin until head->next becomes non-NULL.

flowchart TD
    START["xMpscPop()"]
    CHECK_HEAD{"head == NULL?"}
    EMPTY["Return NULL"]
    CHECK_NEXT{"head->next == NULL?"}
    MULTI["Advance head<br/>Return old head"]
    CAS_TAIL{"CAS tail → NULL?"}
    CAS_HEAD["CAS head → NULL<br/>Return old head"]
    SPIN["Spin until head->next != NULL"]
    ADVANCE["Advance head<br/>Return old head"]

    START --> CHECK_HEAD
    CHECK_HEAD -->|Yes| EMPTY
    CHECK_HEAD -->|No| CHECK_NEXT
    CHECK_NEXT -->|No| MULTI
    CHECK_NEXT -->|Yes| CAS_TAIL
    CAS_TAIL -->|Success| CAS_HEAD
    CAS_TAIL -->|Fail: concurrent push| SPIN
    SPIN --> ADVANCE

    style EMPTY fill:#e74c3c,color:#fff
    style MULTI fill:#50b86c,color:#fff
    style CAS_HEAD fill:#50b86c,color:#fff
    style ADVANCE fill:#50b86c,color:#fff

Memory Ordering Analysis

| Operation | Ordering | Reason |
| --- | --- | --- |
| xAtomicXchg(tail, node) | AcqRel | Acquire: see previous tail's next field. Release: make node visible to consumer. |
| xAtomicStore(head, node) | Release | Make the new head visible to the consumer. |
| xAtomicLoad(head) | Acquire | See the node written by the producer. |
| xAtomicLoad(&head->next) | Acquire | See the next pointer written by the producer. |
| xAtomicCasStrong(tail, ...) | Release | Publish the NULL tail to concurrent pushers. |

Thread Safety

  • xMpscPush() — Thread-safe (multiple producers).
  • xMpscPop() — Single-consumer only. Must not be called concurrently.
  • xMpscEmpty() — Thread-safe (atomic load).

API Reference

Types

| Type | Description |
| --- | --- |
| xMpsc | Intrusive queue node. Embed in your struct and use xContainerOf() to recover the enclosing struct. |

Functions

| Function | Signature | Description | Thread Safety |
| --- | --- | --- | --- |
| xMpscPush | void xMpscPush(xMpsc **head, xMpsc **tail, xMpsc *node) | Push a node. Wait-free for producers. | Thread-safe (multi-producer) |
| xMpscPop | xMpsc *xMpscPop(xMpsc **head, xMpsc **tail) | Pop the oldest node. Returns NULL if empty. | Single-consumer only |
| xMpscEmpty | bool xMpscEmpty(xMpsc **head) | Check if the queue is empty. | Thread-safe |

Usage Examples

Basic Producer-Consumer

#include <stdio.h>
#include <pthread.h>
#include <xbase/mpsc.h>
#include <xbase/base.h>

typedef struct {
    xMpsc node;   // Must embed xMpsc
    int   value;
} Message;

static xMpsc *g_head = NULL;
static xMpsc *g_tail = NULL;

static void *producer(void *arg) {
    Message *msg = (Message *)arg;
    xMpscPush(&g_head, &g_tail, &msg->node);
    return NULL;
}

int main(void) {
    Message msgs[] = {
        { .value = 1 },
        { .value = 2 },
        { .value = 3 },
    };

    // Push from multiple threads
    pthread_t threads[3];
    for (int i = 0; i < 3; i++)
        pthread_create(&threads[i], NULL, producer, &msgs[i]);
    for (int i = 0; i < 3; i++)
        pthread_join(threads[i], NULL);

    // Pop from single consumer
    xMpsc *node;
    while ((node = xMpscPop(&g_head, &g_tail)) != NULL) {
        Message *msg = xContainerOf(node, Message, node);
        printf("Received: %d\n", msg->value);
    }

    return 0;
}

Use Cases

  1. Timer Poll Mode — timer.h uses the MPSC queue in poll mode to pass expired timer entries from the timer thread to the polling thread without locks.

  2. Event Loop Offload — The event loop's offload mechanism (event.h) uses an MPSC queue to deliver completed work items from worker threads to the event loop thread.

  3. xlog Async Logger — logger.h uses the MPSC queue to pass log messages from application threads to the logger's flush thread.

Best Practices

  • Embed xMpsc in your struct. Don't allocate xMpsc nodes separately. Use xContainerOf() to recover the enclosing struct after popping.
  • Initialize head and tail to NULL. An empty queue has both pointers set to NULL.
  • Only one thread may call xMpscPop(). The single-consumer constraint is fundamental to the algorithm's correctness. Violating it causes data races.
  • Don't access a node after pushing it. Once pushed, the node is owned by the queue until popped.

Comparison with Other Libraries

| Feature | xbase mpsc.h | Dmitry Vyukov MPSC | concurrentqueue (C++) | Linux llist |
| --- | --- | --- | --- | --- |
| Design | Intrusive, lock-free | Intrusive, lock-free | Non-intrusive, lock-free | Intrusive, lock-free |
| Push | Wait-free (1 atomic xchg) | Wait-free (1 atomic xchg) | Lock-free (CAS loop) | Wait-free (1 atomic xchg) |
| Pop | Lock-free (single consumer) | Lock-free (single consumer) | Lock-free (multi-consumer) | Batch pop (splice) |
| Memory Ordering | AcqRel / Acquire / Release | SeqCst | Relaxed + fences | Varies |
| Allocation | None (intrusive) | None (intrusive) | Per-element (internal) | None (intrusive) |
| Multi-Consumer | No | No | Yes | No (batch only) |
| Language | C99 | C/C++ | C++11 | C (kernel) |

Key Differentiator: xbase's MPSC queue is minimal and intrusive — zero allocation overhead, wait-free push, and carefully chosen memory orderings. It's designed specifically for the single-consumer patterns found in event loops and timer systems.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbase/mpsc_bench.cpp

| Benchmark | Time (ns) | CPU (ns) | Iterations | Throughput |
| --- | --- | --- | --- | --- |
| BM_Mpsc_SingleProducer | 3,712 | 3,712 | 187,897 | 275.9 M items/s |
| BM_Mpsc_MultiProducer/2 | 609,432 | 87,797 | 8,075 | 227.8 M items/s |
| BM_Mpsc_MultiProducer/4 | 1,327,965 | 148,356 | 4,768 | 269.6 M items/s |
| BM_Mpsc_MultiProducer/8 | 4,466,805 | 292,260 | 1,000 | 273.7 M items/s |

Key Observations:

  • Single-producer push/pop achieves ~276M items/s, demonstrating the minimal overhead of the lock-free algorithm.
  • Multi-producer scaling maintains ~270M items/s aggregate throughput even with 8 concurrent producers, showing excellent scalability. The wall-clock time increases due to thread synchronization overhead, but per-CPU throughput remains stable.
  • The gap between wall-clock time and CPU time in multi-producer benchmarks reflects the cost of thread creation and barrier synchronization, not the queue operations themselves.

atomic.h — Atomic Operations

Introduction

atomic.h provides a set of macro wrappers over GCC/Clang __atomic builtins, offering portable atomic operations with explicit memory ordering. These macros are used throughout xbase for reference counting (memory.h), lock-free queues (mpsc.h), and event loop internals (event.h).

Design Philosophy

  1. Thin Macro Wrappers — Each macro maps directly to a compiler builtin with zero overhead. No abstraction layers, no runtime dispatch.

  2. Explicit Memory Ordering — Every atomic operation requires an explicit memory order parameter (xAtomicAcquire, xAtomicRelease, etc.), forcing the programmer to think about ordering requirements rather than defaulting to the expensive SeqCst.

  3. GCC/Clang Builtins — The __atomic builtins are supported by GCC ≥ 4.7 and all versions of Clang. They generate optimal instructions for each target architecture (x86: lock prefix, ARM: ldrex/strex or LSE atomics).

Architecture

graph TD
    subgraph "xbase Atomic Users"
        MEMORY["memory.h<br/>xRetain / xRelease<br/>(SeqCst refcount)"]
        MPSC["mpsc.h<br/>xMpscPush / xMpscPop<br/>(AcqRel / Acquire / Release)"]
        EVENT["event_private.h<br/>inflight counter<br/>(Relaxed)"]
        TASK["task.c<br/>pending / done_count<br/>(stdatomic)"]
    end

    subgraph "atomic.h Macros"
        LOAD["xAtomicLoad"]
        STORE["xAtomicStore"]
        XCHG["xAtomicXchg"]
        CAS["xAtomicCas*"]
        ADD["xAtomicAdd/Sub"]
        FETCH["xAtomicFetch*"]
    end

    MEMORY --> ADD
    MPSC --> XCHG
    MPSC --> LOAD
    MPSC --> STORE
    MPSC --> CAS
    EVENT --> FETCH

    style MEMORY fill:#4a90d9,color:#fff
    style MPSC fill:#f5a623,color:#fff
    style EVENT fill:#50b86c,color:#fff

Implementation Details

Memory Order Constants

| Macro | Value | Meaning |
| --- | --- | --- |
| xAtomicRelaxed | __ATOMIC_RELAXED | No ordering constraints. Only guarantees atomicity. |
| xAtomicConsume | __ATOMIC_CONSUME | Data-dependent ordering (rarely used in practice). |
| xAtomicAcquire | __ATOMIC_ACQUIRE | Prevents reads/writes from being reordered before this operation. |
| xAtomicRelease | __ATOMIC_RELEASE | Prevents reads/writes from being reordered after this operation. |
| xAtomicAcqRel | __ATOMIC_ACQ_REL | Combines Acquire and Release. |
| xAtomicSeqCst | __ATOMIC_SEQ_CST | Full sequential consistency. Most expensive. |

Operation Macros

Load / Store

| Macro | Expansion | Description |
| --- | --- | --- |
| xAtomicLoad(p, o) | __atomic_load_n(p, o) | Atomically read *p |
| xAtomicStore(p, v, o) | __atomic_store_n(p, v, o) | Atomically write v to *p |

Exchange / CAS

| Macro | Expansion | Description |
| --- | --- | --- |
| xAtomicXchg(p, v, o) | __atomic_exchange_n(p, v, o) | Atomically swap *p with v, return old value |
| xAtomicCasWeak(p, e, d, o) | __atomic_compare_exchange_n(p, e, d, true, o, Relaxed) | Weak CAS (may spuriously fail) |
| xAtomicCasStrong(p, e, d, o) | __atomic_compare_exchange_n(p, e, d, false, o, Relaxed) | Strong CAS (no spurious failure) |

Note: Both CAS macros use xAtomicRelaxed as the failure ordering. The success ordering is specified by the o parameter.

Arithmetic

| Macro | Expansion | Returns |
| --- | --- | --- |
| xAtomicAdd(p, v, o) | __atomic_add_fetch(p, v, o) | New value (*p + v) |
| xAtomicSub(p, v, o) | __atomic_sub_fetch(p, v, o) | New value (*p - v) |
| xAtomicFetchAdd(p, v, o) | __atomic_fetch_add(p, v, o) | Old value (before add) |
| xAtomicFetchSub(p, v, o) | __atomic_fetch_sub(p, v, o) | Old value (before sub) |

Bitwise

| Macro | Expansion | Returns |
| --- | --- | --- |
| xAtomicAnd(p, v, o) | __atomic_and_fetch(p, v, o) | New value |
| xAtomicOr(p, v, o) | __atomic_or_fetch(p, v, o) | New value |
| xAtomicXor(p, v, o) | __atomic_xor_fetch(p, v, o) | New value |
| xAtomicNand(p, v, o) | __atomic_nand_fetch(p, v, o) | New value |
| xAtomicFetchAnd(p, v, o) | __atomic_fetch_and(p, v, o) | Old value |
| xAtomicFetchOr(p, v, o) | __atomic_fetch_or(p, v, o) | Old value |
| xAtomicFetchXor(p, v, o) | __atomic_fetch_xor(p, v, o) | Old value |

API Reference

See the Operation Macros section above for the complete list. All macros are defined in <xbase/atomic.h> and require no function calls — they expand directly to compiler builtins.

Usage Examples

Atomic Counter

#include <stdio.h>
#include <pthread.h>
#include <xbase/atomic.h>

static int g_counter = 0;

static void *increment(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        xAtomicAdd(&g_counter, 1, xAtomicRelaxed);
    }
    return NULL;
}

int main(void) {
    pthread_t threads[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, increment, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);

    printf("Counter: %d\n", xAtomicLoad(&g_counter, xAtomicRelaxed));
    // Output: Counter: 400000
    return 0;
}

Spinlock (Educational)

#include <xbase/atomic.h>

typedef struct { int locked; } Spinlock;

static inline void spin_lock(Spinlock *s) {
    while (xAtomicXchg(&s->locked, 1, xAtomicAcquire) != 0) {
        // Spin
    }
}

static inline void spin_unlock(Spinlock *s) {
    xAtomicStore(&s->locked, 0, xAtomicRelease);
}

Use Cases

  1. Reference Counting — memory.h uses xAtomicAdd/xAtomicSub with SeqCst ordering for thread-safe reference count management.

  2. Lock-Free Data Structuresmpsc.h uses xAtomicXchg for wait-free push and xAtomicCasStrong for the single-element pop edge case.

  3. Event Loop Internals — The event loop uses xAtomicFetchAdd/xAtomicFetchSub with Relaxed ordering to track in-flight offload workers.

Best Practices

  • Use the weakest sufficient ordering. Relaxed for simple counters, Acquire/Release for producer-consumer patterns, SeqCst only when you need a total order visible to all threads.
  • Prefer xAtomicCasStrong over xAtomicCasWeak unless you're in a retry loop where spurious failures are acceptable (e.g., lock-free stack push).
  • Note the CAS failure ordering. Both CAS macros hardcode xAtomicRelaxed as the failure ordering. If you need stronger failure ordering, use the raw xAtomicCas macro directly.
  • Don't mix with C11 <stdatomic.h>. While both use the same underlying compiler builtins, mixing the two styles in the same translation unit can be confusing. xbase uses <stdatomic.h> in task.c for atomic_size_t but atomic.h macros everywhere else.

Comparison with Other Libraries

| Feature | xbase atomic.h | C11 <stdatomic.h> | C++ <atomic> | Linux kernel atomics |
| --- | --- | --- | --- | --- |
| Style | Macros over __atomic builtins | Language-level types | Template class | Inline functions + asm |
| Memory Order | Explicit parameter | Explicit parameter | Explicit parameter | Implicit (varies) |
| Types | Any scalar (via pointer) | _Atomic qualified types | std::atomic<T> | atomic_t, atomic64_t |
| CAS | xAtomicCasWeak/Strong | atomic_compare_exchange_* | compare_exchange_* | cmpxchg |
| Compiler | GCC ≥ 4.7, Clang | C11 | C++11 | GCC (kernel) |
| Portability | GCC/Clang only | Standard C11 | Standard C++11 | Linux kernel only |

Key Differentiator: xbase's atomic macros are the thinnest possible wrapper — they add naming consistency (xAtomic* prefix) and explicit ordering parameters without any abstraction overhead. They work with any scalar type via pointer, unlike C11's _Atomic qualifier which requires type annotations.

log.h — Thread-Local Log Callback

Introduction

log.h provides a per-thread, callback-based logging mechanism for xKit's internal error reporting. Each thread can register its own log callback via xLogSetCallback(); when xLog() is called, the formatted message is dispatched to that callback. If no callback is registered, messages fall back to stderr. On fatal errors, a stack backtrace is captured and abort() is called.

Design Philosophy

  1. Thread-Local Callbacks — Each thread has its own log callback and userdata, stored in __thread (thread-local storage). This avoids global locks and allows different threads to route log messages to different destinations (e.g., the xlog async logger, a test harness, or a custom handler).

  2. Minimal and Non-AllocatingxLog() formats into a fixed-size thread-local buffer (XLOG_BUF_SIZE, default 512 bytes). No heap allocation occurs during logging, making it safe to call from low-level code paths.

  3. Fatal with Backtrace — When fatal = true, xLog() captures a stack trace via xBacktrace() before calling abort(). This provides immediate diagnostic information for unrecoverable errors.

  4. Bridge to xlog — The callback mechanism is designed to integrate with the higher-level xlog module. The xlog logger registers itself as the thread's log callback, so internal xKit errors are automatically routed through the async logging pipeline.

Architecture

graph TD
    subgraph "Thread 1"
        LOG1["xLog()"] --> CB1["Custom Callback"]
    end

    subgraph "Thread 2"
        LOG2["xLog()"] --> CB2["xlog Logger"]
    end

    subgraph "Thread 3 (no callback)"
        LOG3["xLog()"] --> STDERR["stderr"]
    end

    CB1 --> FILE["Log File"]
    CB2 --> XLOG["Async Logger Pipeline"]

    style LOG1 fill:#4a90d9,color:#fff
    style LOG2 fill:#4a90d9,color:#fff
    style LOG3 fill:#4a90d9,color:#fff

Implementation Details

Thread-Local State

XDEF_STRUCT(xLogCtx) {
    xLogCallback cb;        // User callback (NULL = stderr fallback)
    void        *userdata;  // Forwarded to callback
    char         buf[XLOG_BUF_SIZE];   // Format buffer (512 bytes)
    char         bt[XLOG_BT_SIZE];     // Backtrace buffer (2048 bytes)
};

static __thread xLogCtx tl_ctx;

Each thread gets ~2.5 KB of thread-local storage for logging. The buffers are reused across calls, so there's no allocation overhead.

xLog() Flow

flowchart TD
    CALL["xLog(fatal, fmt, ...)"]
    FMT["vsnprintf → tl_ctx.buf"]
    CHECK_FATAL{"fatal?"}
    BT["xBacktraceSkip(2, bt, size)"]
    CHECK_CB{"callback set?"}
    CB["cb(msg, backtrace, userdata)"]
    STDERR["fprintf(stderr, msg)"]
    ABORT["abort()"]

    CALL --> FMT
    FMT --> CHECK_FATAL
    CHECK_FATAL -->|Yes| BT
    CHECK_FATAL -->|No| CHECK_CB
    BT --> CHECK_CB
    CHECK_CB -->|Yes| CB
    CHECK_CB -->|No| STDERR
    CB --> CHECK_FATAL2{"fatal?"}
    STDERR --> CHECK_FATAL2
    CHECK_FATAL2 -->|Yes| ABORT
    CHECK_FATAL2 -->|No| DONE["Return"]

    style ABORT fill:#e74c3c,color:#fff
    style DONE fill:#50b86c,color:#fff

Buffer Size Configuration

The format buffer size can be overridden at compile time:

#define XLOG_BUF_SIZE 1024  // Must be defined before #include <xbase/log.h>
#include <xbase/log.h>

API Reference

Macros

| Macro | Default | Description |
| --- | --- | --- |
| XLOG_BUF_SIZE | 512 | Format buffer size in bytes. Override before including the header. |

Types

| Type | Description |
| --- | --- |
| xLogCallback | void (*)(const char *msg, const char *backtrace, void *userdata) — Log callback. backtrace is non-NULL only on fatal. |

Functions

| Function | Signature | Description | Thread Safety |
| --- | --- | --- | --- |
| xLogSetCallback | void xLogSetCallback(xLogCallback cb, void *userdata) | Register (or clear with NULL) the current thread's log callback. | Thread-local (each thread sets its own) |
| xLog | void xLog(bool fatal, const char *fmt, ...) | Format and dispatch a log message. If fatal, captures backtrace and calls abort(). | Thread-local (uses calling thread's callback) |

Usage Examples

Basic Logging with Custom Callback

#include <stdio.h>
#include <xbase/log.h>

static void my_log_handler(const char *msg, const char *backtrace,
                            void *userdata) {
    FILE *f = (FILE *)userdata;
    fprintf(f, "[MyApp] %s\n", msg);
    if (backtrace) {
        fprintf(f, "Stack trace:\n%s", backtrace);
    }
}

int main(void) {
    // Route this thread's logs to a file
    FILE *logfile = fopen("app.log", "w");
    if (!logfile) return 1;   // could not open log file
    xLogSetCallback(my_log_handler, logfile);

    xLog(false, "Application started, version %d.%d", 1, 0);
    xLog(false, "Processing %d items", 42);

    // Clear callback (revert to stderr)
    xLogSetCallback(NULL, NULL);
    xLog(false, "This goes to stderr");

    fclose(logfile);
    return 0;
}

Fatal Error with Backtrace

#include <xbase/log.h>

void dangerous_operation(void) {
    // This will print the message, capture a backtrace, and abort()
    xLog(true, "Unrecoverable error: corrupted state detected");
    // Never reaches here
}

Use Cases

  1. xKit Internal Error Reporting — All xKit modules use xLog() to report internal errors (e.g., allocation failures, invalid states). By registering a callback, applications can capture these messages in their logging pipeline.

  2. xlog Integration — The xlog module registers its logger as the thread's callback via xLogSetCallback(), routing all internal xKit messages through the async logging system.

  3. Test Frameworks — Test harnesses can register a callback that captures log messages for assertion, rather than letting them go to stderr.

Best Practices

  • Register callbacks early. Set up xLogSetCallback() before calling any xKit functions to ensure all messages are captured.
  • Don't block in callbacks. The callback runs synchronously on the calling thread. Blocking delays the caller. For async logging, use the xlog module.
  • Handle NULL backtrace. The backtrace parameter is NULL for non-fatal messages. Always check before using it.
  • Be aware of buffer truncation. Messages longer than XLOG_BUF_SIZE are truncated. Increase the size at compile time if needed.
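The truncation behavior described above can be demonstrated in isolation. This is a minimal sketch of fixed-buffer formatting of the kind xLog presumably performs internally, assuming vsnprintf-based formatting; DEMO_BUF_SIZE and demo_format are illustrative names, not part of xKit:

```c
#include <stdarg.h>
#include <stdio.h>

#define DEMO_BUF_SIZE 16  /* stand-in for XLOG_BUF_SIZE (illustrative) */

/* Format into a fixed stack-sized buffer. vsnprintf NUL-terminates and
 * returns the length the full message WOULD have had, so the number of
 * bytes actually kept is capped at DEMO_BUF_SIZE - 1 on truncation. */
static size_t demo_format(char *out, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out, DEMO_BUF_SIZE, fmt, ap);
    va_end(ap);
    if (n < 0) return 0;
    return (size_t)n < DEMO_BUF_SIZE ? (size_t)n : DEMO_BUF_SIZE - 1;
}
```

A message longer than the buffer is silently cut at DEMO_BUF_SIZE - 1 bytes plus the terminating NUL, which is why raising XLOG_BUF_SIZE at compile time is the only way to keep longer messages intact.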

Comparison with Other Libraries

Featurexbase log.hsyslogfprintf(stderr)GLib g_log
CallbackPer-threadGlobal handlerN/AGlobal handler
Thread SafetyThread-local (no locks)Thread-safe (kernel)Thread-safe (stdio lock)Thread-safe (global lock)
BacktraceBuilt-in on fatalNoNoOptional (G_DEBUG)
AllocationNone (stack buffer)None (kernel)None (stdio buffer)Heap (GString)
Fatal Handlingabort() with backtraceN/AN/Aabort() (G_LOG_FLAG_FATAL)
CustomizationPer-thread callbackopenlog()Redirect fdg_log_set_handler()

Key Differentiator: xbase's log is designed as a lightweight internal error channel, not a full logging framework. Its per-thread callback design avoids global locks and integrates naturally with the xlog async logger for production use.

backtrace.h — Platform-Adaptive Stack Backtrace

Introduction

backtrace.h captures the current call stack and formats it into a human-readable multi-line string. The unwinding backend is selected at build time with the following priority: libunwind > execinfo (macOS/glibc) > stub (unsupported platforms). It is used internally by xLog() to provide stack traces on fatal errors.

Design Philosophy

  1. Build-Time Backend Selection — The backend is chosen via CMake-detected macros (XK_HAS_LIBUNWIND, XK_HAS_EXECINFO). This avoids runtime overhead and ensures the best available unwinder is used on each platform.

  2. Graceful Degradation — On platforms without libunwind or execinfo, a stub backend returns a "not supported" message rather than crashing. This ensures xBacktrace() is always safe to call.

  3. Automatic Frame Skipping — Internal frames (xBacktrace / xBacktraceSkip / bt_capture) are automatically skipped so the output starts from the caller's perspective. The skip parameter allows additional frames to be skipped (useful when called through wrapper functions like xLog).

  4. Buffer-Based Output — The caller provides a buffer; no heap allocation occurs. This makes it safe to call from signal handlers, fatal error paths, and low-memory situations.

Architecture

graph TD
    API["xBacktrace() / xBacktraceSkip()"]
    SELECT{"Build-time selection"}
    LIBUNWIND["libunwind<br/>unw_step() loop"]
    EXECINFO["execinfo<br/>backtrace() + backtrace_symbols()"]
    STUB["stub<br/>'not supported' message"]
    BUF["User buffer<br/>(formatted output)"]

    API --> SELECT
    SELECT -->|XK_HAS_LIBUNWIND| LIBUNWIND
    SELECT -->|XK_HAS_EXECINFO| EXECINFO
    SELECT -->|fallback| STUB
    LIBUNWIND --> BUF
    EXECINFO --> BUF
    STUB --> BUF

    style LIBUNWIND fill:#50b86c,color:#fff
    style EXECINFO fill:#4a90d9,color:#fff
    style STUB fill:#f5a623,color:#fff

Implementation Details

Backend Selection

BackendMacroPlatformQuality
libunwindXK_HAS_LIBUNWINDLinux (with libunwind installed)Best — accurate unwinding, symbol + offset
execinfoXK_HAS_EXECINFOmacOS, Linux (glibc)Good — requires -rdynamic on Linux for symbols
stub(fallback)AnyMinimal — returns "not supported" message

Output Format

Each frame is formatted as:

#0 0x7fff8a1b2c3d symbol_name+0x1a
#1 0x7fff8a1b2c3d another_function+0x42
#2 0x7fff8a1b2c3d <unknown>
  • #N — Frame number (0 = most recent)
  • 0xADDR — Instruction pointer address
  • symbol+offset — Function name and offset (if available)
  • <unknown> — When symbol resolution fails

Frame Skipping

Call stack:
  bt_capture()         ← INTERNAL_SKIP (2 frames)
  xBacktraceSkip()     ← INTERNAL_SKIP
  xLog()               ← user skip = 2 (from xLog)
  user_function()      ← first visible frame
  main()

xBacktrace() calls xBacktraceSkip(0, ...), which adds INTERNAL_SKIP = 2 to skip its own frames. xLog() calls xBacktraceSkip(2, ...) to skip two additional frames of its own on the fatal path.

libunwind Backend

Uses an unw_getcontext() → unw_init_local() → unw_step() loop. For each frame:

  • unw_get_reg(UNW_REG_IP) — Get instruction pointer
  • unw_get_proc_name() — Get symbol name and offset

execinfo Backend

Uses backtrace() to capture frame addresses, then backtrace_symbols() to resolve names. On Linux, link with -rdynamic to export symbols for resolution.
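The execinfo path can be exercised directly against the platform API. This standalone sketch (demo_backtrace is an illustrative name, not part of xKit) captures and prints frames the same way the backend does; note that backtrace_symbols() heap-allocates the name array, which is exactly why this backend is not async-signal-safe:

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Capture up to 32 frames and print resolved names (glibc / macOS). */
static int demo_backtrace(void) {
    void *frames[32];
    int n = backtrace(frames, 32);                 /* raw instruction pointers */
    char **names = backtrace_symbols(frames, n);   /* malloc'd by libc */
    if (names) {
        for (int i = 0; i < n; i++)
            printf("#%d %s\n", i, names[i]);
        free(names);  /* one free() releases the whole array */
    }
    return n;
}
```

On Linux, build with -rdynamic or the printed entries will contain addresses but no function names.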

API Reference

Functions

FunctionSignatureDescriptionThread Safety
xBacktraceint xBacktrace(char *buf, size_t size)Capture the call stack into buf. Equivalent to xBacktraceSkip(0, buf, size).Thread-safe (uses only local/stack state)
xBacktraceSkipint xBacktraceSkip(int skip, char *buf, size_t size)Capture the call stack, skipping skip additional frames beyond internal frames.Thread-safe

Parameters

ParameterDescription
skipNumber of additional frames to skip (0 = no extra skipping)
bufDestination buffer. May be NULL (returns 0).
sizeSize of buf in bytes.

Return Value

Number of bytes written (excluding trailing \0), or 0 if buf is NULL or size is 0.

Usage Examples

Capture and Print Stack Trace

#include <stdio.h>
#include <xbase/backtrace.h>

void foo(void) {
    char buf[4096];
    int n = xBacktrace(buf, sizeof(buf));
    if (n > 0) {
        printf("Stack trace:\n%s", buf);
    }
}

void bar(void) { foo(); }

int main(void) {
    bar();
    return 0;
}

Output (with execinfo on macOS):

Stack trace:
#0 0x100003f20 foo+0x20
#1 0x100003f80 bar+0x10
#2 0x100003fa0 main+0x10

Skip Wrapper Frames

#include <stdio.h>
#include <xbase/backtrace.h>

// Custom error reporter that skips its own frame
void report_error(const char *msg) {
    char bt[2048];
    xBacktraceSkip(1, bt, sizeof(bt)); // Skip report_error itself
    fprintf(stderr, "Error: %s\nBacktrace:\n%s", msg, bt);
}

Use Cases

  1. Fatal Error DiagnosticsxLog() captures a backtrace on fatal errors, providing immediate context for debugging crashes.

  2. Debug Assertions — Custom assertion macros can include xBacktrace() to show where the assertion failed.

  3. Memory Leak Detection — Record allocation backtraces to identify where leaked objects were created.

Best Practices

  • Provide a large enough buffer. 4096 bytes is usually sufficient for 20-30 frames. The output is truncated (not corrupted) if the buffer is too small.
  • Link with -rdynamic on Linux. Without it, the execinfo backend shows only addresses, not symbol names.
  • Install libunwind for best results on Linux. It provides more accurate unwinding than execinfo, especially through optimized code and signal handlers.
  • Don't call from signal handlers with execinfo. backtrace_symbols() calls malloc(), which is not async-signal-safe. libunwind is safer in this context.

Comparison with Other Libraries

Featurexbase backtrace.hglibc backtrace()libunwindBoost.StacktraceWindows CaptureStackBackTrace
PlatformmacOS + Linux + stubLinux (glibc)Linux + macOSCross-platformWindows
AccuracyBackend-dependentGood (glibc)ExcellentBackend-dependentGood
Symbol ResolutionBuilt-inbacktrace_symbols()unw_get_proc_name()Backend-dependentSymFromAddr()
AllocationNone (user buffer)malloc() for symbolsNoneHeapNone
Signal Safetylibunwind: yes, execinfo: noNo (malloc)YesNoYes
Frame SkippingBuilt-in (skip param)ManualManualManualFramesToSkip param

Key Differentiator: xbase's backtrace provides a simple, buffer-based API with automatic frame skipping and graceful degradation across platforms. It's designed for integration into error reporting paths where heap allocation is undesirable.

socket.h — Async Socket

Introduction

socket.h provides an async socket abstraction built on top of xEventLoop. It wraps the POSIX socket API with automatic non-blocking setup, event loop registration, and idle-timeout support. When a socket becomes readable, writable, or times out, a single unified callback is invoked with the appropriate event mask.

Design Philosophy

  1. Thin Wrapper, Not a FrameworkxSocket adds just enough abstraction to eliminate boilerplate (non-blocking setup, FD_CLOEXEC, event registration) without hiding the underlying fd. You can always retrieve the raw fd via xSocketFd() for direct system calls.

  2. Idle-Timeout Semantics — Read and write timeouts are reset on every corresponding I/O event, implementing idle-timeout behavior. This is ideal for detecting dead connections: if no data arrives within the timeout period, the callback fires with xEvent_Timeout.

  3. Unified Callback — A single xSocketFunc callback handles all events (read, write, timeout). The mask parameter tells you what happened, and the xEvent_Timeout flag is OR'd with xEvent_Read or xEvent_Write to indicate which direction timed out.

  4. Lifecycle Tied to Event Loop — A socket is created and destroyed in the context of an event loop. xSocketDestroy() cancels timers, removes the event source, closes the fd, and frees the handle in one call.

Architecture

graph TD
    APP["Application"] -->|"xSocketCreate()"| SOCKET["xSocket"]
    SOCKET -->|"xEventAdd()"| LOOP["xEventLoop"]
    LOOP -->|"I/O ready"| TRAMP["trampoline()"]
    TRAMP -->|"reset timers"| TIMER["Timer Heap"]
    TRAMP -->|"forward"| CB["callback(sock, mask, userp)"]
    TIMER -->|"timeout"| TIMEOUT_CB["timeout_cb()"]
    TIMEOUT_CB -->|"xEvent_Timeout"| CB

    style SOCKET fill:#4a90d9,color:#fff
    style LOOP fill:#f5a623,color:#fff
    style CB fill:#50b86c,color:#fff

Implementation Details

Internal Structure

struct xSocket_ {
    int              fd;               // Underlying file descriptor
    xEventLoop       loop;             // Bound event loop
    xEventSource     source;           // Registered event source
    xEventMask       mask;             // Current event mask
    xSocketFunc      callback;         // User callback
    void            *userp;            // User data
    xEventTimer      read_timer;       // Read idle timeout timer
    xEventTimer      write_timer;      // Write idle timeout timer
    int              read_timeout_ms;  // Read timeout setting (0 = disabled)
    int              write_timeout_ms; // Write timeout setting (0 = disabled)
};

Trampoline Pattern

The socket registers an internal trampoline() function as the event callback with the event loop. This trampoline:

  1. Resets idle timers — On xEvent_Read, cancels and re-arms the read timer. On xEvent_Write, cancels and re-arms the write timer.
  2. Forwards to user callback — Calls callback(sock, mask, userp) with the original event mask.

This ensures idle timers are always reset transparently, without requiring the user to manage them manually.
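The forwarding shape of the trampoline can be sketched without the event loop. This is a simplified stand-in, assuming only the re-arm-then-forward behavior described above; the demo_* names and boolean "armed" flags are illustrative, not the real timer plumbing:

```c
#include <stdbool.h>

typedef unsigned demo_mask;
enum { DEMO_EVENT_READ = 1, DEMO_EVENT_WRITE = 2 };

typedef void (*demo_callback)(demo_mask mask, void *userp);

typedef struct {
    demo_callback callback;
    void         *userp;
    bool          read_timer_armed;   /* stand-in for the read idle timer */
    bool          write_timer_armed;  /* stand-in for the write idle timer */
} demo_socket;

/* Re-arm the timer matching the event direction, then forward the
 * original mask to the user callback unchanged. */
static void demo_trampoline(demo_socket *s, demo_mask mask) {
    if (mask & DEMO_EVENT_READ)  s->read_timer_armed = true;  /* cancel + re-arm */
    if (mask & DEMO_EVENT_WRITE) s->write_timer_armed = true;
    s->callback(mask, s->userp);
}

/* Capture what the user callback observed, for inspection. */
static demo_mask demo_seen_mask;
static void demo_user_cb(demo_mask mask, void *userp) {
    (void)userp;
    demo_seen_mask = mask;
}
```

The user callback never sees the timer bookkeeping; it only receives the same mask the event loop delivered.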

Socket Creation

xSocketCreate() performs these steps in a single call:

  1. socket(family, type, protocol) — On Linux/BSD with SOCK_CLOEXEC | SOCK_NONBLOCK, both flags are set in one syscall. On other platforms, fcntl() is used as a fallback.
  2. xEventAdd(loop, fd, mask, trampoline, socket) — Registers with the event loop.
  3. Returns the opaque xSocket handle.
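The fcntl() fallback in step 1 looks roughly like the following. This is a sketch of the fallback path only, assuming the flag-setting order described above; demo_make_async_socket is an illustrative name, not an xKit function:

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallback used when SOCK_NONBLOCK / SOCK_CLOEXEC cannot be passed to
 * socket() directly: set O_NONBLOCK and FD_CLOEXEC with fcntl() after. */
static int demo_make_async_socket(int family, int type, int protocol) {
    int fd = socket(family, type, protocol);
    if (fd < 0) return -1;
    int fl = fcntl(fd, F_GETFL, 0);
    if (fl < 0 ||
        fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0 ||
        fcntl(fd, F_SETFD, FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note the small race this fallback has that the single-syscall path avoids: between socket() and fcntl(), a fork() in another thread could inherit the fd without FD_CLOEXEC.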

Timeout Mechanism

sequenceDiagram
    participant App
    participant Socket as xSocket
    participant L as xEventLoop
    participant Timer as Timer Heap

    App->>Socket: xSocketSetTimeout(sock, 5000, 3000)
    Socket->>Timer: arm read timer (5s)
    Socket->>Timer: arm write timer (3s)

    Note over L: Data arrives on fd
    L->>Socket: trampoline(fd, xEvent_Read)
    Socket->>Timer: cancel + re-arm read timer (5s)
    Socket->>App: callback(sock, xEvent_Read)

    Note over Timer: 5 seconds of silence...
    Timer->>Socket: read_timeout_cb()
    Socket->>App: callback(sock, xEvent_Timeout | xEvent_Read)

API Reference

Types

TypeDescription
xSocketOpaque handle to an async socket
xSocketFuncvoid (*)(xSocket sock, xEventMask mask, void *arg) — Socket event callback

Functions

FunctionSignatureDescriptionThread Safety
xSocketCreatexSocket xSocketCreate(xEventLoop loop, int family, int type, int protocol, xEventMask mask, xSocketFunc callback, void *userp)Create a non-blocking socket and register with the event loop.Not thread-safe
xSocketDestroyvoid xSocketDestroy(xEventLoop loop, xSocket sock)Cancel timers, remove from event loop, close fd, free handle. Safe with NULL.Not thread-safe
xSocketSetMaskxErrno xSocketSetMask(xEventLoop loop, xSocket sock, xEventMask mask)Change the watched event mask.Not thread-safe
xSocketSetTimeoutxErrno xSocketSetTimeout(xSocket sock, int read_timeout_ms, int write_timeout_ms)Set idle timeouts. Pass 0 to cancel. Replaces previous settings.Not thread-safe
xSocketFdint xSocketFd(xSocket sock)Return the underlying fd, or -1 if NULL.Thread-safe (read-only)
xSocketMaskxEventMask xSocketMask(xSocket sock)Return the current event mask, or 0 if NULL.Thread-safe (read-only)

Callback Mask Values

MaskMeaning
xEvent_ReadSocket is readable
xEvent_WriteSocket is writable
xEvent_Timeout | xEvent_ReadRead idle timeout fired
xEvent_Timeout | xEvent_WriteWrite idle timeout fired

Usage Examples

TCP Echo Client with Timeout

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <xbase/socket.h>

static xEventLoop g_loop;

static void on_socket(xSocket sock, xEventMask mask, void *arg) {
    (void)arg;

    if (mask & xEvent_Timeout) {
        printf("Timeout on %s\n",
               (mask & xEvent_Read) ? "read" : "write");
        xSocketDestroy(g_loop, sock);
        xEventLoopStop(g_loop);
        return;
    }

    if (mask & xEvent_Read) {
        char buf[1024];
        ssize_t n;
        while ((n = read(xSocketFd(sock), buf, sizeof(buf))) > 0) {
            printf("Received: %.*s\n", (int)n, buf);
        }
    }

    if (mask & xEvent_Write) {
        const char *msg = "Hello, server!";
        write(xSocketFd(sock), msg, strlen(msg));
        // Switch to read-only after sending
        xSocketSetMask(g_loop, sock, xEvent_Read);
    }
}

int main(void) {
    g_loop = xEventLoopCreate();

    xSocket sock = xSocketCreate(g_loop, AF_INET, SOCK_STREAM, 0,
                                  xEvent_Write, on_socket, NULL);
    if (!sock) return 1;

    // Set 5-second read idle timeout
    xSocketSetTimeout(sock, 5000, 0);

    // Connect (non-blocking)
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port   = htons(8080),
    };
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
    connect(xSocketFd(sock), (struct sockaddr *)&addr, sizeof(addr));

    xEventLoopRun(g_loop);
    xEventLoopDestroy(g_loop);
    return 0;
}

UDP Receiver with Idle Timeout

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <xbase/socket.h>

static void on_udp(xSocket sock, xEventMask mask, void *arg) {
    xEventLoop loop = (xEventLoop)arg;

    if (mask & xEvent_Timeout) {
        printf("No data for 10 seconds, shutting down.\n");
        xSocketDestroy(loop, sock);
        xEventLoopStop(loop);
        return;
    }

    if (mask & xEvent_Read) {
        char buf[65536];
        ssize_t n;
        while ((n = read(xSocketFd(sock), buf, sizeof(buf))) > 0) {
            printf("UDP: %.*s\n", (int)n, buf);
        }
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xSocket sock = xSocketCreate(loop, AF_INET, SOCK_DGRAM, 0,
                                  xEvent_Read, on_udp, loop);

    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port   = htons(9999),
        .sin_addr.s_addr = INADDR_ANY,
    };
    bind(xSocketFd(sock), (struct sockaddr *)&addr, sizeof(addr));

    // 10-second read idle timeout
    xSocketSetTimeout(sock, 10000, 0);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

Use Cases

  1. Network Servers — Create listening sockets, accept connections, and manage each client with its own xSocket + idle timeout. Dead connections are automatically detected.

  2. Protocol Clients — Build async clients (HTTP, Redis, etc.) that connect, send requests, and wait for responses with timeout protection.

  3. Real-Time Data Feeds — Monitor UDP multicast sockets with idle timeouts to detect feed outages.

Best Practices

  • Always drain in edge-triggered mode. Since the underlying event loop is edge-triggered, read/write until EAGAIN in every callback.
  • Use idle timeouts for connection health. Set read_timeout_ms to detect dead peers. The timeout resets automatically on each read event.
  • Destroy sockets before the event loop. xSocketDestroy() calls xEventDel() and xEventLoopTimerCancel(), which require a valid event loop.
  • Check the timeout direction. When xEvent_Timeout fires, check mask & xEvent_Read vs. mask & xEvent_Write to know which direction timed out.
  • Don't close the fd manually. xSocketDestroy() closes it for you. Closing it separately leads to double-close bugs.

Comparison with Other Libraries

Featurexbase socket.hPOSIX socket APIlibuv uv_tcp_tBoost.Asio
Non-blocking SetupAutomatic (SOCK_NONBLOCK + FD_CLOEXEC)Manual (fcntl)AutomaticAutomatic
Event RegistrationAutomatic (via xEventLoop)Manual (epoll_ctl / kevent)AutomaticAutomatic
Idle TimeoutBuilt-in (xSocketSetTimeout)Manual (timer + bookkeeping)Manual (uv_timer)Manual (deadline_timer)
Callback StyleSingle unified callback with maskN/A (blocking or manual poll)Separate read/write callbacksSeparate handlers
Raw fd AccessxSocketFd()Directuv_fileno()native_handle()
Buffered I/ONo (raw fd)NoYes (uv_read_start)Yes (async_read)
PlatformmacOS + LinuxPOSIXCross-platformCross-platform

Key Differentiator: xbase's socket abstraction is intentionally thin — it handles the boilerplate (non-blocking, event registration, idle timeout) but leaves data reading/writing to the caller via the raw fd. This gives maximum flexibility without imposing a buffering strategy.

io.h — Abstract I/O Interfaces

Introduction

io.h defines four lightweight I/O interfaces — xReader, xWriter, xSeeker, xCloser — inspired by Go's io.Reader / io.Writer / io.Seeker / io.Closer. Each interface is a small struct containing a function pointer and an opaque void *ctx, making it trivial to adapt any object that provides the matching function signature.

On top of these interfaces, io.h provides a set of convenience functions (xRead, xReadFull, xReadAll, xWrite, xWritev, xSeek, xClose) that operate generically on any implementation, enabling code reuse across TCP connections, TLS streams, file descriptors, in-memory buffers, and more.

Design Philosophy

  1. Value-Type Interfaces — Each interface is a plain struct (function pointer + context), not a heap-allocated object. They are cheap to copy, pass by value, and require no memory management.

  2. POSIX Semantics — Function signatures mirror their POSIX counterparts: read(2), writev(2), lseek(2), close(2). This makes the learning curve near-zero for C developers.

  3. Composable Helpers — Higher-level functions like xReadFull and xReadAll are built on top of xReader, so any object that provides a reader automatically gains these capabilities.

  4. Zero-Initialized = Invalid — A zero-initialized struct (all NULL) is treated as "not set". Convenience functions can detect this and return an error instead of crashing.
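The value-type interface pattern is easy to show end to end. This self-contained sketch mirrors the xReader shape (function pointer + void *ctx) with illustrative demo_* names, adapting an in-memory byte range; the real xReader lives in <xbase/io.h>:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* A value-type reader in the style of xReader: cheap to copy, no heap. */
typedef struct {
    ssize_t (*read)(void *ctx, void *buf, size_t len);
    void    *ctx;
} demo_reader;

/* Context adapting an in-memory byte range to the reader interface. */
typedef struct { const char *data; size_t len, pos; } demo_mem;

static ssize_t demo_mem_read(void *ctx, void *buf, size_t len) {
    demo_mem *m = ctx;
    size_t avail = m->len - m->pos;
    size_t n = len < avail ? len : avail;   /* returns 0 at EOF */
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return (ssize_t)n;
}

static demo_reader demo_mem_reader(demo_mem *m) {
    demo_reader r = { demo_mem_read, m };
    return r;  /* returned by value: no allocation, no cleanup needed */
}
```

Any helper written against the reader struct now works on memory, files, or sockets alike; only the adapter changes.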

Architecture

graph TD
    subgraph "Interfaces"
        R["xReader<br/>ssize_t read(ctx, buf, len)"]
        W["xWriter<br/>ssize_t writev(ctx, iov, iovcnt)"]
        S["xSeeker<br/>off_t seek(ctx, offset, whence)"]
        C["xCloser<br/>int close(ctx)"]
    end

    subgraph "Convenience Functions"
        XR["xRead"]
        XRF["xReadFull"]
        XRA["xReadAll"]
        XW["xWrite"]
        XWV["xWritev"]
        XS["xSeek"]
        XC["xClose"]
    end

    subgraph "Implementations"
        TCP["xTcpConn<br/>xTcpConnReader / xTcpConnWriter"]
        IOB["xIOBuffer<br/>(read/writev funcs)"]
        FD["File Descriptor<br/>(custom wrapper)"]
    end

    XR --> R
    XRF --> R
    XRA --> R
    XW --> W
    XWV --> W
    XS --> S
    XC --> C

    TCP -.->|"adapts to"| R
    TCP -.->|"adapts to"| W
    IOB -.->|"adapts to"| R
    IOB -.->|"adapts to"| W
    FD -.->|"adapts to"| R
    FD -.->|"adapts to"| W

    style R fill:#4a90d9,color:#fff
    style W fill:#4a90d9,color:#fff
    style S fill:#4a90d9,color:#fff
    style C fill:#4a90d9,color:#fff
    style XRF fill:#50b86c,color:#fff
    style XRA fill:#50b86c,color:#fff

Implementation Details

Interface Structs

Each interface is a two-field struct:

InterfaceFunction PointerSemantics
xReaderssize_t (*read)(void *ctx, void *buf, size_t len)Returns bytes read, 0 on EOF, -1 on error
xWriterssize_t (*writev)(void *ctx, const struct iovec *iov, int iovcnt)Returns bytes written, -1 on error
xSeekeroff_t (*seek)(void *ctx, off_t offset, int whence)Returns resulting offset, -1 on error
xCloserint (*close)(void *ctx)Returns 0 on success, -1 on failure

xReadFull — Retry Logic

xReadFull loops calling r.read until exactly len bytes are read or EOF is reached. It automatically retries on EAGAIN and EINTR, making it suitable for both blocking and non-blocking file descriptors:

while (total < len):
    n = r.read(ctx, buf + total, len - total)
    if n > 0:  total += n
    if n == 0: break          // EOF
    if n == -1:
        if EAGAIN or EINTR: continue
        else: return -1       // real error
return total
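The pseudocode above translates to C as follows. This is a sketch under the stated semantics (retry on EAGAIN/EINTR, stop at EOF), not xKit's actual implementation; the rf_* names are illustrative:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct {
    ssize_t (*read)(void *ctx, void *buf, size_t len);
    void    *ctx;
} rf_reader;

/* Loop until exactly `len` bytes are read or EOF; retry on EAGAIN/EINTR. */
static ssize_t rf_read_full(rf_reader r, void *buf, size_t len) {
    size_t total = 0;
    while (total < len) {
        ssize_t n = r.read(r.ctx, (char *)buf + total, len - total);
        if (n > 0)  { total += (size_t)n; continue; }
        if (n == 0) break;                            /* EOF */
        if (errno == EAGAIN || errno == EINTR) continue;
        return -1;                                    /* real error */
    }
    return (ssize_t)total;
}

/* Test source that hands out at most 2 bytes per call, to force the
 * partial-read path through the loop. */
typedef struct { const char *data; size_t len, pos; } rf_src;
static ssize_t rf_chunked_read(void *ctx, void *buf, size_t len) {
    rf_src *s = ctx;
    size_t avail = s->len - s->pos;
    size_t n = avail < 2 ? avail : 2;
    if (n > len) n = len;
    for (size_t i = 0; i < n; i++) ((char *)buf)[i] = s->data[s->pos + i];
    s->pos += n;
    return (ssize_t)n;
}
```

The caller distinguishes "short read at EOF" from "got everything" by comparing the return value against the requested length, as the read_header example later in this section does.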

xReadAll — Dynamic Buffer Growth

xReadAll reads until EOF into a dynamically allocated buffer. It starts with a 4096-byte allocation and doubles the capacity each time the buffer fills up:

cap = 4096, buf = malloc(cap)
loop:
    if total == cap: realloc(buf, cap * 2)
    n = r.read(ctx, buf + total, cap - total)
    if n > 0:  total += n
    if n == 0: *out = buf, *out_len = total, return 0
    if n == -1:
        if EAGAIN or EINTR: continue
        else: free(buf), return -1

The caller is responsible for freeing the returned buffer with free().
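The doubling-growth loop above can likewise be sketched in C. The initial capacity here is deliberately tiny (8 bytes rather than the real 4096) so the realloc path is actually exercised; the ra_* names are illustrative, not xKit's:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

typedef struct {
    ssize_t (*read)(void *ctx, void *buf, size_t len);
    void    *ctx;
} ra_reader;

/* Read until EOF into a malloc'd buffer, doubling capacity when full.
 * On success the buffer is handed to the caller; on error it is freed. */
static int ra_read_all(ra_reader r, void **out, size_t *out_len) {
    size_t cap = 8, total = 0;        /* real code starts at 4096 */
    char *buf = malloc(cap);
    if (!buf) return -1;
    for (;;) {
        if (total == cap) {
            char *nb = realloc(buf, cap * 2);
            if (!nb) { free(buf); return -1; }
            buf = nb; cap *= 2;
        }
        ssize_t n = r.read(r.ctx, buf + total, cap - total);
        if (n > 0)  { total += (size_t)n; continue; }
        if (n == 0) { *out = buf; *out_len = total; return 0; }  /* EOF */
        if (errno == EAGAIN || errno == EINTR) continue;
        free(buf);
        return -1;
    }
}

/* Memory-backed reader for exercising the loop. */
typedef struct { const char *data; size_t len, pos; } ra_mem;
static ssize_t ra_mem_read(void *ctx, void *buf, size_t len) {
    ra_mem *m = ctx;
    size_t avail = m->len - m->pos;
    size_t n = len < avail ? len : avail;
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return (ssize_t)n;
}
```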

xWrite — Single Buffer Convenience

xWrite wraps a contiguous buffer into a single struct iovec and delegates to w.writev, avoiding the need for callers to construct iovec arrays for simple writes:

ssize_t xWrite(xWriter w, const void *buf, size_t len) {
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    return w.writev(w.ctx, &iov, 1);
}

API Reference

Types

TypeDescription
xReaderAbstract reader — { ssize_t (*read)(void*, void*, size_t), void *ctx }
xWriterAbstract writer — { ssize_t (*writev)(void*, const struct iovec*, int), void *ctx }
xSeekerAbstract seeker — { off_t (*seek)(void*, off_t, int), void *ctx }
xCloserAbstract closer — { int (*close)(void*), void *ctx }

Functions

FunctionSignatureDescription
xReadssize_t xRead(xReader r, void *buf, size_t len)Single read; returns bytes read, 0 on EOF, -1 on error
xWritessize_t xWrite(xWriter w, const void *buf, size_t len)Write a contiguous buffer (wraps into single iovec)
xWritevssize_t xWritev(xWriter w, const struct iovec *iov, int iovcnt)Scatter-gather write
xSeekoff_t xSeek(xSeeker s, off_t offset, int whence)Reposition offset (SEEK_SET / SEEK_CUR / SEEK_END)
xCloseint xClose(xCloser c)Close the underlying resource
xReadFullssize_t xReadFull(xReader r, void *buf, size_t len)Read exactly len bytes, retrying on partial reads and EAGAIN/EINTR
xReadAllint xReadAll(xReader r, void **out, size_t *out_len)Read until EOF into a malloc'd buffer; caller must free(*out)

Usage Examples

Creating a Custom Reader

#include <xbase/io.h>
#include <unistd.h>

// Adapt a file descriptor into an xReader
static ssize_t fd_read(void *ctx, void *buf, size_t len) {
    int fd = (int)(intptr_t)ctx;
    return read(fd, buf, len);
}

xReader make_fd_reader(int fd) {
    xReader r;
    r.read = fd_read;
    r.ctx  = (void *)(intptr_t)fd;
    return r;
}

Reading Exactly N Bytes

#include <xbase/io.h>

void read_header(xReader r) {
    char header[64];
    ssize_t n = xReadFull(r, header, sizeof(header));
    if (n < 0) {
        // error
    } else if ((size_t)n < sizeof(header)) {
        // EOF before full header
    } else {
        // got all 64 bytes
    }
}

Reading All Data Until EOF

#include <xbase/io.h>
#include <stdlib.h>

void read_body(xReader r) {
    void  *data;
    size_t data_len;

    if (xReadAll(r, &data, &data_len) == 0) {
        // process data (data_len bytes at data)
        free(data);
    } else {
        // error
    }
}

Using with xTcpConn

xTcpConn (from <xnet/tcp.h>) provides adapter functions that return xReader and xWriter bound to the connection's transport layer. This allows TCP connections to be used with all generic I/O helpers:

#include <xbase/io.h>
#include <xnet/tcp.h>

void handle_connection(xTcpConn conn) {
    // Get I/O adapters from the TCP connection
    xReader r = xTcpConnReader(conn);
    xWriter w = xTcpConnWriter(conn);

    // Read a fixed-size header
    char header[16];
    ssize_t n = xReadFull(r, header, sizeof(header));
    if (n < (ssize_t)sizeof(header)) return;

    // Read the entire body until the peer closes
    void  *body;
    size_t body_len;
    if (xReadAll(r, &body, &body_len) != 0) return;

    // Echo back through the generic writer
    xWrite(w, body, body_len);
    free(body);
}

Scatter-Gather Write

#include <string.h>
#include <xbase/io.h>

void send_http_response(xWriter w) {
    const char *header = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n";
    const char *body   = "Hello";

    struct iovec iov[2] = {
        { .iov_base = (void *)header, .iov_len = strlen(header) },
        { .iov_base = (void *)body,   .iov_len = 5 },
    };

    xWritev(w, iov, 2);
}

Integration with xTcpConn

xTcpConn provides two adapter functions that bridge the TCP connection to the generic I/O interfaces:

FunctionReturnsDescription
xTcpConnReader(conn)xReaderReader bound to transport.read — equivalent to xTcpConnRecv
xTcpConnWriter(conn)xWriterWriter bound to transport.writev — equivalent to xTcpConnSendIov

These adapters are zero-allocation: they copy the function pointer and context from the connection's internal xTransport into a stack-allocated struct. The returned interfaces are valid as long as the connection (and its transport) remains alive.

Why no xCloser adapter? xTcpConnClose() requires an xEventLoop parameter to properly unregister the socket from the event loop, which does not fit the int (*close)(void *ctx) signature.

Best Practices

  • Prefer xReadFull over manual loops when you need an exact number of bytes. It handles EAGAIN, EINTR, and partial reads correctly.
  • Always free() the buffer from xReadAll on success. On error, the function cleans up internally.
  • Use xWrite for simple writes, xWritev for multi-buffer writes. xWrite is a thin wrapper that constructs a single iovec — no performance penalty.
  • Check for zero-initialized interfaces before passing them to helpers. If xTcpConnReader(NULL) returns a zero struct, calling xRead on it will dereference a NULL function pointer.
  • Obtain adapters once, use many times. Since xTcpConnReader / xTcpConnWriter are value types, you can call them once at the start of a handler and reuse the result throughout.

Comparison with Other Libraries

Featurexbase io.hGo io.Reader/WriterPOSIX read/writeC++ std::iostream
AbstractionStruct (fn ptr + ctx)Interface (vtable)Raw syscallClass hierarchy
AllocationZero (stack value)Heap (interface value)N/AHeap (stream object)
ComposabilityVia helper functionsVia io.Copy, io.ReadAll, etc.Manual loopsVia stream operators
Scatter-GatherBuilt-in (xWritev)No (use io.MultiWriter)writev(2)No
Read-Until-EOFxReadAll (malloc'd buffer)io.ReadAll ([]byte)Manual loopstd::istreambuf_iterator
Error ModelReturn value (-1 + errno)(n, error) tupleReturn value (-1 + errno)Stream state flags

xbuf — Buffer Toolkit

Introduction

xbuf is xKit's buffer module, providing three distinct buffer types optimized for different use cases: a linear auto-growing buffer, a fixed-size ring buffer, and a reference-counted block-chain I/O buffer. Together they cover the full spectrum of buffering needs — from simple byte accumulation to zero-copy network I/O.

Design Philosophy

  1. One Buffer Does Not Fit All — Rather than a single "universal" buffer, xbuf offers three specialized types. Each makes different trade-offs between simplicity, performance, and memory efficiency.

  2. Flexible Array Member Layout — Both xBuffer and xRingBuffer allocate header + data in a single malloc() call using C99 flexible array members. This eliminates pointer indirection and improves cache locality.

  3. Reference-Counted Block SharingxIOBuffer uses reference-counted blocks that can be shared across multiple buffers. This enables zero-copy split and append operations critical for high-performance network protocols.

  4. I/O Integration — All three types provide ReadFd/WriteFd helpers that handle EINTR retries and scatter-gather I/O (readv/writev), making them ready for event-driven network programming.
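The EINTR-retry behavior mentioned in point 4 is a small, self-contained pattern. This sketch shows the shape such a WriteFd helper plausibly takes (demo_writev_retry is an illustrative name, not an xbuf function):

```c
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

/* EINTR-safe scatter-gather write: retry only when the syscall was
 * interrupted by a signal, and surface every other error to the caller. */
static ssize_t demo_writev_retry(int fd, const struct iovec *iov, int iovcnt) {
    ssize_t n;
    do {
        n = writev(fd, iov, iovcnt);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

EAGAIN is deliberately not retried here: on a non-blocking fd it means "wait for the event loop to report writability", not "try again immediately".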

Architecture

graph TD
    subgraph "xbuf Module"
        BUF["xBuffer<br/>Linear auto-growing<br/>Single contiguous allocation"]
        RING["xRingBuffer<br/>Fixed-size circular<br/>Power-of-2 masking"]
        IO["xIOBuffer<br/>Block-chain<br/>Reference-counted"]
    end

    subgraph "Shared Infrastructure"
        POOL["Block Pool<br/>Treiber stack freelist"]
        ATOMIC["xbase/atomic.h<br/>Lock-free operations"]
    end

    IO --> POOL
    POOL --> ATOMIC

    subgraph "I/O Layer"
        READ["read() / readv()"]
        WRITE["write() / writev()"]
    end

    BUF --> READ
    BUF --> WRITE
    RING --> READ
    RING --> WRITE
    IO --> READ
    IO --> WRITE

    style BUF fill:#4a90d9,color:#fff
    style RING fill:#f5a623,color:#fff
    style IO fill:#50b86c,color:#fff

Sub-Module Overview

HeaderTypeDescriptionDoc
buf.hxBufferLinear auto-growing byte buffer with flexible array member layoutbuf.md
ring.hxRingBufferFixed-size circular buffer with power-of-2 bitmask indexingring.md
io.hxIOBufferReference-counted block-chain I/O buffer with zero-copy operationsio.md

How to Choose

CriterionxBufferxRingBufferxIOBuffer
Memory layoutContiguousContiguous (circular)Non-contiguous (block chain)
GrowthAuto-growing (2x realloc)Fixed size (never grows)Auto-growing (new blocks)
Best forAccumulating variable-length dataFixed-capacity producer-consumerHigh-throughput network I/O
Zero-copy splitNoNoYes
Zero-copy appendNoNoYes (between xIOBuffers)
Scatter-gather I/ONo (single buffer)Yes (up to 2 iovecs)Yes (N iovecs)
Memory overheadMinimal (1 allocation)Minimal (1 allocation)Per-block overhead + ref array
Thread safetyNot thread-safeNot thread-safeBlock pool is thread-safe

Decision Guide

Need to accumulate data of unknown size?
  → xBuffer (simple, auto-growing)

Need a fixed-capacity FIFO between producer and consumer?
  → xRingBuffer (no allocation after creation)

Need zero-copy operations or scatter-gather I/O for networking?
  → xIOBuffer (block-chain with reference counting)

Quick Start

#include <stdio.h>
#include <xbuf/buf.h>
#include <xbuf/ring.h>
#include <xbuf/io.h>

int main(void) {
    // 1. Linear buffer: accumulate data
    xBuffer buf = xBufferCreate(256);
    xBufferAppend(&buf, "Hello, ", 7);
    xBufferAppend(&buf, "xbuf!", 5);
    printf("buf: %.*s\n", (int)xBufferLen(buf), (const char *)xBufferData(buf));
    xBufferDestroy(buf);

    // 2. Ring buffer: fixed-capacity FIFO
    xRingBuffer ring = xRingBufferCreate(1024);
    xRingBufferWrite(ring, "circular", 8);
    char out[16];
    size_t n = xRingBufferRead(ring, out, sizeof(out));
    printf("ring: %.*s\n", (int)n, out);
    xRingBufferDestroy(ring);

    // 3. IO buffer: block-chain with zero-copy
    xIOBuffer io;
    xIOBufferInit(&io);
    xIOBufferAppend(&io, "block-chain I/O", 15);
    char linear[64];
    xIOBufferCopyTo(&io, linear);
    printf("io: %.*s\n", (int)xIOBufferLen(&io), linear);
    xIOBufferDeinit(&io);

    return 0;
}

Relationship with Other Modules

  • xbase — xIOBuffer uses atomic.h for lock-free block pool management and reference counting.
  • xhttp — The HTTP client (client.h) uses xIOBuffer for response body accumulation and SSE stream parsing.
  • xlog — The async logger (logger.h) may use xBuffer for log message formatting.

buf.h — Linear Auto-Growing Buffer

Introduction

buf.h provides xBuffer, a simple contiguous byte buffer that automatically grows when more space is needed. It maintains separate read and write positions, supporting efficient append-and-consume patterns. The buffer header and data area are allocated in a single malloc() call using a C99 flexible array member, avoiding an extra pointer indirection.

Design Philosophy

  1. Single Allocation — Header and data live in one contiguous block (struct + flexible array member). This means one malloc(), one free(), and excellent cache locality.

  2. Handle Indirection — Because realloc() may relocate the entire object, write APIs take xBuffer *bufp (pointer to handle) so the caller's handle stays valid after growth.

  3. Compact Before Grow — When the buffer needs more space, it first tries to compact (slide unread data to the front) before resorting to realloc(). This reclaims consumed space without allocation.

  4. 2x Growth — When reallocation is necessary, capacity doubles each time, providing amortized O(1) append.
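
The compact-before-grow append described above can be sketched in a few lines of plain C. `MiniBuf` and its functions are hypothetical stand-ins for illustration, not the actual xBuffer implementation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the xBuffer scheme: header + flexible
 * array member in one allocation, compact-before-grow on append. */
typedef struct {
    size_t rpos, wpos, cap;
    char   data[];           /* C99 flexible array member */
} MiniBuf;

static MiniBuf *minibuf_create(size_t cap) {
    MiniBuf *b = malloc(sizeof *b + cap);   /* single allocation */
    if (b) { b->rpos = b->wpos = 0; b->cap = cap; }
    return b;
}

/* Takes MiniBuf ** because realloc() may relocate the whole object. */
static int minibuf_append(MiniBuf **bp, const void *src, size_t len) {
    MiniBuf *b = *bp;
    if (b->cap - b->wpos < len) {
        size_t unread = b->wpos - b->rpos;
        if (b->rpos > 0 && b->cap - unread >= len) {
            /* Compact: slide unread bytes to the front, no allocation. */
            memmove(b->data, b->data + b->rpos, unread);
            b->rpos = 0; b->wpos = unread;
        } else {
            /* Grow: double capacity until the append fits. */
            size_t ncap = b->cap ? b->cap : 64;
            while (ncap - b->wpos < len) ncap *= 2;
            MiniBuf *nb = realloc(b, sizeof *nb + ncap);
            if (!nb) return -1;
            nb->cap = ncap;
            *bp = b = nb;    /* caller's handle stays valid */
        }
    }
    memcpy(b->data + b->wpos, src, len);
    b->wpos += len;
    return 0;
}
```

Note how the handle-indirection rule falls out naturally: because the grow path may replace the object, the append function must receive `MiniBuf **` so the caller's handle is updated in place.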

Architecture

graph LR
    subgraph "xBuffer Lifecycle"
        CREATE["xBufferCreate(cap)"] --> USE["Append / Read / Consume"]
        USE --> GROW{"Need more space?"}
        GROW -->|Compact| USE
        GROW -->|Realloc 2x| USE
        USE --> DESTROY["xBufferDestroy()"]
    end

    style CREATE fill:#4a90d9,color:#fff
    style DESTROY fill:#e74c3c,color:#fff

Implementation Details

Memory Layout

Single malloc() allocation:
┌──────────────────┬──────────────────────────────────────────┐
│  xBuffer_ header │  data[cap]  (flexible array member)      │
│  rpos, wpos, cap │                                          │
└──────────────────┴──────────────────────────────────────────┘
                    ↑          ↑                    ↑
                    data+rpos  data+wpos            data+cap
                    │←readable→│←────writable──────→│

Internal Structure

XDEF_STRUCT(xBuffer_) {
    size_t rpos;   // Read position (start of unread data)
    size_t wpos;   // Write position (end of unread data)
    size_t cap;    // Total data capacity
    char   data[]; // Flexible array member
};

Growth Strategy

flowchart TD
    APPEND["xBufferAppend(bufp, data, len)"]
    CHECK{"wpos + len <= cap?"}
    WRITE["memcpy at wpos, advance wpos"]
    COMPACT{"rpos > 0 AND<br/>unread + len <= cap?"}
    MEMMOVE["memmove data to front<br/>rpos=0, wpos=unread"]
    REALLOC["realloc(cap * 2)"]
    UPDATE["Update *bufp"]

    APPEND --> CHECK
    CHECK -->|Yes| WRITE
    CHECK -->|No| COMPACT
    COMPACT -->|Yes| MEMMOVE --> WRITE
    COMPACT -->|No| REALLOC --> UPDATE --> WRITE

    style WRITE fill:#50b86c,color:#fff
    style REALLOC fill:#f5a623,color:#fff

Operations and Complexity

Operation | Time Complexity | Notes
xBufferAppend | Amortized O(1) per byte | May trigger compact or realloc
xBufferConsume | O(1) | Advances read position
xBufferCompact | O(n) | memmove of unread data
xBufferData | O(1) | Returns data + rpos
xBufferLen | O(1) | Returns wpos - rpos
xBufferReadFd | O(1) | Single read() syscall
xBufferWriteFd | O(1) | Single write() syscall

API Reference

Lifecycle

FunctionSignatureDescriptionThread Safety
xBufferCreatexBuffer xBufferCreate(size_t initial_cap)Create a buffer. Min capacity is 64.Not thread-safe
xBufferDestroyvoid xBufferDestroy(xBuffer buf)Free the buffer. NULL is a no-op.Not thread-safe
xBufferResetvoid xBufferReset(xBuffer buf)Discard all data, keep memory.Not thread-safe

Write

FunctionSignatureDescriptionThread Safety
xBufferAppendxErrno xBufferAppend(xBuffer *bufp, const void *data, size_t len)Append bytes, growing if needed.Not thread-safe
xBufferAppendStrxErrno xBufferAppendStr(xBuffer *bufp, const char *str)Append a C string (excluding NUL).Not thread-safe
xBufferReservexErrno xBufferReserve(xBuffer *bufp, size_t additional)Ensure at least additional writable bytes.Not thread-safe

Read

FunctionSignatureDescriptionThread Safety
xBufferDataconst void *xBufferData(xBuffer buf)Pointer to readable data. Valid until next mutation.Not thread-safe
xBufferLensize_t xBufferLen(xBuffer buf)Number of readable bytes.Not thread-safe
xBufferCapsize_t xBufferCap(xBuffer buf)Total allocated capacity.Not thread-safe
xBufferWritablesize_t xBufferWritable(xBuffer buf)Writable bytes (cap - wpos).Not thread-safe
xBufferConsumevoid xBufferConsume(xBuffer buf, size_t n)Advance read position by n bytes.Not thread-safe
xBufferCompactvoid xBufferCompact(xBuffer buf)Move unread data to front, maximize writable space.Not thread-safe

I/O Helpers

FunctionSignatureDescriptionThread Safety
xBufferReadFdssize_t xBufferReadFd(xBuffer *bufp, int fd)Read from fd into buffer (ensures 4KB space).Not thread-safe
xBufferWriteFdssize_t xBufferWriteFd(xBuffer buf, int fd)Write readable data to fd, consume written bytes.Not thread-safe

Usage Examples

Basic Append and Read

#include <stdio.h>
#include <xbuf/buf.h>

int main(void) {
    xBuffer buf = xBufferCreate(256);

    // Append data
    xBufferAppend(&buf, "Hello, ", 7);
    xBufferAppendStr(&buf, "World!");

    // Read data
    printf("Content: %.*s\n", (int)xBufferLen(buf),
           (const char *)xBufferData(buf));
    // Output: Content: Hello, World!

    // Consume partial data
    xBufferConsume(buf, 7);
    printf("After consume: %.*s\n", (int)xBufferLen(buf),
           (const char *)xBufferData(buf));
    // Output: After consume: World!

    // Compact to reclaim consumed space
    xBufferCompact(buf);

    xBufferDestroy(buf);
    return 0;
}

Network I/O

#include <xbuf/buf.h>
#include <unistd.h>

void handle_connection(int sockfd) {
    xBuffer buf = xBufferCreate(4096);

    // Read from socket
    ssize_t n = xBufferReadFd(&buf, sockfd);
    if (n > 0) {
        // Process data...
        // Consume the processed request bytes so only the response is sent
        xBufferConsume(buf, xBufferLen(buf));
        // Write response back
        xBufferAppendStr(&buf, "HTTP/1.1 200 OK\r\n\r\n");
        xBufferWriteFd(buf, sockfd);
    }

    xBufferDestroy(buf);
}

Use Cases

  1. HTTP Response Accumulation — Accumulate response body chunks of unknown total size. The auto-growing behavior handles variable-length responses.

  2. Protocol Parsing — Append incoming data, parse complete messages from the front, consume parsed bytes. The compact operation reclaims space without reallocation.

  3. Log Message Formatting — Build log messages incrementally with multiple append calls before flushing.

Best Practices

  • Always pass &buf to write APIs. Functions that may grow the buffer take xBuffer *bufp because realloc() may relocate the object.
  • Call xBufferCompact() periodically if you consume data incrementally. This avoids unnecessary reallocation by reclaiming consumed space.
  • Check return values. xBufferAppend() and xBufferReserve() return xErrno_NoMemory on allocation failure.
  • Don't cache xBufferData() pointers across mutating calls. Any append/reserve/compact may invalidate the pointer.

Comparison with Other Libraries

Featurexbuf buf.hGo bytes.BufferRust Vec<u8>C++ std::vector<char>
LayoutHeader + data in one allocation (FAM)Separate header + sliceHeap-allocated arrayHeap-allocated array
Growth2x realloc + compact2x (with copy)2x (with copy)Implementation-defined
Read/Write cursorsYes (rpos/wpos)Yes (read offset)No (manual tracking)No (manual tracking)
CompactBuilt-in (xBufferCompact)Built-in (implicit)ManualManual
I/O helpersReadFd/WriteFdReadFrom/WriteToVia Read/Write traitsNo
Handle invalidationCaller updates via *bufpGC handlesBorrow checkerIterator invalidation

Key Differentiator: xBuffer's single-allocation layout (flexible array member) eliminates one level of pointer indirection compared to typical buffer implementations. The compact-before-grow strategy minimizes reallocation frequency for append-consume workloads.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/buf_bench.cpp

Benchmark | Chunk Size | Time (ns) | CPU (ns) | Throughput
BM_Buffer_Append | 16 | 4,776 | 4,776 | 3.1 GiB/s
BM_Buffer_Append | 64 | 4,400 | 4,400 | 13.5 GiB/s
BM_Buffer_Append | 256 | 7,892 | 7,892 | 30.2 GiB/s
BM_Buffer_Append | 1,024 | 21,834 | 21,811 | 43.7 GiB/s
BM_Buffer_Append | 4,096 | 91,029 | 90,958 | 41.9 GiB/s
BM_Buffer_AppendConsume | 64 | 4,999 | 4,999 | 11.9 GiB/s
BM_Buffer_AppendConsume | 256 | 8,241 | 8,240 | 28.9 GiB/s
BM_Buffer_AppendConsume | 1,024 | 22,859 | 22,859 | 41.7 GiB/s

Key Observations:

  • Append throughput peaks at ~44 GiB/s for 1KB chunks, limited by memcpy bandwidth and reallocation overhead.
  • AppendConsume (interleaved append + consume) achieves comparable throughput to pure append, validating the compact-before-grow strategy — consumed space is reclaimed without reallocation.
  • Small chunks (16B) show lower throughput due to per-call overhead dominating the memcpy cost.

ring.h — Fixed-Size Ring Buffer

Introduction

ring.h provides xRingBuffer, a fixed-capacity circular buffer that never reallocates. It is ideal for bounded producer-consumer scenarios where a fixed memory budget is required. The capacity is rounded up to the next power of two internally, enabling bitmask indexing instead of expensive modulo operations.

Design Philosophy

  1. Fixed Capacity, Zero Reallocation — Once created, the ring buffer never grows. Writes that exceed capacity return xErrno_NoMemory. This makes memory usage predictable and avoids allocation latency spikes.

  2. Power-of-Two Masking — The internal capacity is always a power of two. Index computation uses head & mask instead of head % cap, which is significantly faster on most architectures.

  3. Monotonic Cursors — head (write) and tail (read) grow monotonically and never wrap. The actual array index is computed via bitmask. This simplifies the full/empty distinction: head - tail gives the exact readable byte count.

  4. Single Allocation — Like xBuffer, the header and data area are allocated together using a flexible array member.

  5. Scatter-Gather I/O — The ring buffer provides ReadIov/WriteIov helpers that fill iovec arrays for efficient readv()/writev() syscalls, handling the wrap-around transparently.
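
The scatter-gather idea can be sketched as follows. `miniring_read_iov` is a hypothetical analogue of xRingBufferReadIov (not the real implementation), filling at most two iovecs for a possibly wrapped readable region so a single writev() flushes everything (POSIX only):

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec, writev (POSIX) */
#include <unistd.h>

/* Toy ring state: monotonic head/tail cursors, power-of-two capacity. */
typedef struct {
    size_t cap, mask, head, tail;
    char   data[16];
} MiniRingIov;

/* Expose the readable region as up to two iovecs; returns iovec count. */
static int miniring_read_iov(MiniRingIov *rb, struct iovec iov[2]) {
    size_t len = rb->head - rb->tail;
    if (len == 0) return 0;
    size_t pos   = rb->tail & rb->mask;  /* physical read index */
    size_t first = rb->cap - pos;        /* bytes before the array edge */
    if (len <= first) {                  /* contiguous: one iovec */
        iov[0].iov_base = rb->data + pos;
        iov[0].iov_len  = len;
        return 1;
    }
    iov[0].iov_base = rb->data + pos;    /* tail .. end of array */
    iov[0].iov_len  = first;
    iov[1].iov_base = rb->data;          /* wrapped front part */
    iov[1].iov_len  = len - first;
    return 2;
}
```

With the two iovecs in hand, the wrapped data never needs to be linearized: `writev(fd, iov, n)` sends both segments in one syscall.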

Architecture

graph LR
    PRODUCER["Producer"] -->|"xRingBufferWrite"| RB["xRingBuffer<br/>(fixed capacity)"]
    RB -->|"xRingBufferRead"| CONSUMER["Consumer"]

    RB -->|"xRingBufferReadIov"| IOV1["iovec[2]"] -->|"writev()"| FD1["fd"]
    FD2["fd"] -->|"readv()"| IOV2["iovec[2]"] -->|"xRingBufferWriteIov"| RB

    style RB fill:#f5a623,color:#fff

Implementation Details

Memory Layout

Single malloc() allocation:
┌───────────────────────┬──────────────────────────────────────┐
│  xRingBuffer_ header  │  data[cap]  (flexible array member)  │
│  cap, mask, head, tail│                                      │
└───────────────────────┴──────────────────────────────────────┘

Circular data layout (cap=8, mask=7):
         tail & mask          head & mask
              ↓                    ↓
  ┌───┬───┬───┬───┬───┬───┬───┬───┐
  │   │   │ R │ R │ R │ W │   │   │
  └───┴───┴───┴───┴───┴───┴───┴───┘
  0   1   2   3   4   5   6   7

  R = readable data (tail..head)
  W = next write position

Internal Structure

XDEF_STRUCT(xRingBuffer_) {
    size_t cap;   // Capacity (power of two)
    size_t mask;  // cap - 1 (for bitmask indexing)
    size_t head;  // Write cursor (monotonic)
    size_t tail;  // Read cursor (monotonic)
    char   data[];// Flexible array member
};

Power-of-Two Rounding

static size_t next_pow2(size_t v) {
    if (v < 16) v = 16;
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
#if SIZE_MAX > UINT32_MAX
    v |= v >> 32;  // required when size_t is 64 bits
#endif
    return v + 1;
}

This ensures cap is always a power of two, so mask = cap - 1 produces a valid bitmask. For example, cap = 8 → mask = 0b111.

Bitmask Indexing

Instead of:

size_t idx = head % cap;  // Expensive division

The ring buffer uses:

size_t idx = head & mask;  // Single AND instruction

This works because cap is a power of two: x % (2^n) == x & (2^n - 1).
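As a self-contained check, the rounding helper (shown here with the 64-bit step enabled) and the mask/modulo equivalence can be verified directly; `next_pow2` mirrors the sketch above and is illustrative, not the library source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round up to the next power of two (minimum 16), 64-bit safe. */
static size_t next_pow2(size_t v) {
    if (v < 16) v = 16;
    v--;                 /* so exact powers of two map to themselves */
    v |= v >> 1;  v |= v >> 2;  v |= v >> 4;
    v |= v >> 8;  v |= v >> 16;
#if SIZE_MAX > UINT32_MAX
    v |= v >> 32;        /* needed when size_t is 64 bits */
#endif
    return v + 1;
}
```

Because the result is a power of two, `head & (cap - 1)` and `head % cap` agree for every cursor value, which is exactly what lets the ring replace division with a single AND.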

Wrap-Around Write

flowchart TD
    WRITE["xRingBufferWrite(rb, data, len)"]
    CHECK{"len <= writable?"}
    FAIL["Return xErrno_NoMemory"]
    POS["pos = head & mask"]
    FIRST["first = cap - pos"]
    WRAP{"len <= first?"}
    SINGLE["memcpy(data+pos, src, len)"]
    SPLIT["memcpy(data+pos, src, first)<br/>memcpy(data, src+first, len-first)"]
    ADVANCE["head += len"]

    WRITE --> CHECK
    CHECK -->|No| FAIL
    CHECK -->|Yes| POS --> FIRST --> WRAP
    WRAP -->|Yes| SINGLE --> ADVANCE
    WRAP -->|No| SPLIT --> ADVANCE

    style FAIL fill:#e74c3c,color:#fff
    style ADVANCE fill:#50b86c,color:#fff
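
The split-write path in the flowchart can be illustrated with a toy ring. `MiniRing` is a hypothetical stand-in for the real xRingBuffer, reduced to the wrap-around copy logic:

```c
#include <assert.h>
#include <string.h>

/* Toy ring: power-of-two capacity, monotonic head/tail cursors. */
typedef struct {
    size_t cap, mask, head, tail;
    char   data[16];
} MiniRing;

static int miniring_write(MiniRing *rb, const char *src, size_t len) {
    if (len > rb->cap - (rb->head - rb->tail))
        return -1;                       /* would overflow: reject */
    size_t pos   = rb->head & rb->mask;  /* physical write index */
    size_t first = rb->cap - pos;        /* bytes before the array edge */
    if (len <= first) {
        memcpy(rb->data + pos, src, len);            /* single copy */
    } else {
        memcpy(rb->data + pos, src, first);          /* up to the edge */
        memcpy(rb->data, src + first, len - first);  /* wrapped remainder */
    }
    rb->head += len;                     /* cursor grows monotonically */
    return 0;
}
```

The cursors never wrap; only the physical index does, via the bitmask. `head - tail` therefore always gives the exact readable count with no full/empty ambiguity.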

Operations and Complexity

Operation | Time Complexity | Notes
xRingBufferWrite | O(n) | Up to 2 memcpy calls
xRingBufferRead | O(n) | Up to 2 memcpy calls
xRingBufferPeek | O(n) | Like Read but doesn't advance tail
xRingBufferDiscard | O(1) | Just advances tail
xRingBufferLen | O(1) | head - tail
xRingBufferReadFd | O(1) | Single readv() syscall
xRingBufferWriteFd | O(1) | Single writev() syscall

API Reference

Lifecycle

FunctionSignatureDescriptionThread Safety
xRingBufferCreatexRingBuffer xRingBufferCreate(size_t min_cap)Create a ring buffer. Capacity rounded up to power of 2.Not thread-safe
xRingBufferDestroyvoid xRingBufferDestroy(xRingBuffer rb)Free the ring buffer. NULL is a no-op.Not thread-safe
xRingBufferResetvoid xRingBufferReset(xRingBuffer rb)Discard all data, keep memory.Not thread-safe

Query

FunctionSignatureDescriptionThread Safety
xRingBufferLensize_t xRingBufferLen(xRingBuffer rb)Readable bytes.Not thread-safe
xRingBufferCapsize_t xRingBufferCap(xRingBuffer rb)Total capacity.Not thread-safe
xRingBufferWritablesize_t xRingBufferWritable(xRingBuffer rb)Writable bytes.Not thread-safe
xRingBufferEmptybool xRingBufferEmpty(xRingBuffer rb)True if no readable data.Not thread-safe
xRingBufferFullbool xRingBufferFull(xRingBuffer rb)True if no writable space.Not thread-safe

Write

FunctionSignatureDescriptionThread Safety
xRingBufferWritexErrno xRingBufferWrite(xRingBuffer rb, const void *data, size_t len)Write bytes. Returns xErrno_NoMemory if full.Not thread-safe

Read

FunctionSignatureDescriptionThread Safety
xRingBufferReadsize_t xRingBufferRead(xRingBuffer rb, void *out, size_t len)Read and consume bytes. Returns actual count.Not thread-safe
xRingBufferPeeksize_t xRingBufferPeek(xRingBuffer rb, void *out, size_t len)Read without consuming.Not thread-safe
xRingBufferDiscardsize_t xRingBufferDiscard(xRingBuffer rb, size_t n)Discard bytes without copying.Not thread-safe

I/O Helpers

FunctionSignatureDescriptionThread Safety
xRingBufferReadIovint xRingBufferReadIov(xRingBuffer rb, struct iovec iov[2])Fill iovecs with readable regions (for writev).Not thread-safe
xRingBufferWriteIovint xRingBufferWriteIov(xRingBuffer rb, struct iovec iov[2])Fill iovecs with writable regions (for readv).Not thread-safe
xRingBufferReadFdssize_t xRingBufferReadFd(xRingBuffer rb, int fd)Read from fd using readv().Not thread-safe
xRingBufferWriteFdssize_t xRingBufferWriteFd(xRingBuffer rb, int fd)Write to fd using writev().Not thread-safe

Usage Examples

Basic FIFO

#include <stdio.h>
#include <xbuf/ring.h>

int main(void) {
    // Request 1000 bytes; actual capacity will be 1024 (next power of 2)
    xRingBuffer rb = xRingBufferCreate(1000);
    printf("Capacity: %zu\n", xRingBufferCap(rb)); // 1024

    // Write data
    const char *msg = "Hello, Ring!";
    xRingBufferWrite(rb, msg, 12);

    // Read data
    char out[32];
    size_t n = xRingBufferRead(rb, out, sizeof(out));
    printf("Read %zu bytes: %.*s\n", n, (int)n, out);

    xRingBufferDestroy(rb);
    return 0;
}

Network Socket Buffer

#include <xbuf/ring.h>

void event_loop_handler(int sockfd) {
    xRingBuffer rb = xRingBufferCreate(65536); // 64KB ring

    // Read from socket into ring buffer
    ssize_t n = xRingBufferReadFd(rb, sockfd);
    if (n > 0) {
        // Process data...
        // Write processed data back
        xRingBufferWriteFd(rb, sockfd);
    }

    xRingBufferDestroy(rb);
}

Use Cases

  1. Fixed-Budget Network Buffers — When you need predictable memory usage per connection (e.g., 64KB per socket), the ring buffer provides a hard capacity limit.

  2. Logging Ring Buffer — Capture the last N bytes of log output, automatically discarding old data when the buffer wraps.

  3. Inter-Thread Communication — With external synchronization, a ring buffer can serve as a bounded channel between producer and consumer threads.

Best Practices

  • Choose capacity carefully. The ring buffer never grows. If you write more than the capacity, the write fails. Size it for your worst-case scenario.
  • Use scatter-gather I/O. xRingBufferReadFd/WriteFd use readv()/writev() to handle wrap-around in a single syscall, avoiding the need to linearize data.
  • Be aware of power-of-two rounding. Requesting 1000 bytes gives you 1024. Requesting 1025 gives you 2048. Plan accordingly.
  • Check xRingBufferWritable() before writing if you want to handle partial writes gracefully.

Comparison with Other Libraries

Featurexbuf ring.hLinux kfifoBoost circular_bufferDPDK rte_ring
CapacityFixed, power-of-2Fixed, power-of-2Fixed, any sizeFixed, power-of-2
IndexingBitmaskBitmaskModuloBitmask
LayoutFAM (single alloc)Separate allocHeap arrayHuge pages
Thread SafetyNot thread-safeSingle-producer/single-consumerNot thread-safeMulti-producer/multi-consumer
I/O Helpersreadv/writevkfifo_to_user/kfifo_from_userNoNo (packet-oriented)
LanguageC99C (kernel)C++C

Key Differentiator: xbuf's ring buffer combines the power-of-two bitmask optimization (like kfifo) with scatter-gather I/O helpers (readv/writev) in a single-allocation design. It's purpose-built for event-driven network programming where fixed memory budgets and efficient syscalls are essential.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/ring_bench.cpp

Benchmark | Size | Time (ns) | CPU (ns) | Throughput
BM_Ring_WriteRead | 64 | 6.05 | 6.05 | 19.7 GiB/s
BM_Ring_WriteRead | 256 | 16.8 | 16.8 | 28.4 GiB/s
BM_Ring_WriteRead | 1,024 | 27.4 | 27.4 | 69.6 GiB/s
BM_Ring_WriteRead | 4,096 | 99.2 | 99.2 | 76.9 GiB/s
BM_Ring_Throughput | 4,096 | 225 | 225 | 17.0 GiB/s
BM_Ring_Throughput | 16,384 | 806 | 806 | 18.9 GiB/s
BM_Ring_Throughput | 65,536 | 3,198 | 3,198 | 19.1 GiB/s

Key Observations:

  • WriteRead (single write + read cycle) achieves up to ~77 GiB/s at 4KB chunks, demonstrating the efficiency of the bitmask-based wrap-around and memcpy for larger transfers.
  • Throughput (sustained writes until full) stabilizes at ~19 GiB/s regardless of capacity, showing consistent performance as the ring scales.
  • The ring buffer's cheap bitmask indexing (a single AND instead of modulo) keeps per-operation cost extremely low — about 6 ns for a 64-byte write+read cycle.

io.h — Reference-Counted Block-Chain I/O Buffer

Introduction

io.h provides xIOBuffer, a non-contiguous byte buffer composed of a chain of reference-counted memory blocks. It supports zero-copy split, append, and scatter-gather I/O (readv/writev). Inspired by brpc's IOBuf, it is designed for high-throughput network I/O where avoiding memory copies is critical.

Design Philosophy

  1. Block-Chain Architecture — Data is stored across multiple fixed-size blocks (default 8KB each), linked through a reference array. This avoids large contiguous allocations and enables zero-copy operations.

  2. Reference Counting — Each xIOBlock is reference-counted. Multiple xIOBuffer instances can share the same block (e.g., after a Cut operation). Blocks are freed (returned to pool) when the last reference is released.

  3. Zero-Copy Operations — xIOBufferAppendIOBuffer() transfers block references without copying data. xIOBufferCut() splits a buffer by adjusting offsets and sharing blocks at the boundary.

  4. Lock-Free Block Pool — Released blocks are returned to a global Treiber stack (lock-free) for reuse, avoiding malloc/free overhead in steady state.

  5. Inline Ref Array — Small buffers (≤ 8 refs) use an inline array, avoiding heap allocation for the ref array itself. Larger buffers transition to a heap-allocated array.

Architecture

graph TD
    subgraph "xIOBuffer API"
        APPEND["Append / AppendStr"]
        APPEND_IO["AppendIOBuffer<br/>(zero-copy)"]
        READ["Read / CopyTo"]
        CUT["Cut<br/>(zero-copy split)"]
        CONSUME["Consume"]
        IO_READ["ReadFd"]
        IO_WRITE["WriteFd<br/>(writev)"]
    end

    subgraph "Block Management"
        ACQUIRE["xIOBlockAcquire"]
        RETAIN["xIOBlockRetain"]
        RELEASE["xIOBlockRelease"]
    end

    subgraph "Block Pool (Treiber Stack)"
        POOL["g_pool_head"]
        WARMUP["xIOBlockPoolWarmup"]
        DRAIN["xIOBlockPoolDrain"]
    end

    APPEND --> ACQUIRE
    IO_READ --> ACQUIRE
    CUT --> RETAIN
    CONSUME --> RELEASE
    READ --> RELEASE
    ACQUIRE --> POOL
    RELEASE --> POOL
    WARMUP --> POOL
    DRAIN --> POOL

    style POOL fill:#f5a623,color:#fff

Implementation Details

Block Structure

XDEF_STRUCT(xIOBlock) {
    size_t refs;                       // Reference count (atomic)
    size_t size;                       // Usable data size
    char   data[XIOBUFFER_BLOCK_SIZE]; // 8KB inline data
};

Reference Structure

XDEF_STRUCT(xIOBufferRef) {
    xIOBlock *block;   // Pointer to the underlying block
    size_t    offset;  // Start offset within block->data
    size_t    length;  // Number of valid bytes from offset
};

IOBuffer Structure

XDEF_STRUCT(xIOBuffer) {
    xIOBufferRef  inlined[XIOBUFFER_INLINE_REFS]; // Inline ref storage (8)
    xIOBufferRef *refs;    // Pointer to ref array (inlined or heap)
    size_t        nrefs;   // Number of active refs
    size_t        cap;     // Capacity of refs array
    size_t        nbytes;  // Total logical byte count (cached)
};

Block-Chain Architecture

graph TD
    subgraph "xIOBuffer"
        REF1["Ref 0<br/>block=A, off=0, len=8192"]
        REF2["Ref 1<br/>block=B, off=0, len=8192"]
        REF3["Ref 2<br/>block=C, off=0, len=3000"]
    end

    subgraph "Shared Blocks"
        A["xIOBlock A<br/>refs=1, 8KB"]
        B["xIOBlock B<br/>refs=2, 8KB"]
        C["xIOBlock C<br/>refs=1, 8KB"]
    end

    REF1 --> A
    REF2 --> B
    REF3 --> C

    subgraph "Another xIOBuffer (after Cut)"
        REF4["Ref 0<br/>block=B, off=4096, len=4096"]
    end

    REF4 --> B

    style A fill:#4a90d9,color:#fff
    style B fill:#f5a623,color:#fff
    style C fill:#50b86c,color:#fff

Treiber Stack Block Pool

The global block pool uses a lock-free Treiber stack:

// Pool node overlays xIOBlock memory
XDEF_STRUCT(PoolNode_) {
    PoolNode_ *next;
};

static PoolNode_ *volatile g_pool_head = NULL;

Push (return to pool):

do {
    head = atomic_load(g_pool_head)
    node->next = head
} while (!CAS(g_pool_head, head, node))

Pop (acquire from pool):

do {
    head = atomic_load(g_pool_head)
    if (!head) return malloc(new block)
    next = head->next
} while (!CAS(g_pool_head, head, next))
return head
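
The push/pop pseudocode above maps directly onto C11 atomics. This is a minimal single-file sketch (ignoring the ABA hazards and memory-ordering tuning a production pool must consider), not the actual xbuf source:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Treiber stack node: overlays the pooled object's memory. */
typedef struct Node { struct Node *next; } Node;

static _Atomic(Node *) g_head = NULL;

/* Push: point the node at the current head, then CAS it in. */
static void pool_push(Node *n) {
    Node *head = atomic_load(&g_head);
    do {
        n->next = head;
    } while (!atomic_compare_exchange_weak(&g_head, &head, n));
    /* On CAS failure, `head` is reloaded with the current value. */
}

/* Pop: CAS the head to its successor; NULL means the pool is empty. */
static Node *pool_pop(void) {
    Node *head = atomic_load(&g_head);
    while (head &&
           !atomic_compare_exchange_weak(&g_head, &head, head->next))
        ;                /* retry with the freshly loaded head */
    return head;
}
```

In the real pool, a NULL pop result falls back to malloc(), and a node pushed back at refcount zero makes the block reusable without touching the allocator.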

Zero-Copy Cut

xIOBufferCut(io, dst, n) moves the first n bytes from io to dst:

  1. Fully consumed refs — Ownership transfers directly (no refcount change).
  2. Boundary ref — The block is shared: xIOBlockRetain() increments the refcount, and both buffers hold a ref with different offset/length.

flowchart TD
    CUT["xIOBufferCut(io, dst, n)"]
    LOOP{"More bytes to cut?"}
    FULL{"ref.length <= remaining?"}
    TRANSFER["Transfer entire ref to dst<br/>(no refcount change)"]
    SPLIT["Share block: Retain + split ref<br/>dst gets [offset, chunk]<br/>io keeps [offset+chunk, rest]"]
    SHIFT["Shift consumed refs out of io"]
    DONE["Update nbytes for both"]

    CUT --> LOOP
    LOOP -->|Yes| FULL
    FULL -->|Yes| TRANSFER --> LOOP
    FULL -->|No| SPLIT --> SHIFT --> DONE
    LOOP -->|No| SHIFT

    style TRANSFER fill:#50b86c,color:#fff
    style SPLIT fill:#f5a623,color:#fff
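
The boundary-ref case reduces to a few lines of offset/length arithmetic. `Ref` and `ref_split` below are hypothetical simplifications of the real structures, shown to make the sharing explicit:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified ref: a window (offset, length) into a refcounted block. */
typedef struct {
    int   *refcount;   /* stand-in for the block's atomic refcount */
    size_t offset;
    size_t length;
} Ref;

/* Split src at `chunk` bytes: dst gets the front window, src keeps
 * the rest. Both windows share the same block, so the refcount is
 * bumped (the stand-in for xIOBlockRetain). No data is copied. */
static void ref_split(Ref *src, Ref *dst, size_t chunk) {
    ++*src->refcount;
    dst->refcount = src->refcount;
    dst->offset   = src->offset;
    dst->length   = chunk;
    src->offset  += chunk;
    src->length  -= chunk;
}
```

Fully consumed refs skip this entirely: their entry is moved to the destination's ref array as-is, so the refcount never changes.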

Append Strategy

xIOBufferAppend(io, data, len):

  1. First tries to fill the tail block's remaining space (avoids allocating a new block for small appends).
  2. Allocates new blocks for remaining data, each up to XIOBUFFER_BLOCK_SIZE bytes.
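
The arithmetic behind this two-step strategy can be sketched as follows; `append_plan` is an illustrative helper, not part of the API:

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 8192   /* mirrors the default XIOBUFFER_BLOCK_SIZE */

/* Given the tail block's free space and an append length, compute how
 * many bytes land in the tail and how many new blocks are needed. */
static size_t append_plan(size_t tail_room, size_t len,
                          size_t *new_blocks) {
    size_t into_tail = len < tail_room ? len : tail_room;
    size_t rest      = len - into_tail;
    *new_blocks = (rest + BLOCK_SIZE - 1) / BLOCK_SIZE;  /* ceil div */
    return into_tail;
}
```

Filling the tail first is what keeps a stream of small appends from allocating one block per call: block acquisition only happens once the current tail is exhausted.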

API Reference

Configuration

Macro | Default | Description
XIOBUFFER_BLOCK_SIZE | 8192 | Block data size in bytes
XIOBUFFER_INLINE_REFS | 8 | Inline ref array capacity

Block API

FunctionSignatureDescriptionThread Safety
xIOBlockAcquirexIOBlock *xIOBlockAcquire(void)Get a block from pool (or malloc). refs=1.Thread-safe (lock-free pool)
xIOBlockRetainvoid xIOBlockRetain(xIOBlock *blk)Increment refcount.Thread-safe (atomic)
xIOBlockReleasevoid xIOBlockRelease(xIOBlock *blk)Decrement refcount; return to pool at 0.Thread-safe (atomic + lock-free pool)
xIOBlockPoolWarmupxErrno xIOBlockPoolWarmup(size_t n)Pre-allocate n blocks into pool.Thread-safe
xIOBlockPoolDrainvoid xIOBlockPoolDrain(void)Free all pooled blocks. Call at shutdown.Not thread-safe (no concurrent use)

IOBuffer Lifecycle

FunctionSignatureDescriptionThread Safety
xIOBufferInitvoid xIOBufferInit(xIOBuffer *io)Initialize an empty IOBuffer.Not thread-safe
xIOBufferDeinitvoid xIOBufferDeinit(xIOBuffer *io)Release all refs and free ref array.Not thread-safe
xIOBufferResetvoid xIOBufferReset(xIOBuffer *io)Release all refs, keep ref array.Not thread-safe

IOBuffer Query

FunctionSignatureDescriptionThread Safety
xIOBufferLensize_t xIOBufferLen(const xIOBuffer *io)Total readable bytes.Not thread-safe
xIOBufferEmptybool xIOBufferEmpty(const xIOBuffer *io)True if no data.Not thread-safe
xIOBufferRefCountsize_t xIOBufferRefCount(const xIOBuffer *io)Number of block refs.Not thread-safe

IOBuffer Write

FunctionSignatureDescriptionThread Safety
xIOBufferAppendxErrno xIOBufferAppend(xIOBuffer *io, const void *data, size_t len)Append bytes (allocates blocks as needed).Not thread-safe
xIOBufferAppendStrxErrno xIOBufferAppendStr(xIOBuffer *io, const char *str)Append C string.Not thread-safe
xIOBufferAppendIOBufferxErrno xIOBufferAppendIOBuffer(xIOBuffer *io, xIOBuffer *other)Zero-copy: move all refs from other.Not thread-safe

IOBuffer Read

FunctionSignatureDescriptionThread Safety
xIOBufferReadsize_t xIOBufferRead(xIOBuffer *io, void *out, size_t len)Copy and consume bytes.Not thread-safe
xIOBufferCutsize_t xIOBufferCut(xIOBuffer *io, xIOBuffer *dst, size_t n)Zero-copy split: move first n bytes to dst.Not thread-safe
xIOBufferConsumesize_t xIOBufferConsume(xIOBuffer *io, size_t n)Discard first n bytes.Not thread-safe
xIOBufferCopyTosize_t xIOBufferCopyTo(const xIOBuffer *io, void *out)Linearize: copy all data to contiguous buffer.Not thread-safe

IOBuffer I/O

FunctionSignatureDescriptionThread Safety
xIOBufferReadIovint xIOBufferReadIov(const xIOBuffer *io, struct iovec *iov, int max_iov)Fill iovecs for writev().Not thread-safe
xIOBufferReadFdssize_t xIOBufferReadFd(xIOBuffer *io, int fd)Read from fd into IOBuffer.Not thread-safe
xIOBufferWriteFdssize_t xIOBufferWriteFd(xIOBuffer *io, int fd)Write to fd using writev().Not thread-safe

Usage Examples

Basic Usage

#include <stdio.h>
#include <xbuf/io.h>

int main(void) {
    xIOBuffer io;
    xIOBufferInit(&io);

    // Append data (may span multiple blocks)
    xIOBufferAppend(&io, "Hello, ", 7);
    xIOBufferAppend(&io, "IOBuffer!", 9);

    printf("Length: %zu, Refs: %zu\n",
           xIOBufferLen(&io), xIOBufferRefCount(&io));

    // Linearize for processing
    char buf[64];
    xIOBufferCopyTo(&io, buf);
    printf("Content: %.*s\n", (int)xIOBufferLen(&io), buf);

    xIOBufferDeinit(&io);
    return 0;
}

Zero-Copy Split (Protocol Parsing)

#include <xbuf/io.h>

void parse_protocol(xIOBuffer *io) {
    // Cut the 4-byte header from the front
    xIOBuffer header;
    xIOBufferInit(&header);

    size_t cut = xIOBufferCut(io, &header, 4);
    if (cut == 4) {
        char hdr[4];
        xIOBufferRead(&header, hdr, 4);
        // Parse header...
        // io now contains only the body (zero-copy!)
    }

    xIOBufferDeinit(&header);
}

High-Throughput Network I/O

#include <xbuf/io.h>

void handle_data(int sockfd) {
    // Pre-warm the block pool at startup
    xIOBlockPoolWarmup(64);

    xIOBuffer io;
    xIOBufferInit(&io);

    // Read from socket (allocates blocks from pool)
    ssize_t n = xIOBufferReadFd(&io, sockfd);
    if (n > 0) {
        // Write back using scatter-gather I/O
        xIOBufferWriteFd(&io, sockfd);
    }

    xIOBufferDeinit(&io);

    // At shutdown
    xIOBlockPoolDrain();
}

Use Cases

  1. HTTP Response Body — The xhttp module uses xIOBuffer to accumulate response chunks from libcurl without copying between buffers.

  2. Protocol Framing — Use xIOBufferCut() to split headers from body in a zero-copy fashion, then process each part independently.

  3. Data Pipeline — Chain multiple processing stages that each append to or cut from xIOBuffer instances, sharing blocks to minimize copies.

Best Practices

  • Call xIOBlockPoolWarmup() at startup to pre-allocate blocks and avoid allocation spikes during initial traffic.
  • Call xIOBlockPoolDrain() at shutdown for clean valgrind reports.
  • Use xIOBufferAppendIOBuffer() instead of copying when combining buffers. It transfers ownership without data copies.
  • Use xIOBufferCut() for protocol parsing. It's more efficient than xIOBufferRead() when you need to pass the cut data to another component.
  • Monitor xIOBufferRefCount() to understand memory fragmentation. Many small refs may indicate suboptimal block utilization.

Comparison with Other Libraries

Featurexbuf io.hbrpc IOBufNetty ByteBufGo bytes.Buffer
ArchitectureBlock-chain (ref array)Block-chain (linked list)Composite bufferContiguous slice
Block Size8KB (configurable)8KBConfigurableN/A
Reference CountingAtomic (per block)Atomic (per block)Atomic (per buffer)GC
Zero-Copy SplitxIOBufferCutcutnsliceNo
Zero-Copy AppendxIOBufferAppendIOBufferappend(IOBuf)addComponentNo
Block PoolTreiber stack (lock-free)Thread-local + globalArena allocatorN/A
Scatter-Gather I/Owritev via ReadIovwritev via pappendnioBuffersNo
Inline Optimization8 inline refsNoNoN/A
LanguageC99C++JavaGo

Key Differentiator: xbuf's xIOBuffer combines brpc-style block-chain architecture with a lock-free Treiber stack block pool and inline ref optimization. The zero-copy Cut and AppendIOBuffer operations make it ideal for protocol parsing and data pipeline scenarios in C.

Benchmark

Environment: Apple M3 Pro, 36 GB RAM, macOS 26.4, Release build (-O2). Source: xbuf/io_bench.cpp

Benchmark | Size | Time (ns) | CPU (ns) | Throughput
BM_IOBuffer_Append | 64 | 3,720 | 3,720 | 16.0 GiB/s
BM_IOBuffer_Append | 256 | 7,569 | 7,568 | 31.5 GiB/s
BM_IOBuffer_Append | 1,024 | 22,341 | 22,340 | 42.7 GiB/s
BM_IOBuffer_Append | 4,096 | 79,796 | 79,794 | 47.8 GiB/s
BM_IOBuffer_Append | 8,192 | 187,167 | 187,165 | 40.8 GiB/s
BM_IOBuffer_AppendConsume | 64 | 5,230 | 5,230 | 11.4 GiB/s
BM_IOBuffer_AppendConsume | 256 | 8,232 | 8,232 | 29.0 GiB/s
BM_IOBuffer_AppendConsume | 1,024 | 23,040 | 23,040 | 41.4 GiB/s
BM_IOBuffer_Cut | 8,192 | 167 | 167 | 45.6 GiB/s
BM_IOBuffer_Cut | 65,536 | 1,651 | 1,651 | 37.0 GiB/s
BM_IOBuffer_Cut | 262,144 | 8,122 | 8,122 | 30.1 GiB/s
BM_IOBuffer_AppendIOBuffer | 1,024 | 3,196 | 3,196 | 29.8 GiB/s
BM_IOBuffer_AppendIOBuffer | 4,096 | 9,307 | 9,307 | 41.0 GiB/s
BM_IOBuffer_AppendIOBuffer | 8,192 | 17,604 | 17,602 | 43.3 GiB/s
BM_IOBuffer_BlockPool | - | 8.91 | 8.89 | -

Key Observations:

  • Append peaks at ~48 GiB/s for 4KB chunks. The slight drop at 8KB reflects block boundary crossing overhead.
  • Cut (zero-copy split) is extremely fast — 167ns for 8KB — because it only manipulates reference metadata, not data. This validates the block-chain architecture for protocol parsing.
  • AppendIOBuffer (zero-copy concatenation) achieves ~43 GiB/s, confirming that block ownership transfer avoids data copies.
  • BlockPool acquire/release cycle takes ~9ns, showing the lock-free Treiber stack's efficiency for block recycling.

xnet — Networking Primitives

Introduction

xnet is xKit's networking utility module, providing three foundational components for network programming: a lightweight URL parser, an asynchronous DNS resolver, and shared TLS configuration types. These building blocks are used internally by higher-level modules like xhttp, and are also available for direct use in application code.

Design Philosophy

  1. Single-Copy URL Parsing — xUrlParse() makes a single internal copy of the input string. All component fields (scheme, host, port, etc.) are pointer+length pairs referencing this copy, avoiding per-field allocations.

  2. Async DNS via Thread-Pool Offload — DNS resolution uses getaddrinfo() offloaded to the event loop's thread pool. The callback is always invoked on the event loop thread, keeping the async programming model consistent with the rest of xKit.

  3. Shared TLS Types — xTlsConf is a plain data structure shared across modules. It decouples TLS configuration from any specific TLS backend (OpenSSL, mbedTLS).

  4. Async TCP with Transport Abstraction — xTcpConnect chains DNS → connect → optional TLS handshake into a single async operation. xTcpConn wraps an xSocket + xTransport vtable, providing Recv/Send/SendIov helpers that work transparently over plain TCP or TLS.

Architecture

graph TD
    subgraph "xnet Module"
        URL["xUrl<br/>URL Parser<br/>url.h"]
        DNS["xDnsResolve<br/>Async DNS<br/>dns.h"]
        TLS["xTlsConf<br/>TLS Config Types<br/>tls.h"]
        TCP["xTcpConn / xTcpConnect / xTcpListener<br/>Async TCP<br/>tcp.h"]
    end

    subgraph "xbase Infrastructure"
        EV["xEventLoop<br/>event.h"]
        POOL["Thread Pool<br/>xEventLoopSubmit()"]
        ATOMIC["Atomic Ops<br/>atomic.h"]
    end

    subgraph "Consumers"
        HTTP_C["xhttp Client"]
        HTTP_S["xhttp Server"]
        WS["WebSocket"]
    end

    DNS --> EV
    DNS --> POOL
    DNS --> ATOMIC
    TCP --> EV
    TCP --> DNS
    TCP --> TLS

    HTTP_C --> URL
    HTTP_C --> TCP
    HTTP_S --> TCP
    WS --> URL
    WS --> TCP

    style URL fill:#4a90d9,color:#fff
    style DNS fill:#50b86c,color:#fff
    style TLS fill:#f5a623,color:#fff
    style TCP fill:#e74c3c,color:#fff

Sub-Module Overview

HeaderComponentDescriptionDoc
url.hxUrlLightweight URL parserurl.md
dns.hxDnsResolveAsync DNS resolutiondns.md
tls.hxTlsConfShared TLS config typestls.md
tcp.hxTcpConn / xTcpConnect / xTcpListenerAsync TCP connection, connector & listenertcp.md

Quick Start

#include <stdio.h>
#include <xbase/event.h>
#include <xnet/url.h>
#include <xnet/dns.h>
#include <xnet/tls.h>

// 1. Parse a URL
static void url_example(void) {
    xUrl url;
    xErrno err = xUrlParse(
        "wss://example.com:8443/ws?token=abc", &url);
    if (err == xErrno_Ok) {
        printf("scheme: %.*s\n",
               (int)url.scheme_len, url.scheme);
        printf("host:   %.*s\n",
               (int)url.host_len, url.host);
        printf("port:   %u\n", xUrlPort(&url));
        printf("path:   %.*s\n",
               (int)url.path_len, url.path);
        xUrlFree(&url);
    }
}

// 2. Async DNS resolution
static void on_resolved(xDnsResult *result, void *arg) {
    (void)arg;
    if (result->error == xErrno_Ok) {
        int count = 0;
        for (xDnsAddr *a = result->addrs; a; a = a->next)
            count++;
        printf("Resolved %d address(es)\n", count);
    }
    xDnsResultFree(result);
    // stop the loop after resolution
}

static void dns_example(xEventLoop loop) {
    xDnsResolve(loop, "example.com", "443",
                NULL, on_resolved, NULL);
}

// 3. TLS configuration
static void tls_example(void) {
    xTlsConf client_tls = {0};
    client_tls.ca = "ca.pem";

    xTlsConf server_tls = {
        .cert = "server.pem",
        .key  = "server-key.pem",
    };
    (void)client_tls;
    (void)server_tls;
}

Relationship with Other Modules

  • xbase — The DNS resolver depends on xEventLoop for thread-pool offload and uses atomic.h for the cancellation flag.
  • xhttp — The HTTP client uses xUrl for URL parsing, xDnsResolve for hostname resolution, and xTlsConf for TLS configuration. The WebSocket client supports both xTlsConf and a shared xTlsCtx for wss:// connections. See the TLS Deployment Guide for end-to-end examples.
  • WebSocket — The WebSocket client uses xUrl to parse ws:// and wss:// URLs, and optionally accepts a shared xTlsCtx to avoid per-connection TLS context creation.

url.h — Lightweight URL Parser

Introduction

url.h provides xUrl, a lightweight URL parser that decomposes a URL string into its RFC 3986 components: scheme, userinfo, host, port, path, query, and fragment. The parser makes a single internal copy of the input; all component fields are pointer+length pairs referencing this copy, so the caller may discard the original string immediately after parsing.

Design Philosophy

  1. Single Copy, Zero Per-Field Allocation — xUrlParse() calls strdup() once. All output fields point into this copy, avoiding per-component heap allocations.

  2. Pointer+Length Pairs — Fields use const char * + size_t pairs rather than NUL-terminated strings. This avoids mutating the internal copy and supports efficient substring access.

  3. Scheme-Aware Default Ports — xUrlPort() returns well-known default ports (80 for http/ws, 443 for https/wss) when no explicit port is present, simplifying connection logic.

  4. IPv6 Literal Support — The parser correctly handles bracketed IPv6 addresses ([::1]:8080), extracting the bare address without brackets.

Architecture

flowchart LR
    INPUT["Raw URL string"]
    PARSE["xUrlParse()"]
    COPY["strdup() internal copy"]
    FIELDS["Pointer+Length fields"]
    PORT["xUrlPort()"]
    FREE["xUrlFree()"]

    INPUT --> PARSE
    PARSE --> COPY
    COPY --> FIELDS
    FIELDS --> PORT
    FIELDS --> FREE

    style PARSE fill:#4a90d9,color:#fff
    style FREE fill:#e74c3c,color:#fff

Implementation Details

URL Format

scheme://[userinfo@]host[:port][/path][?query][#fragment]

Parsing Steps

flowchart TD
    START["Input: raw URL string"]
    SCHEME["Find '://' → extract scheme"]
    AUTH["Parse authority section"]
    USERINFO{"Contains '@'?"}
    UI_YES["Extract userinfo"]
    HOST{"Starts with '['?"}
    IPV6["Parse IPv6 bracket literal"]
    IPV4["Scan backwards for ':'"]
    PORT["Extract port (if present)"]
    PATH{"Starts with '/'?"}
    PATH_YES["Extract path"]
    QUERY{"Starts with '?'?"}
    QUERY_YES["Extract query"]
    FRAG{"Starts with '#'?"}
    FRAG_YES["Extract fragment"]
    DONE["Return xErrno_Ok"]

    START --> SCHEME --> AUTH
    AUTH --> USERINFO
    USERINFO -->|Yes| UI_YES --> HOST
    USERINFO -->|No| HOST
    HOST -->|Yes| IPV6 --> PORT
    HOST -->|No| IPV4 --> PORT
    PORT --> PATH
    PATH -->|Yes| PATH_YES --> QUERY
    PATH -->|No| QUERY
    QUERY -->|Yes| QUERY_YES --> FRAG
    QUERY -->|No| FRAG
    FRAG -->|Yes| FRAG_YES --> DONE
    FRAG -->|No| DONE

    style DONE fill:#50b86c,color:#fff

Memory Layout

xUrl struct (stack or heap):
┌──────────┬──────────────────────────────────┐
│  raw_    │→ strdup("https://host:443/path") │
│  scheme  │→ ───────┘                        │
│  host    │→ ──────────────┘                 │
│  port    │→ ───────────────────┘            │
│  path    │→ ────────────────────────┘       │
│  ...     │                                  │
└──────────┴──────────────────────────────────┘
All pointers reference the single raw_ copy.

Operations and Complexity

OperationComplexityNotes
xUrlParseO(n)Single pass over the URL string
xUrlPortO(1)Converts port string or returns default
xUrlFreeO(1)Frees the internal copy, zeroes struct

API Reference

Lifecycle

FunctionSignatureDescription
xUrlParsexErrno xUrlParse(const char *raw, xUrl *url)Parse a URL into components
xUrlFreevoid xUrlFree(xUrl *url)Free internal copy, zero all fields

Query

FunctionSignatureDescription
xUrlPortuint16_t xUrlPort(const xUrl *url)Numeric port (explicit or default by scheme)

xUrl Fields

FieldTypeDescription
scheme / scheme_lenconst char * / size_te.g. "https"
userinfo / userinfo_lenconst char * / size_te.g. "user:pass" (optional)
host / host_lenconst char * / size_te.g. "example.com" or "::1"
port / port_lenconst char * / size_te.g. "8443" (optional)
path / path_lenconst char * / size_te.g. "/ws/chat" (optional)
query / query_lenconst char * / size_te.g. "key=val" (optional)
fragment / fragment_lenconst char * / size_te.g. "section1" (optional)

Note: Optional fields have ptr=NULL, len=0 when absent. The raw_ field is internal — do not access it.

Usage Examples

Basic URL Parsing

#include <stdio.h>
#include <xnet/url.h>

int main(void) {
    xUrl url;
    xErrno err = xUrlParse("https://user:[email protected]:8443/ws/chat?token=abc#top", &url);
    if (err != xErrno_Ok) {
        fprintf(stderr, "parse failed\n");
        return 1;
    }

    printf("scheme:   %.*s\n", (int)url.scheme_len, url.scheme);
    printf("userinfo: %.*s\n", (int)url.userinfo_len, url.userinfo);
    printf("host:     %.*s\n", (int)url.host_len, url.host);
    printf("port:     %.*s (numeric: %u)\n", (int)url.port_len, url.port, xUrlPort(&url));
    printf("path:     %.*s\n", (int)url.path_len, url.path);
    printf("query:    %.*s\n", (int)url.query_len, url.query);
    printf("fragment: %.*s\n", (int)url.fragment_len, url.fragment);

    xUrlFree(&url);
    return 0;
}

Output:

scheme:   https
userinfo: user:pass
host:     example.com
port:     8443 (numeric: 8443)
path:     /ws/chat
query:    token=abc
fragment: top

IPv6 Address

xUrl url;
xUrlParse("http://[::1]:8080/test", &url);

printf("host: %.*s\n", (int)url.host_len, url.host);
// Output: host: ::1  (brackets stripped)

printf("port: %u\n", xUrlPort(&url));
// Output: port: 8080

xUrlFree(&url);

Default Port by Scheme

xUrl url;
xUrlParse("wss://echo.example.com/sock", &url);

// No explicit port in URL
printf("port field: %s\n", url.port ? "present" : "absent");
// Output: port field: absent

// xUrlPort() returns 443 for wss://
printf("effective port: %u\n", xUrlPort(&url));
// Output: effective port: 443

xUrlFree(&url);

Ownership Semantics

// xUrl owns its data — the original string can be freed
char *heap = strdup("ws://example.com:9090/ws");
xUrl url;
xUrlParse(heap, &url);
free(heap);  // safe: xUrl has its own copy

// url fields are still valid here
printf("host: %.*s\n", (int)url.host_len, url.host);

xUrlFree(&url);
// After free, all fields are zeroed (NULL)

Error Handling

InputResult
NULL raw or url pointerxErrno_InvalidArg
Missing :// separatorxErrno_InvalidArg
Empty host (e.g. http:///path)xErrno_InvalidArg
Unclosed IPv6 bracketxErrno_InvalidArg
malloc failurexErrno_NoMemory

On error, the xUrl struct is zeroed — no cleanup needed.

Best Practices

  • Always check the return value of xUrlParse(). On error the struct is zeroed, so accessing fields is safe but yields empty values.
  • Use xUrlPort() instead of parsing the port string yourself. It handles default ports and validates the numeric range (0–65535).
  • Call xUrlFree() when done. Forgetting to free leaks the internal string copy.
  • Don't cache field pointers past xUrlFree(). All pointers become invalid after the free call.

dns.h — Asynchronous DNS Resolution

Introduction

dns.h provides asynchronous DNS resolution by offloading getaddrinfo() to the event loop's thread pool. The completion callback is always invoked on the event loop thread, maintaining xKit's single-threaded callback model. Queries can be cancelled before the callback fires.

Design Philosophy

  1. Thread-Pool Offload — getaddrinfo() is a blocking POSIX call. Rather than introducing a dedicated DNS thread, xnet reuses the event loop's existing thread pool via xEventLoopSubmit().

  2. Event-Loop-Thread Callbacks — The done callback runs on the event loop thread, so user code never needs synchronization. This is consistent with every other callback in xKit.

  3. Linked-List Result — Resolved addresses are returned as a linked list of xDnsAddr nodes, preserving the full getaddrinfo() result (family, socktype, protocol) for each address.

  4. Cancellation Support — xDnsCancel() sets an atomic flag. If the worker has already finished, the done callback silently discards the result instead of invoking the user callback.

  5. IP Literal Fast Path — If the hostname is an IPv4 or IPv6 literal, AI_NUMERICHOST is set automatically, skipping the actual DNS lookup.

Architecture

sequenceDiagram
    participant App as Application
    participant EL as Event Loop Thread
    participant TP as Thread Pool Worker

    App->>EL: xDnsResolve(loop, "example.com", ...)
    EL->>TP: xEventLoopSubmit(dns_work_fn)
    Note over TP: getaddrinfo() (blocking)
    TP-->>EL: dns_done_fn(result)
    alt Not cancelled
        EL->>App: callback(result, arg)
    else Cancelled
        EL->>EL: xDnsResultFree(result)
    end

Implementation Details

Internal Request Lifecycle

stateDiagram-v2
    [*] --> Created: xDnsResolve()
    Created --> Queued: xEventLoopSubmit()
    Queued --> Working: Thread pool picks up
    Working --> Done: getaddrinfo() returns
    Done --> Delivered: callback invoked
    Done --> Discarded: cancelled flag set

    Queued --> Cancelled: xDnsCancel()
    Working --> Cancelled: xDnsCancel()
    Cancelled --> Discarded: done_fn checks flag

    Delivered --> [*]: request freed
    Discarded --> [*]: request freed

Error Mapping

getaddrinfo() returns EAI_* codes. These are mapped to xKit error codes:

EAI CodexErrnoMeaning
0 (success)xErrno_OkResolution succeeded
EAI_NONAMExErrno_DnsNotFoundHost not found
EAI_AGAINxErrno_DnsTempFailTemporary failure
EAI_MEMORYxErrno_NoMemoryOut of memory
OtherxErrno_DnsErrorGeneric DNS error
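The mapping above reduces to a switch over the EAI_* constants from <netdb.h>. A self-contained sketch (the Err names here are illustrative stand-ins for the xErrno codes, which are defined in xbase/error.h):

```c
#include <stdio.h>
#include <netdb.h>

/* Stand-ins for xErrno_Ok, xErrno_DnsNotFound, etc. (names assumed). */
typedef enum {
    Err_Ok, Err_DnsNotFound, Err_DnsTempFail, Err_NoMemory, Err_DnsError
} Err;

/* Map a getaddrinfo() return code to the unified error space. */
static Err map_eai(int rc) {
    switch (rc) {
    case 0:          return Err_Ok;
    case EAI_NONAME: return Err_DnsNotFound;
    case EAI_AGAIN:  return Err_DnsTempFail;
    case EAI_MEMORY: return Err_NoMemory;
    default:         return Err_DnsError;   /* e.g. EAI_FAIL */
    }
}

int main(void) {
    printf("%d %d %d\n",
           map_eai(0) == Err_Ok,
           map_eai(EAI_NONAME) == Err_DnsNotFound,
           map_eai(EAI_FAIL) == Err_DnsError);
    return 0;
}
```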

IP Literal Detection

Before calling getaddrinfo(), the worker checks if the hostname is an IP literal using inet_pton(). If it is, AI_NUMERICHOST is added to the hints, which tells getaddrinfo() to skip DNS lookup entirely.

// Simplified worker-side check:
struct in6_addr scratch;  /* large enough for either address family */
if (inet_pton(AF_INET,  hostname, &scratch) == 1 ||
    inet_pton(AF_INET6, hostname, &scratch) == 1) {
    hints.ai_flags |= AI_NUMERICHOST;
}

API Reference

Core Functions

FunctionSignatureDescription
xDnsResolvexDnsQuery xDnsResolve(xEventLoop loop, const char *hostname, const char *service, const struct addrinfo *hints, xDnsCallback callback, void *arg)Start async DNS resolution
xDnsCancelvoid xDnsCancel(xEventLoop loop, xDnsQuery query)Cancel a pending query
xDnsResultFreevoid xDnsResultFree(xDnsResult *result)Free a resolution result

Types

TypeDescription
xDnsQueryOpaque handle to a pending query
xDnsResultResolution result: error + addrs linked list
xDnsAddrSingle resolved address node
xDnsCallbackvoid (*)(xDnsResult *result, void *arg)

xDnsResult Fields

FieldTypeDescription
errorxErrnoxErrno_Ok on success
addrsxDnsAddr *Linked list of addresses, or NULL

xDnsAddr Fields

FieldTypeDescription
addrstruct sockaddr_storageResolved socket address
addrlensocklen_tLength of the address
familyintAF_INET or AF_INET6
socktypeintSOCK_STREAM or SOCK_DGRAM
protocolintIPPROTO_TCP or IPPROTO_UDP
nextxDnsAddr *Next address, or NULL

Parameter Details for xDnsResolve

ParameterRequiredDescription
loopYesEvent loop (must not be NULL)
hostnameYesHostname or IP literal (non-empty)
serviceNoPort string (e.g. "443") or NULL
hintsNoaddrinfo hints; NULL defaults to AF_UNSPEC + SOCK_STREAM
callbackYesCompletion callback (must not be NULL)
argNoUser argument forwarded to callback

Returns an xDnsQuery handle, or NULL on invalid arguments.

Usage Examples

Basic Resolution

#include <stdio.h>
#include <arpa/inet.h>
#include <xbase/event.h>
#include <xnet/dns.h>

static void on_resolved(xDnsResult *result, void *arg) {
    xEventLoop loop = (xEventLoop)arg;

    if (result->error != xErrno_Ok) {
        fprintf(stderr, "DNS failed: %d\n", result->error);
        xDnsResultFree(result);
        xEventLoopStop(loop);
        return;
    }

    for (xDnsAddr *a = result->addrs; a; a = a->next) {
        char buf[INET6_ADDRSTRLEN];
        if (a->family == AF_INET) {
            struct sockaddr_in *sin = (struct sockaddr_in *)&a->addr;
            inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf));
        } else {
            struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&a->addr;
            inet_ntop(AF_INET6, &sin6->sin6_addr, buf, sizeof(buf));
        }
        printf("  %s (family=%d)\n", buf, a->family);
    }

    xDnsResultFree(result);
    xEventLoopStop(loop);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xDnsResolve(loop, "example.com", "443", NULL, on_resolved, loop);
    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

IPv4-Only Resolution

struct addrinfo hints = {0};
hints.ai_family   = AF_INET;
hints.ai_socktype = SOCK_STREAM;

xDnsResolve(loop, "example.com", "80", &hints, on_resolved, loop);

Cancelling a Query

xDnsQuery q = xDnsResolve(loop, "slow.example.com", NULL, NULL, on_resolved, NULL);
// Cancel immediately — callback will NOT fire
xDnsCancel(loop, q);

IP Literal (No DNS Lookup)

// Resolves instantly via AI_NUMERICHOST
xDnsResolve(loop, "127.0.0.1", "8080", NULL, on_resolved, loop);

xDnsResolve(loop, "::1", "8080", NULL, on_resolved, loop);

Thread Safety

OperationThread Safety
xDnsResolve()Call from event loop thread only
xDnsCancel()Call from event loop thread only
xDnsResultFree()Call from any thread (result is owned)
xDnsCallbackAlways invoked on event loop thread

Error Handling

ScenarioBehavior
NULL loop, hostname, or callbackReturns NULL (no query created)
Empty hostnameReturns NULL
malloc failureReturns NULL
getaddrinfo() failureCallback receives result->error != xErrno_Ok
Cancelled queryCallback is not invoked; result is freed internally

Best Practices

  • Always call xDnsResultFree() in your callback. Your callback owns the result.
  • Check result->error before iterating addrs. On failure, addrs is NULL.
  • Use xDnsCancel() for cleanup. If you destroy the object that owns the callback context, cancel the query first to prevent a use-after-free.
  • Pass NULL hints for typical use. The defaults (AF_UNSPEC + SOCK_STREAM) cover most HTTP/WebSocket connection scenarios.
  • xDnsCancel(loop, NULL) is safe — it's a no-op, so you don't need to guard against NULL handles.

tcp.h — Async TCP Connection, Connector & Listener

Introduction

tcp.h provides three async TCP building blocks on top of xKit's event loop:

  • xTcpConn — a thin resource wrapper that pairs an xSocket with an xTransport, plus convenience Recv/Send/SendIov helpers.
  • xTcpConnect — an async connector that performs DNS → socket → non-blocking connect → optional TLS handshake, delivering a ready-to-use xTcpConn via callback.
  • xTcpListener — an async listener that accepts connections (with optional TLS) and delivers each as an xTcpConn.

All callbacks run on the event loop thread, consistent with the rest of xKit.

Design Philosophy

  1. Resource Wrapper, Not Callback Framework — Unlike xWsCallbacks, we intentionally do not provide on_data / on_close callbacks at the TCP layer. WebSocket callbacks work well because the protocol defines message boundaries, close handshakes, and ping/pong — the library does real work before invoking user code. Raw TCP is a byte stream with no framing; an on_data callback would still deliver arbitrary fragments, leaving the user to reassemble and parse — no better than calling xTcpConnRecv directly. Instead, users register their own xSocketFunc callback via xSocketSetCallback() and drive I/O with xTcpConnRecv / xTcpConnSend.

  2. Transport Transparency — xTcpConn wraps an xTransport vtable. For plain TCP, read/writev map to read(2)/writev(2). For TLS, they map to SSL_read/SSL_write. The Recv/Send/SendIov helpers hide this detail so users never need to reach into xTransport internals.

  3. Full Async Connector Pipeline — xTcpConnect chains DNS resolution → socket creation → non-blocking connect() → optional TLS handshake into a single async operation with a timeout. Each phase is driven by event loop callbacks.

  4. Ownership Transfer — xTcpConnTakeSocket and xTcpConnTakeTransport allow higher-level protocols (e.g. WebSocket upgrade) to extract the underlying resources without closing them.

Architecture

Connector State Machine

stateDiagram-v2
    [*] --> DNS: xTcpConnect()
    DNS --> TcpConnect: resolved
    DNS --> Failed: DNS error

    TcpConnect --> TlsHandshake: connected + TLS configured
    TcpConnect --> Succeed: connected (plain TCP)
    TcpConnect --> Failed: connect error

    TlsHandshake --> Succeed: handshake done
    TlsHandshake --> Failed: handshake error

    Succeed --> [*]: callback(conn, Ok)
    Failed --> [*]: callback(NULL, err)

    note right of DNS: Async via xDnsResolve
    note right of TcpConnect: Non-blocking connect()
    note right of TlsHandshake: Async SSL_do_handshake

Listener Accept Flow

sequenceDiagram
    participant EL as Event Loop
    participant L as xTcpListener
    participant PC as PendingConn (TLS only)
    participant App as User Callback

    EL->>L: xEvent_Read (new connection)
    L->>L: accept()

    alt Plain TCP
        L->>App: callback(listener, conn, addr)
    else TLS
        L->>PC: create PendingConn
        loop Handshake rounds
            EL->>PC: xEvent_Read / xEvent_Write
            PC->>PC: SSL_do_handshake()
        end
        PC->>App: callback(listener, conn, addr)
    end

xTcpConn Resource Ownership

graph LR
    CONN["xTcpConn"]
    SOCK["xSocket<br/>(event loop registration)"]
    TP["xTransport<br/>(plain / TLS vtable)"]
    FD["fd"]

    CONN --> SOCK
    CONN --> TP
    SOCK --> FD

    style CONN fill:#4a90d9,color:#fff
    style SOCK fill:#50b86c,color:#fff
    style TP fill:#f5a623,color:#fff

xTcpConnClose() destroys in order: transport → socket → conn shell. Use xTcpConnTakeSocket() / xTcpConnTakeTransport() to extract resources before closing.

API Reference

xTcpConn — Connection

FunctionSignatureDescription
xTcpConnRecvssize_t xTcpConnRecv(xTcpConn conn, void *buf, size_t len)Read up to len bytes; returns bytes read, 0 on EOF, -1 on error
xTcpConnSendssize_t xTcpConnSend(xTcpConn conn, const char *buf, size_t len)Write len bytes; returns bytes written, -1 on error
xTcpConnSendIovssize_t xTcpConnSendIov(xTcpConn conn, const struct iovec *iov, int iovcnt)Scatter-gather write; returns total bytes written, -1 on error
xTcpConnTransportxTransport *xTcpConnTransport(xTcpConn conn)Get the internal transport vtable
xTcpConnSocketxSocket xTcpConnSocket(xTcpConn conn)Get the underlying socket handle
xTcpConnTakeSocketxSocket xTcpConnTakeSocket(xTcpConn conn)Extract socket ownership (conn no longer owns it)
xTcpConnTakeTransportxTransport xTcpConnTakeTransport(xTcpConn conn)Extract transport ownership (conn no longer owns it)
xTcpConnReaderxReader xTcpConnReader(xTcpConn conn)Get an xReader adapter bound to the connection's transport (see io.h)
xTcpConnWriterxWriter xTcpConnWriter(xTcpConn conn)Get an xWriter adapter bound to the connection's transport (see io.h)
xTcpConnClosevoid xTcpConnClose(xEventLoop loop, xTcpConn conn)Close connection and free all resources

xTcpConnect — Async Connector

FunctionSignatureDescription
xTcpConnectxErrno xTcpConnect(xEventLoop loop, const char *host, uint16_t port, const xTcpConnectConf *conf, xTcpConnectFunc callback, void *arg)Initiate async TCP connection

xTcpConnectConf Fields

FieldTypeDefaultDescription
tls_ctxxTlsCtxNULLPre-created shared TLS context (preferred); NULL for plain TCP or auto-create from tls
tlsconst xTlsConf *NULLTLS config for auto-created ctx; ignored when tls_ctx is set; NULL for plain TCP
timeout_msint10000Connect timeout in milliseconds
nodelayint0Set TCP_NODELAY if non-zero
keepaliveint0Set SO_KEEPALIVE if non-zero

TLS context resolution order: tls_ctx (shared, not owned) → auto-create from tls → defaults (system CA, verify enabled). When tls_ctx is provided, the connector does not create or destroy the context — the caller retains ownership.

xTcpConnectFunc

typedef void (*xTcpConnectFunc)(xTcpConn conn, xErrno err, void *arg);

On success: conn is valid, err is xErrno_Ok. On failure: conn is NULL, err indicates the error.

xTcpListener — Async Listener

FunctionSignatureDescription
xTcpListenerCreatexTcpListener xTcpListenerCreate(xEventLoop loop, const char *host, uint16_t port, const xTcpListenerConf *conf, xTcpListenerFunc callback, void *arg)Create and start a TCP listener
xTcpListenerDestroyvoid xTcpListenerDestroy(xTcpListener listener)Stop listening and free resources

xTcpListenerConf Fields

FieldTypeDefaultDescription
tls_ctxxTlsCtxNULLTLS context from xTlsCtxCreate(); NULL for plain TCP
backlogint128listen() backlog
reuseportint0Set SO_REUSEPORT if non-zero

xTcpListenerFunc

typedef void (*xTcpListenerFunc)(xTcpListener listener, xTcpConn conn,
                                 const struct sockaddr *addr, socklen_t addrlen,
                                 void *arg);

Invoked for each accepted connection. The callback takes ownership of conn.

Usage Examples

Echo Server

#include <string.h>
#include <xbase/event.h>
#include <xbase/socket.h>
#include <xnet/tcp.h>

static void on_conn_event(xSocket sock, xEventMask mask, void *arg) {
    xTcpConn conn = (xTcpConn)arg;
    (void)sock;

    if (mask & xEvent_Read) {
        char buf[4096];
        ssize_t n = xTcpConnRecv(conn, buf, sizeof(buf));
        if (n > 0) {
            xTcpConnSend(conn, buf, (size_t)n);
        } else {
            /* EOF or error: close */
            xTcpConnClose(xSocketLoop(sock), conn);
        }
    }
}

static void on_accept(xTcpListener listener, xTcpConn conn,
                      const struct sockaddr *addr, socklen_t addrlen,
                      void *arg) {
    (void)listener; (void)addr; (void)addrlen; (void)arg;

    /* Register our own event callback on the connection's socket */
    xSocket sock = xTcpConnSocket(conn);
    xSocketSetCallback(sock, on_conn_event, conn);
    /* Socket is already registered for xEvent_Read by default */
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xTcpListener listener =
        xTcpListenerCreate(loop, "0.0.0.0", 8080, NULL, on_accept, NULL);
    if (!listener) return 1;

    xEventLoopRun(loop);

    xTcpListenerDestroy(listener);
    xEventLoopDestroy(loop);
    return 0;
}

Async Client

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xbase/socket.h>
#include <xnet/tcp.h>

static void on_response(xSocket sock, xEventMask mask, void *arg) {
    xTcpConn conn = (xTcpConn)arg;
    xEventLoop loop = xSocketLoop(sock);
    (void)mask;

    char buf[4096];
    ssize_t n = xTcpConnRecv(conn, buf, sizeof(buf));
    if (n > 0) {
        printf("Received: %.*s\n", (int)n, buf);
    }
    xTcpConnClose(loop, conn);
    xEventLoopStop(loop);
}

static void on_connected(xTcpConn conn, xErrno err, void *arg) {
    xEventLoop loop = (xEventLoop)arg;
    if (err != xErrno_Ok) {
        fprintf(stderr, "Connect failed: %d\n", err);
        xEventLoopStop(loop);
        return;
    }

    /* Send a request */
    const char *msg = "Hello, server!";
    xTcpConnSend(conn, msg, strlen(msg));

    /* Wait for response */
    xSocket sock = xTcpConnSocket(conn);
    xSocketSetCallback(sock, on_response, conn);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xTcpConnectConf conf = {0};
    conf.nodelay = 1;

    xTcpConnect(loop, "127.0.0.1", 8080, &conf, on_connected, loop);
    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

TLS Client (auto-create context)

#include <xnet/tcp.h>
#include <xnet/tls.h>

static void on_tls_connected(xTcpConn conn, xErrno err, void *arg) {
    if (err != xErrno_Ok) { /* handle error */ return; }

    /* TLS is already established — Recv/Send are transparently encrypted */
    const char *msg = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n";
    xTcpConnSend(conn, msg, strlen(msg));
    /* ... register read callback ... */
}

void connect_tls(xEventLoop loop) {
    xTlsConf tls = {0};
    tls.ca = "/etc/ssl/certs/ca-certificates.crt";

    xTcpConnectConf conf = {0};
    conf.tls = &tls;

    xTcpConnect(loop, "example.com", 443, &conf, on_tls_connected, loop);
}

TLS Client (shared context)

When making many connections to the same server, share an xTlsCtx to avoid reloading certificates each time:

#include <xnet/tcp.h>
#include <xnet/tls.h>

static void on_connected(xTcpConn conn, xErrno err, void *arg) {
    if (err != xErrno_Ok) { /* handle error */ return; }
    /* ... use conn ... */
}

void connect_with_shared_ctx(xEventLoop loop) {
    // Create once, reuse for all connections
    xTlsConf tls = {0};
    tls.ca = "ca.pem";
    xTlsCtx ctx = xTlsCtxCreate(&tls);

    xTcpConnectConf conf = {0};
    conf.tls_ctx = ctx;  // shared, not owned by connector

    xTcpConnect(loop, "example.com", 443, &conf, on_connected, loop);
    xTcpConnect(loop, "example.com", 443, &conf, on_connected, loop);

    // ... later, after all connections are closed ...
    xTlsCtxDestroy(ctx);
}

TLS Server

#include <xnet/tcp.h>
#include <xnet/transport.h>

void start_tls_server(xEventLoop loop) {
    xTlsConf tls_conf = {
        .cert = "server.pem",
        .key  = "server-key.pem",
    };
    xTlsCtx tls_ctx = xTlsCtxCreate(&tls_conf);

    xTcpListenerConf conf = {0};
    conf.tls_ctx = tls_ctx;

    xTcpListener listener =
        xTcpListenerCreate(loop, "0.0.0.0", 8443, &conf, on_accept, NULL);
    /* ... run event loop ... */

    xTcpListenerDestroy(listener);
    xTlsCtxDestroy(tls_ctx);
}

Ownership Transfer (Protocol Upgrade)

/* After receiving an HTTP upgrade response on a TCP connection,
 * extract the socket and transport for the new protocol layer. */
xSocket    sock = xTcpConnTakeSocket(conn);
xTransport tp   = xTcpConnTakeTransport(conn);

/* Close the empty conn shell (no-op on resources) */
xTcpConnClose(loop, conn);

/* sock and tp are now owned by the new protocol handler */

Thread Safety

OperationThread Safety
xTcpConnect()Call from event loop thread only
xTcpListenerCreate()Call from event loop thread only
xTcpListenerDestroy()Call from event loop thread only
xTcpConnRecv/Send/SendIov()Call from event loop thread only
xTcpConnClose()Call from event loop thread only
xTcpConnectFunc callbackAlways invoked on event loop thread
xTcpListenerFunc callbackAlways invoked on event loop thread

Error Handling

ScenarioBehavior
NULL loop, host, or callback in xTcpConnectReturns xErrno_InvalidArg
DNS resolution failureCallback receives xErrno_DnsError or xErrno_DnsNotFound
connect() failureCallback receives xErrno_SysError
TLS handshake failureCallback receives xErrno_SysError
Connect timeoutCallback receives xErrno_Timeout
xTcpListenerCreate bind/listen failureReturns NULL
xTcpConnRecv/Send on NULL connReturns -1
xTcpConnClose(loop, NULL)No-op (safe)
xTcpListenerDestroy(NULL)No-op (safe)

Best Practices

  • Always close connections with xTcpConnClose() — it destroys the transport (TLS cleanup), removes the socket from the event loop, closes the fd, and frees the conn.
  • Register your own xSocketFunc on the connection's socket via xSocketSetCallback() to receive read/write events, then use xTcpConnRecv / xTcpConnSend inside the callback.
  • Use xTcpConnSendIov for multi-buffer writes (e.g. header + body) to avoid copying into a single buffer.
  • Set nodelay = 1 in xTcpConnectConf for latency-sensitive protocols (HTTP, WebSocket).
  • Use xTcpConnTakeSocket / xTcpConnTakeTransport when upgrading protocols (e.g. HTTP → WebSocket) to avoid double-free.
  • Cancel or close before freeing context — if you destroy the object that owns the connect callback context, ensure the connection attempt has completed or timed out first.

tls.h — TLS Configuration Types

Introduction

tls.h defines xTlsConf, the unified TLS configuration structure shared across xKit modules, and xTlsCtx, the opaque handle to a server-level TLS context. It controls certificate loading, peer verification, and optional ALPN negotiation for both client-side and server-side TLS. These are the central TLS abstractions — the actual TLS handshake is handled by the TLS backend (OpenSSL or mbedTLS) in the transport layer.

Design Philosophy

  1. Backend-Agnostic — The config struct contains only file paths and flags. It works identically whether the TLS backend is OpenSSL or mbedTLS.

  2. Zero-Initialize for Defaults — A zero-initialized xTlsConf uses the system CA bundle with full peer and host verification enabled. This is the secure default for both client and server.

  3. Unified Client/Server — A single xTlsConf struct serves both roles. Client-only fields (key_password) and server-only fields (alpn) are simply left as NULL / zero when unused.

  4. Separation of Concerns — TLS configuration is defined in xnet (the networking primitives layer) and consumed by xhttp (the HTTP layer). This avoids circular dependencies and allows future modules to reuse the same types.

API Reference

xTlsConf

Unified TLS configuration for both client and server.

| Field | Type | Default | Description |
|---|---|---|---|
| cert | const char * | NULL (none) | Path to PEM certificate file |
| key | const char * | NULL (none) | Path to PEM private key file |
| ca | const char * | NULL (system CA) | Path to CA certificate file |
| key_password | const char * | NULL (none) | Private key password (client-side) |
| alpn | const char ** | NULL (none) | NULL-terminated ALPN protocol list (server-side) |
| skip_verify | int | 0 (verify) | Non-zero to skip peer & host verification |

Backward-compatible aliases: xTlsClientConf and xTlsServerConf are typedef'd to xTlsConf.

xTlsCtx

Opaque handle to a shared TLS context. Created by xTlsCtxCreate(), used by both server-side listeners (xTcpListenerConf.tls_ctx) and client-side connectors (xTcpConnectConf.tls_ctx, xWsConnectConf.tls_ctx). Shared across all connections that use the same context. Destroyed by xTlsCtxDestroy(). Supports certificate hot-reload via xTlsCtxReload().

xTlsCtxCreate

xTlsCtx xTlsCtxCreate(const xTlsConf *conf);

Create a shared TLS context. Loads the certificate (if provided), private key (if provided), optional CA, and optional ALPN list. The returned context can be shared across all connections that use the same TLS configuration.

  • conf — TLS configuration (must not be NULL). For server-side use, cert and key are required. For client-side use, only ca (or defaults) is needed.
  • Returns a TLS context handle, or NULL on failure.

xTlsCtxDestroy

void xTlsCtxDestroy(xTlsCtx ctx);

Destroy a shared TLS context and release all resources. Safe to call with NULL (no-op). Must only be called after all connections using this context have been closed.

xTlsCtxReload

int xTlsCtxReload(xTlsCtx ctx, const xTlsConf *conf);

Hot-reload certificates for an existing TLS context. Atomically replaces the certificate, private key, and optional CA. Existing connections are not affected; only new connections will use the updated certificates.

  • ctx — TLS context to reload (must not be NULL).
  • conf — New TLS configuration (must not be NULL, cert and key must not be NULL).
  • Returns 0 on success, -1 on failure (context unchanged).

Example: Certificate hot-reload

// Initial setup
xTlsConf tls = {
    .cert = "server.pem",
    .key  = "server-key.pem",
    .alpn = (const char *[]){"h2", "http/1.1", NULL},
};
xTlsCtx ctx = xTlsCtxCreate(&tls);

// ... later, when certificates are renewed ...
xTlsConf new_tls = {
    .cert = "server-new.pem",
    .key  = "server-key-new.pem",
    .alpn = (const char *[]){"h2", "http/1.1", NULL},
};
if (xTlsCtxReload(ctx, &new_tls) == 0) {
    // New connections will use the updated certificates
}

One-Way TLS (Client Verifies Server)

#include <xnet/tls.h>
#include <xhttp/client.h>

// Use system CA bundle (zero-init)
xTlsConf tls = {0};
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);

// Or specify a CA file
xTlsConf tls_ca = {0};
tls_ca.ca = "ca.pem";
xHttpClientConf conf_ca = {.tls = &tls_ca};
xHttpClient client2 = xHttpClientCreate(loop, &conf_ca);

Skip Verification (Development Only)

xTlsConf tls = {0};
tls.skip_verify = 1;  // DANGER: disables all checks
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);

Mutual TLS (mTLS)

// Server: require client certificate (default: verify enabled)
xTlsConf server_tls = {
    .cert = "server.pem",
    .key  = "server-key.pem",
    .ca   = "ca.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &server_tls);

// Client: present certificate
xTlsConf client_tls = {0};
client_tls.ca   = "ca.pem";
client_tls.cert = "client.pem";
client_tls.key  = "client-key.pem";
xHttpClientConf client_conf = {
    .tls = &client_tls,
};
xHttpClient client = xHttpClientCreate(loop, &client_conf);

Password-Protected Private Key

xTlsConf tls = {0};
tls.ca           = "ca.pem";
tls.cert         = "client.pem";
tls.key          = "client-key-enc.pem";
tls.key_password = "my-secret";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client = xHttpClientCreate(loop, &conf);

Relationship with Other Modules

  • xnetxTlsCtxCreate() / xTlsCtxDestroy() / xTlsCtxReload() are declared in tls.h and implemented in the TLS backend files (transport_openssl.c, transport_mbedtls.c). The TCP listener uses xTlsCtx via xTcpListenerConf.tls_ctx, and the TCP connector uses it via xTcpConnectConf.tls_ctx.
  • xhttp — The HTTP server calls xTlsCtxCreate() internally when xHttpServerListenTls() is invoked, automatically setting ALPN to {"h2", "http/1.1"}. The HTTP client uses libcurl for TLS management and consumes xTlsConf directly. The WebSocket client supports both xTlsConf (auto-creates a context) and a pre-created xTlsCtx (shared across connections) via xWsConnectConf.tls_ctx. See the TLS Deployment Guide for end-to-end examples.

Security Notes

  • Never use skip_verify = 1 in production. It disables all certificate validation.
  • Keep private keys secure. Use restrictive file permissions (chmod 600).
  • For mTLS, set ca to the signing CA on the server side. Zero-initialized skip_verify means verification is enabled by default.
  • The config struct itself does not copy strings; the library deep-copies them during xHttpClientCreate() or xHttpServerListenTls(). File path strings therefore need only remain valid until that call returns.

xhttp — Asynchronous HTTP

Introduction

xhttp is xKit's HTTP module, providing a fully asynchronous HTTP client and server, both powered by xbase's event loop.

  • The client uses libcurl's multi-socket API for non-blocking HTTP requests and SSE streaming — ideal for integrating with REST APIs and LLM streaming endpoints. Supports TLS configuration including custom CA certificates, mutual TLS (mTLS), and certificate verification control via xTlsConf.
  • The server uses an xHttpProto vtable interface for protocol-abstracted parsing, supporting both HTTP/1.1 (llhttp) and HTTP/2 (nghttp2, h2c Prior Knowledge) on the same port. TLS listeners are supported via xHttpServerListenTls with xTlsConf. Single-threaded, event-driven connection handling — ideal for building lightweight HTTP services and APIs.
  • WebSocket support includes both server and client. On the server side, call xWsUpgrade() inside a regular HTTP handler to perform the RFC 6455 upgrade handshake. On the client side, use xWsConnect() to establish an async WebSocket connection to a remote endpoint. The library handles frame codec, ping/pong, fragment reassembly, and close negotiation automatically for both sides.

Design Philosophy

  1. Event Loop Integration — Instead of blocking threads, xhttp registers libcurl's sockets with xEventLoop and uses event-driven I/O. All callbacks are dispatched on the event loop thread, eliminating the need for synchronization.

  2. Vtable-Based Request Polymorphism — Internally, different request types (oneshot HTTP, SSE streaming) share the same curl multi handle but use different vtables for completion and cleanup. This avoids code duplication while supporting diverse response handling patterns.

  3. Zero-Copy Response Delivery — Response headers and body are accumulated in xBuffer instances and delivered to the callback as pointers. No extra copies are made.

  4. Automatic Resource Management — Request contexts, curl easy handles, and buffers are automatically cleaned up after the completion callback returns. In-flight requests are cancelled with error callbacks when the client is destroyed.

Architecture

graph TD
    subgraph "Application"
        APP["User Code"]
    end

    subgraph "xhttp"
        CLIENT["xHttpClient"]
        TLS_CLI["TLS Config<br/>(xTlsConf)"]
        ONESHOT["Oneshot Request<br/>(GET/POST/Do)"]
        SSE["SSE Request<br/>(GetSse/DoSse)"]
        PARSER["SSE Parser<br/>(W3C spec)"]
    end

    subgraph "libcurl"
        MULTI["curl_multi"]
        EASY1["curl_easy (req 1)"]
        EASY2["curl_easy (req 2)"]
    end

    subgraph "xbase"
        LOOP["xEventLoop"]
        TIMER["Timer<br/>(curl timeout)"]
        FD["FD Events<br/>(socket I/O)"]
    end

    APP -->|"xHttpClientGet/Post/Do"| ONESHOT
    APP -->|"xHttpClientGetSse/DoSse"| SSE
    APP -->|"xHttpClientConf.tls"| TLS_CLI
    SSE --> PARSER
    ONESHOT --> CLIENT
    SSE --> CLIENT
    TLS_CLI --> CLIENT
    CLIENT --> MULTI
    MULTI --> EASY1
    MULTI --> EASY2
    MULTI -->|"CURLMOPT_SOCKETFUNCTION"| FD
    MULTI -->|"CURLMOPT_TIMERFUNCTION"| TIMER
    FD --> LOOP
    TIMER --> LOOP

    style CLIENT fill:#4a90d9,color:#fff
    style LOOP fill:#50b86c,color:#fff
    style MULTI fill:#f5a623,color:#fff

Sub-Module Overview

| File | Description | Doc |
|---|---|---|
| server.h | Async HTTP/1.1 & HTTP/2 server (routing, request/response, protocol-abstracted parsing) | server.md |
| client.h | Async HTTP client API (GET, POST, Do, SSE, TLS configuration) | client.md |
| sse.c | SSE stream parser and request handler | sse.md |
| ws.h (server) | WebSocket server API (upgrade, send, close, callbacks) | ws_server.md |
| ws.h (client) | WebSocket client API (connect, send, close, callbacks) | ws_client.md |
| (guide) | TLS deployment guide (certificate generation, one-way TLS, mTLS, troubleshooting) | tls.md |

Quick Start

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    if (resp->curl_code == 0) {
        printf("Status: %ld\n", resp->status_code);
        printf("Body: %.*s\n", (int)resp->body_len, resp->body);
    } else {
        printf("Error: %s\n", resp->curl_error);
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);

    xHttpClientGet(client, "https://httpbin.org/get", on_response, NULL);

    xEventLoopRun(loop);

    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

Relationship with Other Modules

  • xbase — Uses xEventLoop for I/O multiplexing and xEventLoopTimerAfter for curl timeout management.
  • xbuf — Uses xBuffer for response header and body accumulation.
  • libcurl — External dependency (client). Uses the multi-socket API (curl_multi_socket_action) for non-blocking HTTP.
  • llhttp — External dependency (server). Provides incremental HTTP/1.1 request parsing, isolated behind the xHttpProto vtable in proto_h1.c.
  • nghttp2 — External dependency (server). Provides HTTP/2 frame processing and HPACK header compression, isolated behind the xHttpProto vtable in proto_h2.c.

client.h — Asynchronous HTTP Client

Introduction

client.h provides xHttpClient, an asynchronous HTTP client that integrates libcurl's multi-socket API with xbase's event loop. All network I/O is non-blocking and driven by the event loop; completion callbacks are dispatched on the event loop thread. The client supports the GET, POST, PUT, DELETE, PATCH, and HEAD methods, plus Server-Sent Events (SSE) streaming.

Design Philosophy

  1. libcurl Multi-Socket Integration — Rather than using libcurl's easy (blocking) API or multi-perform (polling) API, xhttp uses the multi-socket API (CURLMOPT_SOCKETFUNCTION + CURLMOPT_TIMERFUNCTION). This allows libcurl to delegate socket monitoring to xEventLoop, achieving true event-driven I/O without dedicated threads.

  2. Single-Threaded Callback Model — All callbacks (response, SSE events, done) are invoked on the event loop thread. No locks are needed in callback code.

  3. Vtable-Based Polymorphism — Internally, each request carries a vtable (xHttpReqVtable) with on_done and on_cleanup function pointers. Oneshot requests and SSE requests use different vtables, sharing the same curl multi handle and completion infrastructure.

  4. Automatic Body Copy — POST/PUT request bodies are copied internally (malloc + memcpy), so the caller doesn't need to keep the body alive after submitting the request.

Architecture

graph TD
    subgraph xHttpClientInternal[xHttpClient Internal]
        MULTI[curl multi handle]
        TIMER_CB[timer callback - CURLMOPT TIMERFUNCTION]
        SOCKET_CB[socket callback - CURLMOPT SOCKETFUNCTION]
        CHECK[check multi info]
    end

    subgraph PerRequest[Per Request]
        REQ[xHttpReq]
        EASY[curl easy handle]
        BODY[xBuffer body]
        HDR[xBuffer headers]
        VT[vtable - oneshot or SSE]
    end

    subgraph xbaseEventLoop[xbase Event Loop]
        LOOP[xEventLoop]
        FD_EVT[FD events]
        TIMER_EVT[Timer events]
    end

    SOCKET_CB --> FD_EVT
    TIMER_CB --> TIMER_EVT
    FD_EVT --> LOOP
    TIMER_EVT --> LOOP
    LOOP -->|fd ready| CHECK
    LOOP -->|timeout| CHECK
    CHECK --> VT
    VT -->|on done| APP[User Callback]

    REQ --> EASY
    REQ --> BODY
    REQ --> HDR
    REQ --> VT

    style MULTI fill:#f5a623,color:#fff
    style LOOP fill:#50b86c,color:#fff

Implementation Details

libcurl + xEventLoop Integration

sequenceDiagram
    participant App as Application
    participant Client as xHttpClient
    participant Curl as CurlMulti
    participant L as xEventLoop

    App->>Client: xHttpClientGet url cb
    Client->>Curl: curl multi add handle
    Curl->>Client: socket callback fd POLL IN
    Client->>L: xEventAdd fd Read
    Note over L: Event loop polls
    L->>Client: fd ready callback
    Client->>Curl: curl multi socket action
    Curl->>Client: write callback data
    Client->>Client: xBufferAppend body buf data
    Note over Curl: Transfer complete
    Client->>Client: check multi info
    Client->>App: on response resp

Socket Callback Flow

When libcurl needs to monitor a socket, it calls socket_callback:

  1. CURL_POLL_REMOVE — Unregister the fd from the event loop (xEventDel).
  2. CURL_POLL_IN/OUT/INOUT — Register or update the fd with the event loop (xEventAdd/xEventMod).

Each socket gets an xHttpSocketCtx_ that maps the fd to the client and event source.

Timer Callback Flow

When libcurl needs a timeout:

  1. timeout_ms == -1 — Cancel any existing timer.
  2. timeout_ms == 0 — Schedule a 1ms timer (deferred to avoid reentrant curl_multi_socket_action).
  3. timeout_ms > 0 — Schedule a timer via xEventLoopTimerAfter.

When the timer fires, curl_multi_socket_action(CURL_SOCKET_TIMEOUT) is called.

Request Lifecycle

stateDiagram-v2
    [*] --> Created: xHttpClientGet/Post/Do
    Created --> Submitted: curl_multi_add_handle
    Submitted --> InFlight: Event loop drives I/O
    InFlight --> Completed: curl reports CURLMSG_DONE
    Completed --> CallbackInvoked: on_response(resp)
    CallbackInvoked --> CleanedUp: free buffers + easy handle
    CleanedUp --> [*]

    InFlight --> Aborted: xHttpClientDestroy
    Aborted --> CallbackInvoked: on_response(error)

Response Structure

XDEF_STRUCT(xHttpResponse) {
    long        status_code;  // HTTP status (200, 404, etc.), 0 on failure
    const char *headers;      // Raw headers (NUL-terminated)
    size_t      headers_len;
    const char *body;         // Response body (NUL-terminated)
    size_t      body_len;
    int         curl_code;    // CURLcode (0 = success)
    const char *curl_error;   // Human-readable error, or NULL
};

All pointers are valid only during the callback. The library manages their lifetime.

API Reference

Types

| Type | Description |
|---|---|
| xHttpClient | Opaque handle to an HTTP client bound to an event loop |
| xHttpClientConf | Configuration struct for creating a client (TLS, HTTP version) |
| xHttpResponse | Response data delivered to the completion callback |
| xHttpResponseFunc | void (*)(const xHttpResponse *resp, void *arg) |
| xHttpMethod | Enum: GET, POST, PUT, DELETE, PATCH, HEAD |
| xHttpRequestConf | Configuration struct for generic requests |
| xSseEvent | SSE event data delivered to the event callback |
| xSseEventFunc | int (*)(const xSseEvent *ev, void *arg) — return 0 to continue, non-zero to close |
| xSseDoneFunc | void (*)(int curl_code, void *arg) |
| xTlsConf | TLS configuration for the client (CA path, client cert/key, skip verify) |

Lifecycle

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientCreate | xHttpClient xHttpClientCreate(xEventLoop loop, const xHttpClientConf *conf) | Create a client bound to an event loop. Pass NULL for defaults. | Not thread-safe |
| xHttpClientDestroy | void xHttpClientDestroy(xHttpClient client) | Destroy client. In-flight requests get error callbacks. | Not thread-safe |

TLS Configuration

TLS is configured at client creation time via xHttpClientConf. The xTlsConf fields are deep-copied internally; the caller does not need to keep them alive after creation.

xTlsConf Fields (Client)

| Field | Type | Description |
|---|---|---|
| ca | const char * | Path to a CA certificate file for server verification. When set, the system CA bundle is bypassed. |
| cert | const char * | Path to a client certificate file (PEM) for mutual TLS (mTLS). |
| key | const char * | Path to the client private key file (PEM) for mTLS. |
| key_password | const char * | Passphrase for an encrypted client private key. |
| skip_verify | int | If non-zero, skip server certificate verification (useful for self-signed certs in development). |


Convenience Requests

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientGet | xErrno xHttpClientGet(xHttpClient client, const char *url, xHttpResponseFunc on_response, void *arg) | Async GET request. | Not thread-safe |
| xHttpClientPost | xErrno xHttpClientPost(xHttpClient client, const char *url, const char *body, size_t body_len, xHttpResponseFunc on_response, void *arg) | Async POST request. Body is copied internally. | Not thread-safe |

Generic Request

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientDo | xErrno xHttpClientDo(xHttpClient client, const xHttpRequestConf *config, xHttpResponseFunc on_response, void *arg) | Fully-configured async request. | Not thread-safe |

SSE Requests

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientGetSse | xErrno xHttpClientGetSse(xHttpClient client, const char *url, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Subscribe to SSE endpoint (GET). | Not thread-safe |
| xHttpClientDoSse | xErrno xHttpClientDoSse(xHttpClient client, const xHttpRequestConf *config, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Fully-configured SSE request (e.g., POST for LLM APIs). | Not thread-safe |

Usage Examples

Simple GET Request

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    if (resp->curl_code == 0) {
        printf("HTTP %ld\n", resp->status_code);
        printf("%.*s\n", (int)resp->body_len, resp->body);
    } else {
        printf("Error: %s\n", resp->curl_error);
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);

    xHttpClientGet(client, "https://httpbin.org/get", on_response, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

HTTPS with TLS Configuration

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    printf("Status: %ld\n", resp->status_code);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Skip certificate verification (dev only)
    xTlsConf tls = {0};
    tls.skip_verify = 1;
    xHttpClientConf conf = {.tls = &tls};
    xHttpClient client = xHttpClientCreate(loop, &conf);

    xHttpClientGet(client, "https://secure.example.com/api",
                   on_response, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

POST with Custom Headers

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    printf("Status: %ld, Body: %.*s\n",
           resp->status_code, (int)resp->body_len, resp->body);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);

    const char *headers[] = {
        "Content-Type: application/json",
        "Authorization: Bearer token123",
        NULL
    };

    xHttpRequestConf config = {
        .url       = "https://api.example.com/data",
        .method    = xHttpMethod_POST,
        .body      = "{\"key\": \"value\"}",
        .body_len  = 16,
        .headers   = headers,
        .timeout_ms = 5000,
    };

    xHttpClientDo(client, &config, on_response, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

Use Cases

  1. REST API Integration — Make async HTTP calls to microservices, cloud APIs, or webhooks from an event-driven C application.

  2. Secure Communication — Pass TLS config via xHttpClientConf at creation time to configure custom CA certificates, client certificates for mTLS, or skip verification for development environments with self-signed certs.

  3. LLM API Calls — Use xHttpClientDoSse() with POST method and JSON body to stream responses from OpenAI, Anthropic, or other LLM APIs. See sse.md for a complete example.

  4. Health Checks / Monitoring — Periodically poll HTTP endpoints using timer-driven GET requests within the event loop.

Best Practices

  • Don't block in callbacks. Callbacks run on the event loop thread. Blocking delays all other I/O.
  • Copy data you need to keep. Response pointers (body, headers) are only valid during the callback.
  • Use xHttpClientDo() for complex requests. The convenience helpers (Get/Post) are for simple cases; Do gives full control over method, headers, body, and timeout.
  • Destroy the client before the event loop. xHttpClientDestroy() cancels in-flight requests and invokes their callbacks with error status.
  • Check curl_code first. A curl_code of 0 means the HTTP transfer succeeded; then check status_code for the HTTP-level result.
  • Never use skip_verify in production. It disables all certificate validation. Use a proper CA path or system CA bundle instead.
  • TLS config is set at creation time. Pass xHttpClientConf with TLS settings when creating the client; it affects both oneshot and SSE requests. To change TLS config, destroy and recreate the client.

Comparison with Other Libraries

| Feature | xhttp client.h | libcurl easy API | cpp-httplib | Python requests |
|---|---|---|---|---|
| I/O Model | Async (event loop) | Blocking | Blocking | Blocking |
| Event Loop | xEventLoop integration | None (or manual multi) | None | None (asyncio separate) |
| SSE Support | Built-in (GetSse/DoSse) | Manual parsing | No | No (needs sseclient) |
| TLS Config | xHttpClientConf.tls at creation | curl_easy_setopt (manual) | Built-in | verify/cert params |
| Thread Model | Single-threaded callbacks | One thread per request | One thread per request | One thread per request |
| Memory | Automatic (xBuffer) | Manual (WRITEFUNCTION) | Automatic (std::string) | Automatic (Python GC) |
| Language | C99 | C | C++ | Python |

Key Differentiator: xhttp provides true event-loop-integrated async HTTP with built-in SSE support. Unlike libcurl's easy API (which blocks) or multi-perform API (which requires polling), xhttp uses the multi-socket API for zero-overhead integration with xEventLoop. The built-in SSE parser makes it uniquely suited for LLM API integration from C.

server.h — Asynchronous HTTP/1.1 & HTTP/2 Server

Introduction

server.h provides xHttpServer, an asynchronous, non-blocking HTTP server powered by xbase's event loop. The server supports both HTTP/1.1 and HTTP/2 (h2c, cleartext) on the same port, with automatic protocol detection via Prior Knowledge. The protocol parsing layer is abstracted behind an xHttpProto vtable interface — HTTP/1.1 uses llhttp, HTTP/2 uses nghttp2. All connection handling, request parsing, and response sending are driven by the event loop on a single thread — no locks or thread pools required. The server supports routing, keep-alive, configurable limits, automatic error responses, and TLS/HTTPS via xHttpServerListenTls() with pluggable TLS backends (OpenSSL or Mbed TLS).

Design Philosophy

  1. Single-Threaded Event-Driven I/O — The server registers listening and client sockets with xEventLoop. Accept, read, parse, dispatch, and write all happen on the event loop thread, eliminating synchronization overhead.

  2. Protocol-Abstracted Parsing — Request parsing is delegated to a protocol handler behind the xHttpProto vtable interface. HTTP/1.1 (proto_h1.c) uses llhttp; HTTP/2 (proto_h2.c) uses nghttp2. Incremental callbacks accumulate URL, headers, and body into xBuffer instances. This abstraction allows both protocols to share the same connection management, routing, and response serialization layers.

  3. Automatic Protocol Detection — On each new connection, the server inspects the first bytes of incoming data. If the 24-byte HTTP/2 connection preface (PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n) is detected, the connection is upgraded to HTTP/2; otherwise, HTTP/1.1 is used. This enables h2c (cleartext HTTP/2) via Prior Knowledge — ideal for internal service-to-service communication.

  4. First-Match Routing — Routes are registered as pattern strings (e.g. "GET /users/:id" or "/any") and matched in registration order. If the pattern starts with /, it matches any HTTP method; otherwise the first token is the method. Path patterns support both exact segments and :param segments.

  5. Writer-Based Response API — Handlers receive an xHttpResponseWriter handle to set status, headers, and body. The response is serialized into an xIOBuffer and flushed asynchronously, with backpressure handled automatically.

  6. Defensive Limits — Configurable limits on header size (default 8 KiB), body size (default 1 MiB), and idle timeout (default 60 s) protect against slow clients and oversized payloads. Violations produce appropriate 4xx error responses.

  7. Pluggable TLS — TLS support is provided via xHttpServerListenTls() with xTlsConf. The TLS backend (OpenSSL or Mbed TLS) is selected at compile time via XK_TLS_BACKEND. ALPN negotiation automatically selects HTTP/1.1 or HTTP/2 over TLS. Mutual TLS (mTLS) is supported when ca is set (verification is enabled by default).

Architecture

graph TD
    subgraph "Application"
        APP["User Code"]
        HANDLER["Handler Callback"]
    end

    subgraph "xhttp Server"
        SERVER["xHttpServer"]
        TLS["TLS Layer<br/>(OpenSSL / Mbed TLS)"]
        ROUTER["Route Table<br/>(linked list)"]
        CONN["xHttpConn_<br/>(per connection)"]
        DETECT["Protocol Detection<br/>(Prior Knowledge / ALPN)"]
        PROTO["xHttpProto (vtable)"]
        PARSER_H1["proto_h1 (llhttp)"]
        PARSER_H2["proto_h2 (nghttp2)"]
        STREAM["xHttpStream_<br/>(per request)"]
        WRITER["xHttpResponseWriter"]
    end

    subgraph "xbase"
        LOOP["xEventLoop"]
        SOCK["xSocket"]
        TIMER["Idle Timeout"]
    end

    APP -->|"xHttpServerRoute"| ROUTER
    APP -->|"xHttpServerListen<br/>xHttpServerListenTls"| SERVER
    SERVER -->|"accept()"| CONN
    SERVER -.->|"TLS handshake"| TLS
    TLS -.-> CONN
    CONN --> DETECT
    DETECT -->|"H1"| PARSER_H1
    DETECT -->|"H2 preface"| PARSER_H2
    PARSER_H1 --> PROTO
    PARSER_H2 --> PROTO
    PROTO -->|"request complete"| STREAM
    STREAM --> ROUTER
    ROUTER -->|"first match"| HANDLER
    HANDLER -->|"xHttpResponseSend"| WRITER
    WRITER --> STREAM
    STREAM -->|"H1: xIOBuffer / H2: nghttp2 frames"| CONN
    CONN --> SOCK
    SOCK --> LOOP
    TIMER --> LOOP

    style SERVER fill:#4a90d9,color:#fff
    style LOOP fill:#50b86c,color:#fff
    style PROTO fill:#9b59b6,color:#fff
    style PARSER_H1 fill:#f5a623,color:#fff
    style PARSER_H2 fill:#e74c3c,color:#fff
    style DETECT fill:#1abc9c,color:#fff
    style TLS fill:#2ecc71,color:#fff

Implementation Details

Connection Lifecycle

stateDiagram-v2
    [*] --> Accepted: accept() on listen fd
    Accepted --> Reading: xSocket registered (Read)
    Reading --> Parsing: Data received
    Parsing --> Dispatching: on_message_complete
    Dispatching --> HandlerRunning: Route matched
    Dispatching --> ErrorSent: No match (404/405)
    HandlerRunning --> ResponseQueued: xHttpResponseSend()
    ResponseQueued --> Flushing: conn_try_flush()
    Flushing --> KeepAlive: All written + keep-alive
    Flushing --> Backpressure: EAGAIN (register Write)
    Backpressure --> Flushing: Write event fires
    KeepAlive --> Reading: Reset parser state
    Flushing --> Closed: All written + !keep-alive
    ErrorSent --> Closed: Error responses close connection

    Reading --> Closed: Idle timeout
    Reading --> Closed: Client disconnect
    Reading --> Closed: Parse error (400)
    Parsing --> ErrorSent: Header too large (431)
    Parsing --> ErrorSent: Body too large (413)

Request Parsing Flow

sequenceDiagram
    participant Client
    participant Conn as xHttpConn_
    participant Proto as xHttpProto (vtable)
    participant Parser as proto_h1 (llhttp)
    participant Bufs as xBuffer (url/headers/body)
    participant Router as Route Table
    participant Handler as User Handler

    Client->>Conn: TCP data
    Conn->>Conn: xIOBufferReadFd()
    Conn->>Proto: proto.on_data(data)
    Proto->>Parser: llhttp_execute(data)
    Parser->>Bufs: on_url → xBufferAppend(url)
    Parser->>Bufs: on_header_field → xBufferAppend(headers_raw)
    Parser->>Bufs: on_header_value → xBufferAppend(headers_raw)
    Parser->>Bufs: on_body → xBufferAppend(body)
    Parser->>Proto: on_message_complete → return 1
    Proto->>Conn: return 1 (request complete)
    Conn->>Router: conn_dispatch_request()
    Router->>Handler: handler(writer, req, arg)
    Handler->>Conn: xHttpResponseSend(body)
    Conn->>Client: HTTP response (async flush)

Routing

Routes are stored in a singly-linked list and matched in registration order (first match wins):

  1. Path match — Segment-by-segment comparison. Static segments require exact match; :param segments match any non-empty string and capture the value.
  2. Method match — Case-insensitive comparison (strcasecmp). A pattern without a method prefix (e.g. "/any") matches any HTTP method.
  3. Fallback — If the path matches but no method matches → 405 Method Not Allowed. If no path matches → 404 Not Found.
  4. Parameter access — Inside a handler, call xHttpRequestParam(req, "id", &len) to retrieve the captured value.

Response Serialization

When xHttpResponseSend() is called:

  1. Status line (HTTP/1.1 <code> <reason>\r\n) is written to the xIOBuffer.
  2. Content-Length header is added automatically.
  3. Connection: keep-alive or Connection: close is added based on the parser's determination.
  4. User-set headers are appended.
  5. Header section is terminated with \r\n.
  6. Body is appended.
  7. conn_try_flush() attempts an immediate writev(). If EAGAIN, the socket is registered for write events and flushing continues asynchronously.

Keep-Alive & Pipelining

  • HTTP/1.1 connections default to keep-alive. After a response is fully flushed, proto.reset() is called and the connection waits for the next request.
  • The parser is paused in on_message_complete to prevent parsing the next pipelined request before the current response is sent.
  • Error responses always set Connection: close.

HTTP/2 Support (h2c Prior Knowledge)

The server supports cleartext HTTP/2 (h2c) via the Prior Knowledge mechanism. HTTP/1.1 and HTTP/2 coexist on the same port — no TLS or Upgrade header required.

Protocol Detection

When a new connection is accepted, protocol detection is deferred until the first bytes arrive:

  1. If the first 24 bytes match the HTTP/2 connection preface (PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n), xHttpProtoH2Init() is called.
  2. If the prefix doesn't match, xHttpProtoH1Init() is called.
  3. If fewer than 24 bytes have arrived but the prefix still matches so far, the server waits for more data before deciding.
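
The three-way decision above can be sketched as a tri-state check against the 24-byte preface. DetectResult and detect_protocol are invented names for illustration, not the xhttp API:

```c
#include <string.h>
#include <stddef.h>

/* Tri-state result of protocol detection on the first bytes. */
typedef enum { DETECT_H1, DETECT_H2, DETECT_NEED_MORE } DetectResult;

static const char H2_PREFACE[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"; /* 24 bytes */

/* Compare whatever has arrived so far against the HTTP/2 preface:
 * any divergence means HTTP/1.x; a full 24-byte match means HTTP/2;
 * a matching-but-short prefix means wait for more data. */
static DetectResult detect_protocol(const char *data, size_t len) {
    size_t n = len < 24 ? len : 24;
    if (memcmp(data, H2_PREFACE, n) != 0)
        return DETECT_H1;
    return len >= 24 ? DETECT_H2 : DETECT_NEED_MORE;
}
```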

Stream Multiplexing

Under HTTP/2, a single TCP connection carries multiple concurrent streams, each representing an independent request/response exchange:

  • xHttpStream_ — Per-request state (URL, headers, body, response writer). HTTP/1.1 uses a single implicit stream (stream_id = 0); HTTP/2 creates a new stream for each request.
  • Deferred dispatch — Completed streams are queued during nghttp2_session_mem_recv() and dispatched after it returns, avoiding re-entrancy issues.
  • Response framing — Responses are submitted via nghttp2_submit_response() with HPACK-compressed headers and DATA frames, then flushed through the connection's write buffer.

H2 Connection Lifecycle

sequenceDiagram
    participant Client
    participant Conn as xHttpConn_
    participant Detect as Protocol Detection
    participant H2 as proto_h2 (nghttp2)
    participant Stream as xHttpStream_
    participant Router as Route Table
    participant Handler as User Handler

    Client->>Conn: TCP connect
    Client->>Conn: H2 connection preface + SETTINGS
    Conn->>Detect: First bytes inspection
    Detect->>H2: xHttpProtoH2Init()
    H2->>Client: SETTINGS frame (server preface)
    Client->>Conn: HEADERS frame (stream 1, :method=GET, :path=/hello)
    Conn->>H2: h2_on_data()
    H2->>Stream: Create stream (id=1)
    H2->>Stream: Accumulate headers
    H2->>Router: Dispatch (END_STREAM received)
    Router->>Handler: handler(writer, req, arg)
    Handler->>Stream: xHttpResponseSend(body)
    Stream->>H2: nghttp2_submit_response()
    H2->>Client: HEADERS + DATA frames

Key Differences: H1 vs H2

| Feature | HTTP/1.1 (proto_h1) | HTTP/2 (proto_h2) |
|---|---|---|
| Parser | llhttp (byte stream → request) | nghttp2 (byte stream → frame → stream) |
| Multiplexing | None (pipelining at best) | Native, multiple concurrent streams |
| Headers | Plain text Key: Value | HPACK compressed pseudo-headers + regular headers |
| Keep-alive | Connection: keep-alive header | Always persistent (multiplexed) |
| Reset | Per-request proto.reset() | No-op (streams are independent) |
| Response framing | Raw HTTP/1.1 status line + headers + body | nghttp2_submit_response() → HEADERS + DATA frames |
| Flow control | None | Built-in per-stream flow control |

Limitations

  • h2 over TLS — TLS-based HTTP/2 (h2 with ALPN) is supported via xHttpServerListenTls(). Cleartext h2c uses Prior Knowledge.
  • No server push — HTTP/2 server push is not implemented.
  • Streaming responses — xHttpResponseWrite()/xHttpResponseEnd() for HTTP/2 streaming DATA frames is not yet fully implemented.

Idle Timeout

Each connection has an idle timeout (default 60 s). If no data is received within this period, the connection is closed automatically via xEvent_Timeout. The timeout is reset after each response is sent on a keep-alive connection.

API Reference

Types

| Type | Description |
|---|---|
| xHttpServer | Opaque handle to an HTTP server bound to an event loop |
| xHttpResponseWriter | Opaque handle to a response writer (valid only during handler) |
| xHttpRequest | Request data delivered to the handler callback |
| xHttpHandlerFunc | void (*)(xHttpResponseWriter writer, const xHttpRequest *req, void *arg) |
| xTlsConf | TLS configuration for HTTPS listeners (cert, key, ca, skip_verify) |

xHttpRequest Fields

| Field | Type | Description |
|---|---|---|
| method | const char * | HTTP method string (e.g. "GET", "POST") |
| url | const char * | Request URL / path (NUL-terminated) |
| headers | const char * | Raw request headers (NUL-terminated) |
| headers_len | size_t | Length of headers in bytes |
| body | const char * | Request body, or NULL if no body |
| body_len | size_t | Length of body in bytes |

All pointers are valid only for the duration of the handler callback.

Lifecycle

| Function | Signature | Description |
|---|---|---|
| xHttpServerCreate | xHttpServer xHttpServerCreate(xEventLoop loop) | Create a server bound to an event loop. |
| xHttpServerListen | xErrno xHttpServerListen(xHttpServer server, const char *host, uint16_t port) | Start listening on the given address and port. |
| xHttpServerListenTls | xErrno xHttpServerListenTls(xHttpServer server, const char *host, uint16_t port, const xTlsConf *config) | Start listening for HTTPS connections with TLS. ALPN selects H1/H2. Can coexist with Listen on a different port. Returns xErrno_NotSupported if no TLS backend was compiled. |
| xHttpServerDestroy | void xHttpServerDestroy(xHttpServer server) | Destroy the server, close all connections, free all routes. |

Route Registration

| Function | Signature | Description |
|---|---|---|
| xHttpServerRoute | xErrno xHttpServerRoute(xHttpServer server, const char *pattern, xHttpHandlerFunc handler, void *arg) | Register a route. pattern combines method and path: "GET /users/:id" matches only GET; "/users/:id" matches all methods. Path supports :param segments. First match wins. |

Request Parameters

| Function | Signature | Description |
|---|---|---|
| xHttpRequestParam | const char *xHttpRequestParam(const xHttpRequest *req, const char *name, size_t *len) | Look up a path parameter by name. Returns a pointer to the value (NOT NUL-terminated) and sets *len, or returns NULL if not found. |

Response

| Function | Signature | Description |
|---|---|---|
| xHttpResponseSetStatus | void xHttpResponseSetStatus(xHttpResponseWriter writer, int code) | Set the HTTP status code (default 200). |
| xHttpResponseSetHeader | xErrno xHttpResponseSetHeader(xHttpResponseWriter writer, const char *key, const char *value) | Add a response header. Call before Send or the first Write. |
| xHttpResponseSend | xErrno xHttpResponseSend(xHttpResponseWriter writer, const char *body, size_t body_len) | Send a complete response. May only be called once. Mutually exclusive with Write. |
| xHttpResponseWrite | xErrno xHttpResponseWrite(xHttpResponseWriter writer, const char *data, size_t len) | Write data to a streaming response. The first call flushes headers (no Content-Length). Mutually exclusive with Send. |
| xHttpResponseEnd | void xHttpResponseEnd(xHttpResponseWriter writer) | End a streaming response. Optional — called automatically when the handler returns. |

Configuration

| Function | Signature | Description | Default |
|---|---|---|---|
| xHttpServerSetIdleTimeout | xErrno xHttpServerSetIdleTimeout(xHttpServer server, int timeout_ms) | Set the idle timeout for connections. | 60000 ms |
| xHttpServerSetMaxHeaderSize | xErrno xHttpServerSetMaxHeaderSize(xHttpServer server, size_t max_size) | Set the maximum header size. Exceeding it → 431. | 8192 bytes |
| xHttpServerSetMaxBodySize | xErrno xHttpServerSetMaxBodySize(xHttpServer server, size_t max_size) | Set the maximum body size. Exceeding it → 413. | 1048576 bytes |

All configuration functions must be called before xHttpServerListen() / xHttpServerListenTls().

TLS Configuration

xTlsConf Fields (Server)

| Field | Type | Description |
|---|---|---|
| cert | const char * | Path to the PEM certificate file (required). |
| key | const char * | Path to the PEM private key file (required). |
| ca | const char * | Path to a CA certificate file for client verification (optional). |
| skip_verify | int | If non-zero, skip peer verification. Default 0 (verify enabled). |

When ca is set and skip_verify is 0 (default), the server performs mutual TLS (mTLS) — clients must present a valid certificate signed by the specified CA.

Usage Examples

Minimal Server

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "text/plain");
    xHttpResponseSend(w, "Hello, World!\n", 14);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /hello", on_hello, NULL);
    xHttpServerListen(server, "0.0.0.0", 8080);

    printf("Listening on :8080\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

JSON API with POST

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_echo(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "application/json");
    xHttpResponseSend(w, req->body, req->body_len);
}

static void on_not_found(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    const char *body = "{\"error\": \"not found\"}";
    xHttpResponseSetStatus(w, 404);
    xHttpResponseSetHeader(w, "Content-Type", "application/json");
    xHttpResponseSend(w, body, strlen(body));
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerSetMaxBodySize(server, 4 * 1024 * 1024); /* 4 MiB */

    xHttpServerRoute(server, "POST /echo", on_echo, NULL);

    xHttpServerListen(server, NULL, 9090);
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

Server-Sent Events (SSE)

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_events(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "text/event-stream");
    xHttpResponseSetHeader(w, "Cache-Control", "no-cache");

    xHttpResponseWrite(w, "data: hello\n\n", 13);
    xHttpResponseWrite(w, "data: world\n\n", 13);
    /* xHttpResponseEnd(w) is optional; auto-called on return */
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /events", on_events, NULL);

    xHttpServerListen(server, NULL, 8080);
    printf("SSE server on :8080/events\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

RESTful API with Path Parameters

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_get_user(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)arg;
    size_t id_len = 0;
    const char *id = xHttpRequestParam(req, "id", &id_len);

    char body[128];
    int len = snprintf(body, sizeof(body),
                       "{\"user_id\": \"%.*s\"}\n", (int)id_len, id);

    xHttpResponseSetHeader(w, "Content-Type", "application/json");
    xHttpResponseSend(w, body, (size_t)len);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /users/:id", on_get_user, NULL);

    xHttpServerListen(server, NULL, 8080);
    printf("REST API on :8080\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

HTTPS Server

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "text/plain");
    xHttpResponseSend(w, "Hello, HTTPS!\n", 14);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /hello", on_hello, NULL);

    // TLS configuration
    xTlsConf tls = {
        .cert = "/path/to/server.pem",
        .key  = "/path/to/server-key.pem",
    };
    xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

    printf("HTTPS server on :8443\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

HTTPS Server with Mutual TLS (mTLS)

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_secure(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "text/plain");
    xHttpResponseSend(w, "mTLS verified!\n", 15);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /secure", on_secure, NULL);

    // Require client certificates
    xTlsConf tls = {
        .cert     = "/path/to/server.pem",
        .key      = "/path/to/server-key.pem",
        .ca       = "/path/to/ca.pem",
    };
    xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

    printf("mTLS server on :8443\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

HTTP + HTTPS on Different Ports

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_hello(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSend(w, "Hello!\n", 7);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /hello", on_hello, NULL);

    // Serve HTTP on port 8080
    xHttpServerListen(server, "0.0.0.0", 8080);

    // Serve HTTPS on port 8443
    xTlsConf tls = {
        .cert = "/path/to/server.pem",
        .key  = "/path/to/server-key.pem",
    };
    xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

    printf("HTTP on :8080, HTTPS on :8443\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

Multiple Routes with Shared State

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/server.h>

typedef struct {
    int counter;
} AppState;

static void on_count(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req;
    AppState *state = (AppState *)arg;
    state->counter++;

    char body[64];
    int len = snprintf(body, sizeof(body), "{\"count\": %d}\n", state->counter);

    xHttpResponseSetHeader(w, "Content-Type", "application/json");
    xHttpResponseSend(w, body, (size_t)len);
}

static void on_health(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSend(w, "ok\n", 3);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    AppState state = { .counter = 0 };

    xHttpServerRoute(server, "POST /count", on_count, &state);
    xHttpServerRoute(server, "GET /health", on_health, NULL);

    xHttpServerListen(server, NULL, 8080);
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

Best Practices

  • Don't block in handlers. Handlers run on the event loop thread. Blocking delays all other connections.
  • Always call xHttpResponseSend() or xHttpResponseWrite(). If the handler returns without sending, a default 200 OK with empty body is sent automatically — but it's better to be explicit.
  • Don't mix Send and Write. xHttpResponseSend() is for one-shot responses; xHttpResponseWrite() is for streaming. They are mutually exclusive — calling one after the other returns xErrno_InvalidState.
  • Configure limits before listening. SetIdleTimeout, SetMaxHeaderSize, and SetMaxBodySize must be called before xHttpServerListen() / xHttpServerListenTls().
  • Register routes before listening. Routes should be set up before the server starts accepting connections.
  • Use xHttpServerListenTls() for HTTPS. Provide valid PEM certificate and key files. For mTLS, set ca (verification is enabled by default).
  • Serve HTTP and HTTPS on different ports. Call both xHttpServerListen() and xHttpServerListenTls() on the same server instance to support both protocols simultaneously.
  • Destroy server before event loop. xHttpServerDestroy() closes all connections and frees all resources.
  • Copy data you need to keep. xHttpRequest pointers (url, headers, body) are only valid during the handler callback.

Comparison with Other Libraries

| Feature | xhttp server.h | libuv + http-parser | libmicrohttpd | Go net/http | Node.js http |
|---|---|---|---|---|---|
| I/O Model | Async (event loop) | Async (event loop) | Threaded / select | Goroutines | Async (event loop) |
| Event Loop | xEventLoop integration | libuv | Internal | Go runtime | libuv (V8) |
| HTTP Parser | llhttp (H1) + nghttp2 (H2) | http-parser / llhttp | Internal | Internal | llhttp |
| Streaming Response | Built-in (Write/End) | Manual | Manual | Built-in (Flusher) | Built-in (write/end) |
| Routing | Built-in (first match) | None (manual) | None (manual) | Built-in (ServeMux) | None (manual) |
| Keep-Alive | Automatic | Manual | Automatic | Automatic | Automatic |
| Thread Model | Single-threaded | Single-threaded | Multi-threaded | Multi-goroutine | Single-threaded |
| TLS/HTTPS | Built-in (ListenTls, mTLS) | Manual (libuv + OpenSSL) | Built-in | Built-in (ListenAndServeTLS) | Built-in (https.createServer) |
| Language | C99 | C | C | Go | JavaScript |

Key Differentiator: xhttp server provides a complete, single-threaded HTTP/1.1 & HTTP/2 server with built-in routing, streaming responses, TLS/HTTPS, and automatic keep-alive — all integrated with xEventLoop. HTTP/1.1 and HTTP/2 coexist on the same port via automatic protocol detection (Prior Knowledge for cleartext, ALPN for TLS). Unlike libuv + http-parser (which requires manual response assembly and TLS integration) or libmicrohttpd (which uses threads), xhttp keeps everything on one thread with zero synchronization overhead. The TLS layer supports mutual TLS (mTLS) with client certificate verification, and the streaming API (xHttpResponseWrite/xHttpResponseEnd) makes it straightforward to implement SSE or chunked streaming without external dependencies.

Relationship with Other Modules

  • xbase — Uses xEventLoop for I/O multiplexing, xSocket for non-blocking socket management, and socket timeouts for idle connection detection.
  • xbuf — Uses xBuffer for request parsing accumulation (URL, headers, body) and xIOBuffer for read/write buffering with scatter-gather I/O.
  • llhttp — External dependency. Provides incremental HTTP/1.1 request parsing via callbacks, isolated behind the xHttpProto vtable in proto_h1.c.
  • nghttp2 — External dependency. Provides HTTP/2 frame processing, HPACK header compression, and stream management, isolated behind the xHttpProto vtable in proto_h2.c.
  • OpenSSL / Mbed TLS — External dependency (TLS backend, compile-time selection via XK_TLS_BACKEND). Provides TLS handshake, encryption, certificate verification, and ALPN negotiation for xHttpServerListenTls().

ws.h — WebSocket Server

Introduction

ws.h provides a callback-driven WebSocket interface integrated with the xhttp server. For pure WebSocket services, call xWsServe() to create a server in one line. For mixed HTTP + WebSocket endpoints, call xWsUpgrade() inside a regular HTTP handler to perform the RFC 6455 upgrade handshake. The library handles frame encoding and decoding, ping/pong, fragment reassembly, and close negotiation automatically.

All callbacks are dispatched on the event loop thread — no locks or thread pools required.

Design Philosophy

  1. Handler-Initiated Upgrade — WebSocket connections start as regular HTTP requests. The user calls xWsUpgrade() inside an xHttpHandlerFunc to perform the upgrade. This keeps routing unified: WebSocket endpoints are just HTTP routes.

  2. Callback-Driven I/O — Three optional callbacks (on_open, on_message, on_close) cover the full connection lifecycle. The library handles all framing, masking, and control frames internally.

  3. Automatic Protocol Handling — Ping/pong is answered automatically. Fragmented messages are reassembled before delivery. Close handshake follows RFC 6455 §5.5.1 with a 5-second timeout for the peer's response.

  4. Connection Hijacking — On successful upgrade, the HTTP connection's socket and transport layer are transferred to a new xWsConn object. The HTTP connection is destroyed; the WebSocket connection takes full ownership of the file descriptor.

  5. Pluggable Crypto Backend — The handshake requires SHA-1 and Base64 for Sec-WebSocket-Accept computation. The crypto backend is selected at compile time: OpenSSL, Mbed TLS, or a built-in implementation.

Architecture

graph TD
    subgraph "Application"
        APP["User Code"]
        HANDLER["HTTP Handler"]
        WS_CBS["xWsCallbacks"]
    end

    subgraph "xhttp WebSocket"
        UPGRADE["xWsUpgrade()"]
        HANDSHAKE["Handshake<br/>(RFC 6455 §4)"]
        CRYPTO["SHA-1 + Base64<br/>(pluggable backend)"]
        WSCONN["xWsConn"]
        PARSER["Frame Parser<br/>(incremental)"]
        ENCODER["Frame Encoder"]
        FRAG["Fragment<br/>Reassembly"]
        CTRL["Control Frames<br/>(Ping/Pong/Close)"]
    end

    subgraph "xbase"
        LOOP["xEventLoop"]
        SOCK["xSocket"]
        TIMER["Idle Timer"]
    end

    APP -->|"xHttpServerRoute"| HANDLER
    HANDLER -->|"xWsUpgrade(w, req, cbs)"| UPGRADE
    UPGRADE --> HANDSHAKE
    HANDSHAKE --> CRYPTO
    HANDSHAKE -->|"101 Switching Protocols"| WSCONN
    WSCONN --> PARSER
    WSCONN --> ENCODER
    PARSER --> FRAG
    PARSER --> CTRL
    FRAG -->|"on_message"| WS_CBS
    CTRL -->|"auto pong"| ENCODER
    WSCONN --> SOCK
    SOCK --> LOOP
    TIMER --> LOOP

    style WSCONN fill:#4a90d9,color:#fff
    style LOOP fill:#50b86c,color:#fff
    style PARSER fill:#9b59b6,color:#fff
    style HANDSHAKE fill:#f5a623,color:#fff

Implementation Details

Upgrade Handshake Flow

sequenceDiagram
    participant Client as Browser
    participant Handler as HTTP Handler
    participant Upgrade as xWsUpgrade()
    participant Conn as xHttpConn_
    participant WS as xWsConn

    Client->>Handler: GET /ws (Upgrade: websocket)
    Handler->>Upgrade: xWsUpgrade(w, req, &cbs, arg)
    Upgrade->>Upgrade: Validate headers
    Note over Upgrade: Method=GET<br/>Upgrade: websocket<br/>Connection: Upgrade<br/>Sec-WebSocket-Version: 13<br/>Sec-WebSocket-Key: ...
    Upgrade->>Upgrade: SHA1(Key + GUID) → Base64
    Upgrade->>Client: 101 Switching Protocols
    Upgrade->>Conn: Hijack socket + transport
    Upgrade->>WS: xWsConnCreate()
    WS->>Client: on_open callback fires

Connection Lifecycle

stateDiagram-v2
    [*] --> Open: xWsUpgrade() succeeds
    Open --> Open: Data frames (text/binary)
    Open --> Open: Ping → auto Pong
    Open --> CloseSent: xWsClose() called
    Open --> CloseReceived: Peer sends Close
    CloseSent --> Closed: Peer Close received
    CloseSent --> Closed: 5s timeout
    CloseReceived --> Closed: Echo Close flushed
    Open --> Closed: I/O error
    Open --> CloseSent: Idle timeout (1001)
    Closed --> [*]: on_close + destroy

Frame Processing

When data arrives on the socket, the incremental frame parser (xWsFrameParser) extracts complete frames from the xIOBuffer. Each frame is processed based on its opcode:

| Opcode | Handling |
|---|---|
| Text (0x1) | Deliver via on_message |
| Binary (0x2) | Deliver via on_message |
| Continuation (0x0) | Append to fragment buffer |
| Ping (0x9) | Auto-reply with Pong |
| Pong (0xA) | Ignored |
| Close (0x8) | Close handshake |

Fragment Reassembly

Fragmented messages are reassembled transparently:

  1. First fragment (FIN=0, opcode=Text/Binary) starts accumulation in frag_buf.
  2. Continuation frames (opcode=0x0) append to frag_buf.
  3. Final fragment (FIN=1, opcode=0x0) triggers reassembly and delivers the complete message via on_message.

Protocol violations (e.g., new message mid-fragment) result in a Close frame with status 1002.
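
These rules (including the violation case) can be sketched as a small state machine. The buffer size and names below are illustrative, not the xhttp internals:

```c
#include <stddef.h>
#include <string.h>

/* Minimal sketch of fragment reassembly. */
typedef struct {
    char   buf[1024];   /* stand-in for frag_buf */
    size_t len;
    int    in_progress; /* a fragmented message has started */
} FragState;

/* Returns 1 when a complete message is ready in fs->buf, 0 when more
 * fragments are expected, -1 on a protocol violation (Close 1002). */
static int frag_feed(FragState *fs, int fin, int opcode,
                     const char *payload, size_t n) {
    if (opcode != 0x0) {                 /* Text/Binary: a new message */
        if (fs->in_progress) return -1;  /* new message mid-fragment */
        fs->len = 0;
        if (!fin) fs->in_progress = 1;   /* first fragment (FIN=0) */
    } else if (!fs->in_progress) {
        return -1;                       /* continuation without a start */
    }
    memcpy(fs->buf + fs->len, payload, n);
    fs->len += n;
    if (fin) { fs->in_progress = 0; return 1; }
    return 0;
}
```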

Close State Machine

XDEF_ENUM(xWsCloseState){
    xWsCloseState_Open,          // Normal operating state
    xWsCloseState_CloseSent,     // We sent Close, waiting for peer
    xWsCloseState_CloseReceived, // Peer sent Close, we replied
    xWsCloseState_Closed,        // Connection fully closed
};
  • Server-initiated close: xWsClose() sends a Close frame and transitions to CLOSE_SENT. A 5-second timer waits for the peer's Close response.
  • Peer-initiated close: The peer's Close frame is echoed back, transitioning to CLOSE_RECEIVED. After the echo is flushed, on_close fires and the connection is destroyed.
  • Idle timeout: After the configured idle period with no data, a Close frame with code 1001 (Going Away) is sent.
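
The transitions above (and in the earlier state diagram) can be written as a pure transition function. The event and state names here are invented for illustration and do not mirror the xhttp internals:

```c
/* Events driving the close state machine. */
typedef enum {
    Ev_LocalClose,   /* xWsClose() called */
    Ev_PeerClose,    /* Close frame received from the peer */
    Ev_Timeout,      /* 5 s close timer or idle timer fired */
    Ev_IoError
} CloseEvent;

typedef enum { St_Open, St_CloseSent, St_CloseReceived, St_Closed } CloseState;

/* Illustrative transition function matching the bullet points above. */
static CloseState close_next(CloseState s, CloseEvent ev) {
    switch (s) {
    case St_Open:
        if (ev == Ev_LocalClose) return St_CloseSent;
        if (ev == Ev_PeerClose)  return St_CloseReceived;
        if (ev == Ev_IoError)    return St_Closed;
        if (ev == Ev_Timeout)    return St_CloseSent; /* idle: Close 1001 sent */
        return s;
    case St_CloseSent:
        if (ev == Ev_PeerClose || ev == Ev_Timeout) return St_Closed;
        return s;
    case St_CloseReceived:
        return St_Closed; /* once the echoed Close is flushed */
    default:
        return St_Closed;
    }
}
```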

Internal File Structure

| File | Role |
|---|---|
| ws.h | Public API (types, callbacks, functions) |
| ws.c | Connection lifecycle, I/O, frame dispatch |
| ws_handshake_server.c | Server upgrade handshake (RFC 6455 §4.2) |
| ws_frame.h/c | Frame codec (parse + encode) |
| ws_crypto.h | SHA-1 + Base64 interface |
| ws_crypto_openssl.c | OpenSSL backend |
| ws_crypto_mbedtls.c | Mbed TLS backend |
| ws_crypto_builtin.c | Built-in (no TLS dependency) |
| ws_serve.c | xWsServe() convenience wrapper |
| ws_private.h | Internal data structures |

API Reference

Types

| Type | Description |
|---|---|
| xWsConn | Opaque WebSocket connection handle |
| xWsOpcode | Message type: Text (0x1), Binary (0x2) |
| xWsCallbacks | Struct of 3 optional callback pointers |

Callback Signatures

xWsOnOpenFunc

typedef void (*xWsOnOpenFunc)(xWsConn conn, void *arg);

Called when the WebSocket connection is established. conn is valid until on_close returns.

xWsOnMessageFunc

typedef void (*xWsOnMessageFunc)(
    xWsConn conn, xWsOpcode opcode,
    const void *payload, size_t len,
    void *arg);

Called when a complete message is received. Fragmented messages are reassembled before delivery. payload is valid only during the callback.

xWsOnCloseFunc

typedef void (*xWsOnCloseFunc)(
    xWsConn conn, uint16_t code,
    const char *reason, size_t len,
    void *arg);

Called when the connection is closed (clean or abnormal). After this callback returns, conn is invalid.

xWsCallbacks

typedef struct {
    xWsOnOpenFunc    on_open;    // optional
    xWsOnMessageFunc on_message; // optional
    xWsOnCloseFunc   on_close;   // optional
} xWsCallbacks;

Functions

| Function | Description |
|---|---|
| xWsServe | One-call WebSocket-only server |
| xWsUpgrade | Upgrade HTTP → WebSocket |
| xWsSend | Send a text or binary message |
| xWsClose | Initiate a graceful close |

xWsServe

xHttpServer xWsServe(
    xEventLoop loop,
    const char *host,
    uint16_t port,
    const xWsCallbacks *callbacks,
    void *arg);

Convenience function that creates an HTTP server, registers a catch-all route that upgrades every incoming request to WebSocket, and starts listening. Returns the server handle for later cleanup via xHttpServerDestroy(), or NULL on failure.

Parameters:

  • loop — Event loop (must not be NULL).
  • host — Bind address (e.g. "0.0.0.0"), or NULL.
  • port — Port number to listen on.
  • callbacks — WebSocket event callbacks (not NULL).
  • arg — User argument forwarded to all callbacks.

Returns: Server handle, or NULL on failure.

xWsUpgrade

xErrno xWsUpgrade(
    xHttpResponseWriter writer,
    const xHttpRequest *req,
    const xWsCallbacks *callbacks,
    void *arg);

Call inside an xHttpHandlerFunc to upgrade the HTTP connection to WebSocket. On success, the handler must return immediately — the HTTP connection has been hijacked.

On failure (bad headers, wrong method), an HTTP error response (400/405) is sent automatically and a non-Ok error code is returned.

Parameters:

  • writer — Response writer from the handler.
  • req — HTTP request from the handler.
  • callbacks — WebSocket event callbacks (not NULL).
  • arg — User argument forwarded to all callbacks.

Returns: xErrno_Ok on success.

xWsSend

xErrno xWsSend(
    xWsConn conn, xWsOpcode opcode,
    const void *payload, size_t len);

Send a message over the WebSocket connection. The payload is framed and queued for asynchronous transmission.

Parameters:

  • conn — WebSocket connection handle.
  • opcodexWsOpcode_Text or xWsOpcode_Binary.
  • payload — Message data.
  • len — Payload length in bytes.

Returns: xErrno_Ok on success, xErrno_InvalidState if the connection is closing.

xWsClose

xErrno xWsClose(xWsConn conn, uint16_t code);

Initiate a graceful close. Sends a Close frame with the given status code. The connection remains open until the peer responds or a 5-second timeout expires.

Parameters:

  • conn — WebSocket connection handle.
  • code — Close status code (e.g., 1000 for normal).

Returns: xErrno_Ok on success.

Close Status Codes

| Code | Constant | Meaning |
|---|---|---|
| 1000 | XWS_CLOSE_NORMAL | Normal closure |
| 1001 | XWS_CLOSE_GOING_AWAY | Server shutting down |
| 1002 | XWS_CLOSE_PROTOCOL_ERR | Protocol error |
| 1003 | XWS_CLOSE_UNSUPPORTED | Unsupported data |
| 1005 | XWS_CLOSE_NO_STATUS | No status received |
| 1006 | XWS_CLOSE_ABNORMAL | Abnormal closure |

Usage Examples

Echo Server (with xWsServe)

#include <xbase/event.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>

static void on_open(xWsConn conn, void *arg) {
    (void)arg;
    const char *hi = "Welcome!";
    xWsSend(conn, xWsOpcode_Text, hi, strlen(hi));
}

static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    (void)arg;
    xWsSend(conn, op, data, len);
}

static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
    (void)conn; (void)reason; (void)len; (void)arg;
    printf("closed: %u\n", code);
}

static const xWsCallbacks ws_cbs = {
    .on_open    = on_open,
    .on_message = on_message,
    .on_close   = on_close,
};

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xHttpServer srv = xWsServe(loop, "0.0.0.0", 8080, &ws_cbs, NULL);
    if (!srv) return 1;

    printf("ws://localhost:8080/\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(srv);
    xEventLoopDestroy(loop);
    return 0;
}

Echo Server (with xWsUpgrade)

#include <xbase/event.h>
#include <xhttp/server.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>

static const xWsCallbacks ws_cbs = { ... }; /* same on_open/on_message/on_close as in the previous example */

static void ws_handler(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)arg;
    xWsUpgrade(w, req, &ws_cbs, NULL);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer srv = xHttpServerCreate(loop);

    xHttpServerRoute(srv, "GET /ws", ws_handler, NULL);
    xHttpServerListen(srv, "0.0.0.0", 8080);

    printf("ws://localhost:8080/ws\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(srv);
    xEventLoopDestroy(loop);
    return 0;
}

Per-Connection User Data

#include <stdio.h>
#include <stdlib.h>
#include <xhttp/server.h>
#include <xhttp/ws.h>

typedef struct {
    char username[64];
    int  msg_count;
} Session;

static void on_open(xWsConn conn, void *arg) {
    Session *s = (Session *)arg;
    snprintf(s->username, sizeof(s->username), "user_%p", (void *)conn);
    s->msg_count = 0;
}

static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    Session *s = (Session *)arg;
    s->msg_count++;
    printf("[%s] msg #%d: %.*s\n", s->username, s->msg_count, (int)len, (const char *)data);
    xWsSend(conn, op, data, len);
}

static void on_close_free_session(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
    (void)conn; (void)code; (void)reason; (void)len;
    free(arg); /* release the per-connection Session */
}

static void ws_handler(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)arg;
    Session *s = calloc(1, sizeof(Session));
    if (!s) return; /* allocation failed: no upgrade */
    xWsCallbacks cbs = {
        .on_open    = on_open,
        .on_message = on_message,
        .on_close   = on_close_free_session,
    };
    if (xWsUpgrade(w, req, &cbs, s) != xErrno_Ok)
        free(s); /* upgrade failed: on_close will never fire */
}

Graceful Server-Initiated Close

static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    (void)arg;
    if (len == 4 && memcmp(data, "quit", 4) == 0) {
        xWsClose(conn, 1000); // normal close
        return;
    }
    xWsSend(conn, op, data, len);
}

JavaScript Client

<script>
const ws = new WebSocket('ws://localhost:8080/ws');

ws.onopen = () => console.log('connected');

ws.onmessage = (e) => console.log('< ' + e.data);

ws.onclose = (e) =>
    console.log('closed: ' + e.code);

// Send a message
ws.send('Hello, server!');
</script>

Best Practices

  • Return immediately after xWsUpgrade(). On success, the HTTP connection is hijacked. Do not call any xHttpResponse* functions afterward.
  • Don't block in callbacks. All callbacks run on the event loop thread. Blocking delays all other I/O.
  • Copy payload if needed. The payload pointer in on_message is valid only during the callback. Copy the data if you need it later.
  • Use xWsClose() for graceful shutdown. Avoid dropping connections without a Close handshake.
  • Handle on_close for cleanup. Free per-connection resources in on_close, as the xWsConn handle becomes invalid after the callback returns.
  • Idle timeout is inherited. The WebSocket connection inherits the HTTP server's idle_timeout_ms setting. Adjust it via xHttpServerSetIdleTimeout() if needed.

Comparison with Other Libraries

| Feature | xhttp WS | libwebsockets | uWebSockets |
|---|---|---|---|
| Integration | xEventLoop | Own loop | Own loop |
| Upgrade | In HTTP handler | Separate | Separate |
| Fragment reassembly | Automatic | Automatic | Automatic |
| Ping/Pong | Automatic | Automatic | Automatic |
| Close handshake | RFC 6455 | RFC 6455 | RFC 6455 |
| TLS | Via xhttp | Built-in | Built-in |
| Language | C99 | C | C++ |
| Dependencies | xbase only | OpenSSL | None |

Key Differentiator: xhttp's WebSocket server is unique in its handler-initiated upgrade pattern. Instead of a separate WebSocket server, you register a normal HTTP route and call xWsUpgrade() inside the handler. This keeps routing, middleware, and mixed HTTP+WS endpoints unified under a single server instance.

ws.h — WebSocket Client

Introduction

ws.h provides xWsConnect(), an asynchronous WebSocket client that integrates with xbase's event loop. The entire connection process — DNS resolution, TCP connect, optional TLS handshake, and HTTP Upgrade — runs fully asynchronously. Once connected, the same callback-driven model (on_open, on_message, on_close) and the same xWsConn handle are used for both client and server connections.

Design Philosophy

  1. Fully Asynchronous Connection — xWsConnect() returns immediately. The multi-phase connection process (DNS → TCP → TLS → HTTP Upgrade) is driven entirely by the event loop. No threads or blocking calls.

  2. Shared Connection Model — Once the handshake completes, a client xWsConn is identical to a server xWsConn. The same xWsSend(), xWsClose(), and callback interfaces apply. Code that operates on xWsConn doesn't need to know which side initiated the connection.

  3. Failure via on_close — If the connection fails at any stage (DNS, TCP, TLS, or HTTP Upgrade), on_close is invoked with an error code. on_open is never called for failed connections. This simplifies error handling: cleanup always happens in one place.

  4. Client-Side Masking — Per RFC 6455, client-to-server frames must be masked. The library handles this automatically when the connection is created in client mode.

Architecture

graph TD
    subgraph "Application"
        APP["User Code"]
        CBS["xWsCallbacks"]
        CONF["xWsConnectConf"]
    end

    subgraph "xWsConnect State Machine"
        CONNECT["xWsConnect()"]
        DNS["DNS Resolution"]
        TCP["TCP Connect"]
        TLS["TLS Handshake<br/>(wss:// only)"]
        UPGRADE["HTTP Upgrade<br/>Request/Response"]
        VALIDATE["Validate 101<br/>+ Sec-WebSocket-Accept"]
    end

    subgraph "Established Connection"
        WSCONN["xWsConn<br/>(client mode)"]
        SEND["xWsSend()"]
        CLOSE["xWsClose()"]
    end

    subgraph "xbase"
        LOOP["xEventLoop"]
        SOCK["xSocket"]
        TIMER["Timeout Timer"]
    end

    APP --> CONF
    APP --> CBS
    CONF --> CONNECT
    CBS --> CONNECT
    CONNECT --> DNS
    DNS --> TCP
    TCP --> TLS
    TLS --> UPGRADE
    UPGRADE --> VALIDATE
    VALIDATE -->|"Success"| WSCONN
    VALIDATE -->|"Failure"| CBS

    WSCONN --> SEND
    WSCONN --> CLOSE
    WSCONN --> SOCK
    SOCK --> LOOP
    TIMER --> LOOP

    style WSCONN fill:#4a90d9,color:#fff
    style LOOP fill:#50b86c,color:#fff
    style CONNECT fill:#f5a623,color:#fff
    style VALIDATE fill:#9b59b6,color:#fff

Implementation Details

Connection State Machine

The xWsConnector drives the connection through five phases, all on the event loop thread:

stateDiagram-v2
    [*] --> DNS: xWsConnect() called
    DNS --> TCP_CONNECT: Address resolved
    TCP_CONNECT --> TLS_HANDSHAKE: Connected [wss]
    TCP_CONNECT --> HTTP_UPGRADE_WRITE: Connected [ws]
    TLS_HANDSHAKE --> HTTP_UPGRADE_WRITE: Handshake complete
    HTTP_UPGRADE_WRITE --> HTTP_UPGRADE_READ: Request sent
    HTTP_UPGRADE_READ --> DONE: 101 validated
    DONE --> [*]: on_open fires

    DNS --> [*]: Failure → on_close
    TCP_CONNECT --> [*]: Failure → on_close
    TLS_HANDSHAKE --> [*]: Failure → on_close
    HTTP_UPGRADE_READ --> [*]: Bad response → on_close
    DNS --> [*]: Timeout → on_close
    TCP_CONNECT --> [*]: Timeout → on_close

Phase Details

| Phase | What Happens |
|---|---|
| DNS | xDnsResolve() resolves the hostname asynchronously. On success, proceeds to TCP. |
| TCP Connect | Creates an xSocket, calls connect(). Waits for the writable event (EINPROGRESS). |
| TLS Handshake | For wss:// URLs only. Initializes the TLS transport and drives the handshake via read/write events. |
| HTTP Upgrade Write | Builds the Upgrade request (with a random Sec-WebSocket-Key) and flushes it to the server. |
| HTTP Upgrade Read | Reads the server's response, validates HTTP/1.1 101, Upgrade: websocket, Connection: Upgrade, and Sec-WebSocket-Accept. |

Handshake Flow

sequenceDiagram
    participant App as Application
    participant Conn as xWsConnector
    participant DNS as xDnsResolve
    participant Server as Remote Server

    App->>Conn: xWsConnect(loop, conf, cbs, arg)
    Conn->>DNS: Resolve hostname
    DNS-->>Conn: Address resolved
    Conn->>Server: TCP connect()
    Server-->>Conn: Connected
    Note over Conn,Server: (wss:// only) TLS handshake
    Conn->>Server: GET /path HTTP/1.1<br/>Upgrade: websocket<br/>Sec-WebSocket-Key: ...
    Server-->>Conn: HTTP/1.1 101 Switching Protocols<br/>Sec-WebSocket-Accept: ...
    Conn->>Conn: Validate response
    Conn->>App: on_open(conn, arg)

Timeout Handling

A configurable timeout (default 10 seconds) covers the entire connection process. If any phase takes too long, the timer fires, the connector is destroyed, and on_close is invoked with code 1006 (Abnormal Closure).

Internal File Structure

| File | Role |
|---|---|
| ws.h | Public API (xWsConnect, xWsConnectConf) |
| ws_connect.c | Async connection state machine |
| ws_handshake_client.h/c | Build Upgrade request, validate 101 response |
| ws_crypto.h | SHA-1 + Base64 for Sec-WebSocket-Accept |
| transport_tls_client.h | TLS client transport init (shared xTlsCtx → per-connection SSL) |
| transport_tls_client_openssl.c | OpenSSL TLS client transport implementation |
| transport_tls_client_mbedtls.c | mbedTLS TLS client transport implementation |

API Reference

Types

| Type | Description |
|---|---|
| xWsConn | Opaque WebSocket connection handle (shared with server) |
| xWsOpcode | Message type: Text (0x1), Binary (0x2) |
| xWsCallbacks | Struct of 3 optional callback pointers (shared with server) |
| xWsConnectConf | Configuration for xWsConnect() |

xWsConnectConf

struct xWsConnectConf {
    const char *url;              // ws:// or wss:// URL (required)
    const xTlsConf *tls;         // TLS config for wss:// (NULL = defaults)
    xTlsCtx tls_ctx;             // Pre-created shared TLS context (priority over tls)
    const char *headers;          // Extra HTTP headers (NULL = none)
    int timeout_ms;               // Connect timeout (0 = 10000 ms)
};
| Field | Description |
|---|---|
| url | WebSocket URL. Must start with ws:// or wss://. Required. |
| tls | TLS configuration for wss:// connections. NULL uses system CA with verification enabled. Ignored for ws://. Ignored when tls_ctx is set. |
| tls_ctx | Pre-created shared TLS context from xTlsCtxCreate(). Takes priority over tls. The caller retains ownership and must keep it alive for the lifetime of the connection. NULL = create from tls (or use defaults). |
| headers | Extra HTTP headers appended to the Upgrade request. Format: "Key: Value\r\nKey2: Value2\r\n". NULL for none. |
| timeout_ms | Timeout for the entire connection process in milliseconds. 0 uses the default (10000 ms). |

Callbacks

The same xWsCallbacks struct is used for both client and server connections. See WebSocket Server for callback signature details.

Client-specific behavior:

  • on_open — Called when the connection is fully established (101 validated). Not called on failure.
  • on_close — Called on connection failure (DNS, TCP, TLS, or Upgrade error) or after a normal close. For failed connections, conn is NULL.

Functions

xWsConnect

xErrno xWsConnect(
    xEventLoop loop,
    const xWsConnectConf *conf,
    const xWsCallbacks *callbacks,
    void *arg);

Initiate an asynchronous WebSocket client connection. Returns immediately; the connection process runs on the event loop.

Parameters:

  • loop — Event loop (must not be NULL).
  • conf — Connection configuration (must not be NULL, conf->url required).
  • callbacks — WebSocket event callbacks (must not be NULL).
  • arg — User argument forwarded to all callbacks.

Returns: xErrno_Ok if the async connection started, xErrno_InvalidArg for bad parameters (NULL pointers, invalid URL scheme).

xWsSend

xErrno xWsSend(
    xWsConn conn, xWsOpcode opcode,
    const void *payload, size_t len);

Send a message. Identical to the server-side API. Client frames are automatically masked per RFC 6455.

xWsClose

xErrno xWsClose(xWsConn conn, uint16_t code);

Initiate a graceful close. Identical to the server-side API.

Usage Examples

Connect and Echo

#include <xbase/event.h>
#include <xhttp/ws.h>
#include <stdio.h>
#include <string.h>

static void on_open(xWsConn conn, void *arg) {
    (void)arg;
    const char *msg = "Hello, server!";
    xWsSend(conn, xWsOpcode_Text, msg, strlen(msg));
}

static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) {
    (void)op; (void)arg;
    printf("Received: %.*s\n", (int)len, (const char *)data);
    xWsClose(conn, 1000); // got our echo; close gracefully
}

static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
    (void)conn; (void)reason; (void)len; (void)arg;
    printf("Closed: %u\n", code);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xWsConnectConf conf = {0};
    conf.url = "ws://localhost:8080/ws";

    xWsCallbacks cbs = {
        .on_open    = on_open,
        .on_message = on_message,
        .on_close   = on_close,
    };

    xWsConnect(loop, &conf, &cbs, NULL);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

Secure Connection (wss://)

#include <xbase/event.h>
#include <xhttp/ws.h>
#include <xnet/tls.h>

static void on_open(xWsConn conn, void *arg) { /* ... */ }
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) { /* ... */ }
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) { /* ... */ }

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Skip certificate verification (dev only)
    xTlsConf tls = {0};
    tls.skip_verify = 1;

    xWsConnectConf conf = {0};
    conf.url = "wss://echo.example.com/ws";
    conf.tls = &tls;
    conf.timeout_ms = 5000;

    xWsCallbacks cbs = {
        .on_open    = on_open,
        .on_message = on_message,
        .on_close   = on_close,
    };

    xWsConnect(loop, &conf, &cbs, NULL);

    xEventLoopRun(loop);
    xEventLoopDestroy(loop);
    return 0;
}

Shared TLS Context (Multiple Connections)

When creating many wss:// connections (e.g. reconnect loops or connection pools), use a shared xTlsCtx to avoid reloading certificates on every connection:

#include <xbase/event.h>
#include <xhttp/ws.h>
#include <xnet/tls.h>

static void on_open(xWsConn conn, void *arg) { /* ... */ }
static void on_message(xWsConn conn, xWsOpcode op, const void *data, size_t len, void *arg) { /* ... */ }
static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) { /* ... */ }

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    // Create a shared TLS context once
    xTlsConf tls = {0};
    tls.ca = "ca.pem";
    xTlsCtx ctx = xTlsCtxCreate(&tls);

    // All connections share the same ctx
    xWsConnectConf conf = {0};
    conf.url = "wss://echo.example.com/ws";
    conf.tls_ctx = ctx;  // shared, not copied

    xWsCallbacks cbs = {
        .on_open    = on_open,
        .on_message = on_message,
        .on_close   = on_close,
    };

    xWsConnect(loop, &conf, &cbs, NULL);

    xEventLoopRun(loop);

    // Destroy ctx after all connections are closed
    xTlsCtxDestroy(ctx);
    xEventLoopDestroy(loop);
    return 0;
}

Custom Headers (Authentication)

xWsConnectConf conf = {0};
conf.url = "ws://api.example.com/stream";
conf.headers = "Authorization: Bearer token123\r\n"
               "X-Client-Version: 1.0\r\n";

xWsConnect(loop, &conf, &cbs, NULL);

Connection Failure Handling

static void on_close(xWsConn conn, uint16_t code, const char *reason, size_t len, void *arg) {
    (void)reason; (void)len; (void)arg;
    if (conn == NULL) {
        // Connection failed before establishing WebSocket
        printf("Connection failed (code %u)\n", code);
        // Optionally retry after a delay
        return;
    }
    // Normal close after successful connection
    printf("Disconnected: %u\n", code);
}

Binary Data

static void on_open(xWsConn conn, void *arg) {
    (void)arg;
    uint8_t data[] = {0x00, 0x01, 0x02, 0xFF, 0xFE};
    xWsSend(conn, xWsOpcode_Binary, data, sizeof(data));
}

Best Practices

  • Check the return value of xWsConnect(). It returns xErrno_InvalidArg for obviously bad parameters (NULL pointers, unsupported URL scheme). Network errors are reported asynchronously via on_close.
  • Handle conn == NULL in on_close. This indicates a connection failure before the WebSocket was established. Use this to implement retry logic.
  • Don't block in callbacks. All callbacks run on the event loop thread.
  • Copy payload if needed. The payload pointer in on_message is valid only during the callback.
  • Use xWsClose() for graceful shutdown. The client sends a Close frame and waits for the server's response.
  • Set a reasonable timeout. The default 10-second timeout covers DNS + TCP + TLS + Upgrade. Adjust via conf.timeout_ms for high-latency networks.
  • Never use skip_verify in production. It disables all certificate validation. Use a proper CA path or system CA bundle instead.

Comparison with Other Libraries

| Feature | xhttp WS Client | libwebsockets | wslay | civetweb |
|---|---|---|---|---|
| I/O Model | Async (event loop) | Async (own loop) | Sync (user drives) | Threaded |
| Event Loop | xEventLoop | Own loop | None | pthreads |
| DNS | Async (xDnsResolve) | Async (built-in) | Manual | Blocking |
| TLS | Via xnet | Built-in | Manual | Built-in |
| Client Masking | Automatic | Automatic | Automatic | Automatic |
| Connection Timeout | Configurable | Configurable | Manual | Configurable |
| Language | C99 | C | C | C |
| Dependencies | xbase + xnet | OpenSSL | None | None |

Key Differentiator: xhttp's WebSocket client runs entirely on the xbase event loop with zero blocking calls. The multi-phase connection (DNS → TCP → TLS → Upgrade) is a single async state machine. Combined with the shared xWsConn model, client and server code use identical APIs for sending, receiving, and closing — making bidirectional WebSocket applications straightforward.

TLS Context Sharing: For wss:// connections, the client supports a shared xTlsCtx (via conf.tls_ctx) that avoids reloading certificates and re-creating the SSL context on every connection. This is the same pattern used by xTcpConnect and xTcpListener, providing consistent TLS context management across all xKit networking APIs.

sse.c — SSE Stream Client

Introduction

sse.c implements Server-Sent Events (SSE) support for xHttpClient. It provides xHttpClientGetSse() and xHttpClientDoSse() which subscribe to SSE endpoints and parse the event stream according to the W3C SSE specification. Each parsed event is delivered to a callback as it arrives, enabling real-time streaming — ideal for LLM API integration.

Design Philosophy

  1. W3C Spec Compliance — The parser follows the W3C Server-Sent Events specification: field parsing (event, data, id, retry), comment handling, multi-line data joining with \n, and default event type "message".

  2. Streaming Parse — Data is parsed incrementally as it arrives from libcurl's write callback. Complete lines are processed immediately; incomplete lines are buffered until more data arrives.

  3. Shared Infrastructure — SSE requests reuse the same curl_multi handle and event loop integration as regular HTTP requests. The xHttpReqVtable mechanism allows SSE to plug in its own write callback and completion handler.

  4. User-Controlled Cancellation — The xSseEventFunc callback returns an int: 0 to continue, non-zero to close the connection. This gives the user fine-grained control over when to stop streaming.

Architecture

graph TD
    subgraph "SSE Request Flow"
        SUBMIT["xHttpClientDoSse()"]
        EASY["curl_easy + SSE headers"]
        WRITE["sse_write_callback"]
        PARSER["xSseParser_"]
        EVENT["on_event(ev)"]
        DONE["on_done(curl_code)"]
    end

    subgraph "Shared with Oneshot"
        MULTI["curl_multi"]
        LOOP["xEventLoop"]
        CHECK["check_multi_info()"]
    end

    SUBMIT --> EASY
    EASY --> MULTI
    MULTI --> LOOP
    LOOP -->|"fd ready"| WRITE
    WRITE --> PARSER
    PARSER -->|"event boundary"| EVENT
    CHECK -->|"transfer done"| DONE

    style PARSER fill:#4a90d9,color:#fff
    style EVENT fill:#50b86c,color:#fff

Implementation Details

SSE Parser State Machine

stateDiagram-v2
    [*] --> Buffering: Data arrives from curl
    Buffering --> ParseLine: Complete line found (\\n or \\r\\n)
    ParseLine --> FieldParse: Non-empty line
    ParseLine --> DispatchEvent: Empty line (event boundary)
    FieldParse --> Buffering: Continue parsing
    DispatchEvent --> CallUser: data field exists
    DispatchEvent --> Buffering: No data (skip)
    CallUser --> Buffering: User returns 0 (continue)
    CallUser --> [*]: User returns non-zero (close)

SSE Field Parsing

Each non-empty line is parsed as a field:

| Line Format | Field | Value |
|---|---|---|
| :comment | (ignored) | |
| event:type | event_type | "type" |
| data:payload | data | "payload" (accumulated with \n) |
| id:123 | id | "123" (persists across events) |
| retry:5000 | retry | 5000 (ms, must be all digits) |
| unknown:foo | (ignored) | |

Multi-line data: Multiple data: lines are joined with \n:

data:line1
data:line2
data:line3

→ ev.data = "line1\nline2\nline3"

Parser Internal Structure

struct xSseParser_ {
    xBuffer  buf;          // Raw incoming data buffer
    size_t   pos;          // Parse position within buf
    int      error;        // Allocation failure flag

    char *event_type;      // Current event type (NULL = "message")
    char *data;            // Accumulated data lines
    char *id;              // Last event ID (persists across events)
    int   retry;           // Retry delay in ms (-1 = not set)
};

Data Flow

sequenceDiagram
    participant Server as SSE Server
    participant Curl as libcurl
    participant Writer as sse_write_callback
    participant Parser as xSseParser_
    participant User as User Callback

    Server->>Curl: HTTP 200 text/event-stream
    loop For each chunk
        Curl->>Writer: sse_write_callback(chunk)
        Writer->>Parser: sse_parser_feed(chunk)
        Parser->>Parser: Buffer + parse lines
        alt Empty line (event boundary)
            Parser->>User: on_event(ev)
            alt User returns 0
                User->>Parser: Continue
            else User returns non-zero
                User->>Writer: Close connection
                Writer->>Curl: Return 0 (abort)
            end
        end
    end
    Curl->>User: on_done(curl_code)

SSE Request Structure

struct xSseReq_ {
    struct xHttpReq_   base;        // Base request (shared with oneshot)
    xSseEventFunc      on_event;    // Per-event callback
    xSseDoneFunc       on_done;     // Stream-end callback
    struct xSseParser_ parser;      // SSE parser state
    struct curl_slist  *sse_headers; // Accept: text/event-stream + user headers
};

The SSE request uses a dedicated vtable:

  • sse_on_done — Invokes the user's on_done callback.
  • sse_on_cleanup — Frees SSE-specific resources (parser, headers).

Automatic Headers

xHttpClientDoSse() automatically adds:

  • Accept: text/event-stream
  • Cache-Control: no-cache

User-provided headers are merged after these defaults.

API Reference

Types

| Type | Description |
|---|---|
| xSseEvent | SSE event: event (type), data, id, retry |
| xSseEventFunc | int (*)(const xSseEvent *ev, void *arg) — return 0 to continue, non-zero to close |
| xSseDoneFunc | void (*)(int curl_code, void *arg) — called when stream ends |

xSseEvent Fields

| Field | Type | Description |
|---|---|---|
| event | const char * | Event type. "message" if omitted by server. |
| data | const char * | Event data. Multi-line data joined by \n. |
| id | const char * | Last event ID, or NULL. |
| retry | int | Retry delay in ms, or -1 if not set. |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xHttpClientGetSse | xErrno xHttpClientGetSse(xHttpClient client, const char *url, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Subscribe to an SSE endpoint (GET). | Not thread-safe |
| xHttpClientDoSse | xErrno xHttpClientDoSse(xHttpClient client, const xHttpRequestConf *config, xSseEventFunc on_event, xSseDoneFunc on_done, void *arg) | Fully-configured SSE request. | Not thread-safe |

Usage Examples

Simple SSE Subscription

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static int on_event(const xSseEvent *ev, void *arg) {
    (void)arg;
    printf("[%s] %s\n", ev->event, ev->data);
    return 0; // Continue receiving
}

static void on_done(int curl_code, void *arg) {
    (void)arg;
    printf("Stream ended (code=%d)\n", curl_code);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);

    xHttpClientGetSse(client, "https://example.com/events",
                      on_event, on_done, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

LLM API Streaming (OpenAI-Compatible)

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static int on_event(const xSseEvent *ev, void *arg) {
    (void)arg;

    // OpenAI sends "[DONE]" as the final data
    if (strcmp(ev->data, "[DONE]") == 0) {
        printf("\n--- Stream complete ---\n");
        return 1; // Close connection
    }

    // Parse JSON and extract content delta...
    printf("%s", ev->data);
    fflush(stdout);
    return 0;
}

static void on_done(int curl_code, void *arg) {
    (void)arg;
    if (curl_code != 0)
        printf("\nStream error (code=%d)\n", curl_code);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpClient client = xHttpClientCreate(loop, NULL);

    const char *body =
        "{"
        "  \"model\": \"gpt-4\","
        "  \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}],"
        "  \"stream\": true"
        "}";

    const char *headers[] = {
        "Content-Type: application/json",
        "Authorization: Bearer sk-your-api-key",
        NULL
    };

    xHttpRequestConf config = {
        .url       = "https://api.openai.com/v1/chat/completions",
        .method    = xHttpMethod_POST,
        .body      = body,
        .body_len  = strlen(body),
        .headers   = headers,
        .timeout_ms = 60000, // 60s timeout for streaming
    };

    xHttpClientDoSse(client, &config, on_event, on_done, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

Early Cancellation

static int on_event(const xSseEvent *ev, void *arg) {
    int *count = (int *)arg;
    (*count)++;

    printf("Event #%d: %s\n", *count, ev->data);

    // Stop after 10 events
    if (*count >= 10) {
        printf("Received enough events, closing.\n");
        return 1; // Non-zero = close connection
    }
    return 0;
}

Use Cases

  1. LLM API Integration — Stream responses from OpenAI, Anthropic, Google Gemini, or any OpenAI-compatible API. Use xHttpClientDoSse() with POST method and JSON body.

  2. Real-Time Notifications — Subscribe to server push notifications (chat messages, stock prices, IoT sensor data) via SSE endpoints.

  3. Log Streaming — Tail remote log streams delivered as SSE events.

Best Practices

  • Use xHttpClientDoSse() for LLM APIs. Most LLM APIs require POST with a JSON body and custom headers. GetSse is only for simple GET endpoints.
  • Handle [DONE] signals. Many LLM APIs send a special [DONE] data payload to signal the end of the stream. Return non-zero from on_event to close cleanly.
  • Set appropriate timeouts. Streaming responses can take a long time. Set timeout_ms high enough (e.g., 60000ms) to avoid premature timeouts.
  • Don't block in on_event. The callback runs on the event loop thread. Blocking delays all other I/O.
  • Copy event data if needed. xSseEvent pointers are valid only during the callback.

Comparison with Other Libraries

| Feature | xhttp SSE | eventsource (JS) | sseclient-py | libcurl (manual) |
|---|---|---|---|---|
| Spec Compliance | W3C SSE | W3C SSE | W3C SSE | Manual parsing |
| Integration | xEventLoop (async) | Browser event loop | Blocking iterator | Manual |
| POST Support | Yes (DoSse) | No (GET only) | No (GET only) | Manual |
| Cancellation | Callback return value | close() | Break loop | curl_easy_pause |
| Multi-line Data | Auto-joined with \n | Auto-joined | Auto-joined | Manual |
| Language | C99 | JavaScript | Python | C |

Key Differentiator: xhttp's SSE implementation is unique in supporting POST-based SSE (via xHttpClientDoSse), which is essential for LLM API integration. Most SSE libraries only support GET. The incremental parser integrates seamlessly with the event loop, delivering events as they arrive without buffering the entire stream.

TLS Deployment Guide

This guide covers end-to-end TLS deployment for xhttp, including certificate generation, server and client configuration, and mutual TLS (mTLS). For API reference, see server.md and client.md.

Prerequisites

  • OpenSSL CLI — Used for certificate generation (openssl command).
  • TLS backend compiled — xKit must be built with XK_TLS_BACKEND=openssl (or mbedtls). Without a TLS backend, xHttpServerListenTls() returns xErrno_NotSupported.

Check your build:

# If XK_HAS_OPENSSL is defined, TLS is available
grep -r "XK_HAS_OPENSSL" xhttp/

Certificate Generation

Self-Signed Certificate (Development)

For quick local development and testing:

openssl req -x509 -newkey rsa:2048 \
  -keyout server-key.pem \
  -out server.pem \
  -days 365 -nodes \
  -subj '/CN=localhost'

This produces:

  • server.pem — Self-signed certificate
  • server-key.pem — Unencrypted private key

Note: Self-signed certificates are not trusted by default. Clients must either set skip_verify = 1 or provide the certificate as a CA via ca.

CA-Signed Certificates (Production / mTLS)

For mutual TLS or production-like setups, create a private CA and sign both server and client certificates.

Step 1: Create a CA

# Generate CA private key and self-signed certificate
openssl req -x509 -newkey rsa:2048 \
  -keyout ca-key.pem \
  -out ca.pem \
  -days 365 -nodes \
  -subj '/CN=MyCA'

Step 2: Generate Server Certificate

# Generate server key + CSR
openssl req -newkey rsa:2048 \
  -keyout server-key.pem \
  -out server.csr \
  -nodes \
  -subj '/CN=localhost'

# Sign with CA
openssl x509 -req \
  -in server.csr \
  -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out server.pem \
  -days 365

# Clean up CSR
rm server.csr

Step 3: Generate Client Certificate (for mTLS)

# Generate client key + CSR
openssl req -newkey rsa:2048 \
  -keyout client-key.pem \
  -out client.csr \
  -nodes \
  -subj '/CN=MyClient'

# Sign with the same CA
openssl x509 -req \
  -in client.csr \
  -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out client.pem \
  -days 365

# Clean up CSR
rm client.csr

After these steps you have:

| File | Description |
|---|---|
| ca.pem | CA certificate (trusted by both sides) |
| ca-key.pem | CA private key (keep secure, not deployed) |
| server.pem | Server certificate (signed by CA) |
| server-key.pem | Server private key |
| client.pem | Client certificate (signed by CA) |
| client-key.pem | Client private key |

Deployment Scenarios

1. One-Way TLS (Server Authentication Only)

The most common setup: the client verifies the server's identity, but the server does not verify the client.

sequenceDiagram
    participant Client
    participant Server

    Client->>Server: TLS ClientHello
    Server->>Client: Certificate (server.pem)
    Client->>Client: Verify server cert against CA
    Client->>Server: Finished
    Server->>Client: Finished
    Note over Client,Server: Encrypted HTTP traffic

Server:

xTlsConf tls = {
    .cert = "server.pem",
    .key  = "server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

Client (with CA verification):

xTlsConf tls = {0};
tls.ca = "ca.pem";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
    xHttpClientCreate(loop, &conf);

xHttpClientGet(
    client,
    "https://localhost:8443/hello",
    on_response, NULL);

Client (skip verification — development only):

xTlsConf tls = {0};
tls.skip_verify = 1;
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
    xHttpClientCreate(loop, &conf);

2. Mutual TLS (mTLS)

Both sides authenticate each other. The server requires a valid client certificate signed by a trusted CA.

sequenceDiagram
    participant Client
    participant Server

    Client->>Server: TLS ClientHello
    Server->>Client: Certificate (server.pem) + CertificateRequest
    Client->>Client: Verify server cert against CA
    Client->>Server: Certificate (client.pem)
    Server->>Server: Verify client cert against CA
    Client->>Server: Finished
    Server->>Client: Finished
    Note over Client,Server: Mutually authenticated encrypted traffic

Server:

xTlsConf tls = {
    .cert     = "server.pem",
    .key      = "server-key.pem",
    .ca       = "ca.pem",       // CA to verify client certs
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

Client:

xTlsConf tls = {0};
tls.ca   = "ca.pem";
tls.cert = "client.pem";
tls.key  = "client-key.pem";
xHttpClientConf conf = {.tls = &tls};
xHttpClient client =
    xHttpClientCreate(loop, &conf);

xHttpClientGet(
    client,
    "https://localhost:8443/secure",
    on_response, NULL);

3. HTTP + HTTPS on Different Ports

A single xHttpServer can serve both cleartext HTTP and HTTPS simultaneously:

// HTTP on port 8080
xHttpServerListen(server, "0.0.0.0", 8080);

// HTTPS on port 8443
xTlsConf tls = {
    .cert = "server.pem",
    .key  = "server-key.pem",
};
xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

Routes are shared — the same handlers serve both HTTP and HTTPS traffic.

Complete End-to-End Example

A full working example: CA-signed mTLS with server and client.

Generate Certificates

#!/bin/bash
set -e

# CA
openssl req -x509 -newkey rsa:2048 \
  -keyout ca-key.pem -out ca.pem \
  -days 365 -nodes -subj '/CN=TestCA'

# Server
openssl req -newkey rsa:2048 \
  -keyout server-key.pem -out server.csr \
  -nodes -subj '/CN=localhost'
openssl x509 -req -in server.csr \
  -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out server.pem -days 365
rm server.csr

# Client
openssl req -newkey rsa:2048 \
  -keyout client-key.pem -out client.csr \
  -nodes -subj '/CN=MyClient'
openssl x509 -req -in client.csr \
  -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -out client.pem -days 365
rm client.csr

echo "Generated: ca.pem, server.pem, server-key.pem, client.pem, client-key.pem"

Server Code

#include <stdio.h>
#include <string.h>
#include <xbase/event.h>
#include <xhttp/server.h>

static void on_secure(xHttpResponseWriter w, const xHttpRequest *req, void *arg) {
    (void)req; (void)arg;
    xHttpResponseSetHeader(w, "Content-Type", "text/plain");
    xHttpResponseSend(w, "mTLS OK!\n", 9);
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();
    xHttpServer server = xHttpServerCreate(loop);

    xHttpServerRoute(server, "GET /secure", on_secure, NULL);

    xTlsConf tls = {
        .cert     = "server.pem",
        .key      = "server-key.pem",
        .ca       = "ca.pem",
    };
    xHttpServerListenTls(server, "0.0.0.0", 8443, &tls);

    printf("mTLS server listening on :8443\n");
    xEventLoopRun(loop);

    xHttpServerDestroy(server);
    xEventLoopDestroy(loop);
    return 0;
}

Client Code

#include <stdio.h>
#include <xbase/event.h>
#include <xhttp/client.h>

static void on_response(const xHttpResponse *resp, void *arg) {
    (void)arg;
    if (resp->curl_code == 0) {
        printf("HTTP %ld: %.*s\n", resp->status_code,
               (int)resp->body_len, resp->body);
    } else {
        printf("TLS error: %s\n", resp->curl_error);
    }
}

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xTlsConf tls = {0};
    tls.ca   = "ca.pem";
    tls.cert = "client.pem";
    tls.key  = "client-key.pem";
    xHttpClientConf conf = {.tls = &tls};
    xHttpClient client =
        xHttpClientCreate(loop, &conf);

    xHttpClientGet(client, "https://localhost:8443/secure",
                   on_response, NULL);

    xEventLoopRun(loop);
    xHttpClientDestroy(client);
    xEventLoopDestroy(loop);
    return 0;
}

Verify with curl

# One-way TLS (skip verify)
curl -k https://localhost:8443/secure

# One-way TLS (with CA)
curl --cacert ca.pem https://localhost:8443/secure

# mTLS
curl --cacert ca.pem \
     --cert client.pem \
     --key client-key.pem \
     https://localhost:8443/secure

skip_verify Behavior

| Value | Behavior |
|---|---|
| 0 (default) | Peer verification enabled. Server verifies client cert (if ca is set); client verifies server cert. |
| non-zero | All peer verification disabled. Development only. |

ALPN and HTTP/2 over TLS

When TLS is enabled, ALPN (Application-Layer Protocol Negotiation) automatically selects the HTTP protocol:

  • If the client supports HTTP/2, ALPN negotiates h2 and the connection uses HTTP/2 framing.
  • Otherwise, ALPN falls back to http/1.1.

This is transparent to application code — the same routes and handlers work regardless of the negotiated protocol.

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| xErrno_NotSupported from ListenTls | No TLS backend compiled | Rebuild with XK_TLS_BACKEND=openssl |
| Client gets curl_code != 0, status_code == 0 | TLS handshake failed | Check cert paths, CA trust, and skip_verify settings |
| Self-signed cert rejected | Client verifies against system CA bundle | Set ca to the self-signed cert, or use skip_verify = 1 for dev |
| mTLS handshake fails | Client didn't provide cert, or cert not signed by server's ca | Ensure client cert is signed by the same CA specified in server's ca |
| "wrong CA path" error | ca points to non-existent file | Verify the file path exists and is readable |
| Connection works with skip_verify but not without | Server cert CN doesn't match hostname, or CA not trusted | Use ca pointing to the signing CA, ensure CN matches the hostname |

Security Best Practices

  1. Never use skip_verify in production. It disables all certificate validation, making the connection vulnerable to MITM attacks.
  2. Keep private keys secure. ca-key.pem, server-key.pem, and client-key.pem should have restricted file permissions (chmod 600).
  3. Use short-lived certificates. Set reasonable expiry (-days) and rotate certificates before they expire.
  4. For mTLS, set ca on the server side. Verification is enabled by default (skip_verify = 0), so the server will require a valid client certificate when ca is set.
  5. Don't deploy the CA private key. Only ca.pem (the public certificate) needs to be distributed. Keep ca-key.pem offline or in a secure vault.
  6. Match CN/SAN to hostname. The server certificate's Common Name (or Subject Alternative Name) should match the hostname clients use to connect.

API Quick Reference

Server Side

| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xHttpServerListenTls() | Start HTTPS listener with TLS config |

Client Side

| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xHttpClientConf | Struct: tls (pointer to xTlsConf), http_version |
| xHttpClientCreate() | Create client with TLS config via xHttpClientConf |

WebSocket Client Side

| Item | Description |
|---|---|
| xTlsConf | Struct: cert, key, ca, key_password, alpn, skip_verify |
| xTlsCtx | Opaque shared TLS context from xTlsCtxCreate() |
| xWsConnectConf | Struct: tls (pointer to xTlsConf), tls_ctx (shared context, priority over tls) |
| xWsConnect() | Initiate async WebSocket connection with optional TLS |

For full API details, see server.md and client.md.

xlog — Async Logging

Introduction

xlog is xKit's high-performance asynchronous logging module. It formats log entries on the calling thread and flushes them to a file (or stderr) on the event loop thread, decoupling I/O latency from application logic. Three operating modes — Timer, Notify, and Mixed — offer different trade-offs between flush latency and overhead.

Design Philosophy

  1. Async by Default — Log messages are formatted on the calling thread and enqueued via a lock-free MPSC queue. The event loop thread drains the queue and writes to disk, ensuring that logging never blocks the caller (except for Fatal level).

  2. Three Modes for Different Needs — Timer mode batches writes for throughput; Notify mode uses a pipe for low-latency delivery; Mixed mode combines both, using the timer for normal messages and the pipe for high-severity entries.

  3. Event Loop Integration — The logger is bound to an xEventLoop and uses its timer and I/O facilities. This means no dedicated logging thread — the event loop thread handles both I/O and log flushing.

  4. Thread-Local ContextxLoggerEnter() sets the current thread's logger, enabling the XLOG_*() macros and bridging xbase's internal xLog() calls to the async pipeline.

Architecture

graph TD
    subgraph "Application Threads"
        T1["Thread 1<br/>xLoggerLog()"]
        T2["Thread 2<br/>XLOG_INFO()"]
        T3["Thread 3<br/>xLog() (xbase internal)"]
    end

    subgraph "Lock-Free Queue"
        MPSC["MPSC Queue<br/>(xbase/mpsc.h)"]
    end

    subgraph "Event Loop Thread"
        TIMER["Timer Callback<br/>(periodic flush)"]
        PIPE["Pipe Callback<br/>(immediate flush)"]
        FLUSH["logger_flush_entries()"]
        WRITE["fwrite() + fflush()"]
        ROTATE["File Rotation"]
    end

    subgraph "Output"
        FILE["Log File"]
        STDERR["stderr"]
    end

    T1 -->|"format + enqueue"| MPSC
    T2 -->|"format + enqueue"| MPSC
    T3 -->|"bridge_callback"| MPSC
    MPSC --> FLUSH
    TIMER --> FLUSH
    PIPE --> FLUSH
    FLUSH --> WRITE
    WRITE --> FILE
    WRITE --> STDERR
    WRITE -->|"max_size exceeded"| ROTATE

    style MPSC fill:#f5a623,color:#fff
    style FLUSH fill:#50b86c,color:#fff

Sub-Module Overview

| File | Description | Doc |
|---|---|---|
| logger.h | Async logger API, macros, and configuration | logger.md |

Quick Start

#include <xbase/event.h>
#include <xlog/logger.h>

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xLoggerConf conf = {
        .loop             = loop,
        .path             = "app.log",
        .mode             = xLogMode_Mixed,
        .level            = xLogLevel_Info,
        .max_size         = 10 * 1024 * 1024, // 10MB
        .max_files        = 5,
        .flush_interval_ms = 100,
    };

    xLogger logger = xLoggerCreate(conf);
    xLoggerEnter(logger); // Set as thread-local logger

    XLOG_INFO("Application started, version %d.%d", 1, 0);
    XLOG_WARN("Low memory: %zu bytes remaining", (size_t)1024);

    // Run event loop (processes log flushes)
    xEventLoopRun(loop);

    xLoggerLeave();
    xLoggerDestroy(logger);
    xEventLoopDestroy(loop);
    return 0;
}

Relationship with Other Modules

  • xbase/event.h — The logger is bound to an xEventLoop for timer-driven and pipe-driven flush.
  • xbase/mpsc.h — Uses the lock-free MPSC queue to pass log entries from producer threads to the event loop thread.
  • xbase/log.hxLoggerEnter() bridges xbase's internal xLog() calls to the async logger via the thread-local callback mechanism.
  • xbase/atomic.h — Uses atomic operations for the lock-free entry freelist.

logger.h — High-Performance Async Logger

Introduction

logger.h provides xLogger, a high-performance asynchronous logger that formats log entries on the calling thread and flushes them to a file (or stderr) on the event loop thread. It supports three operating modes (Timer, Notify, Mixed), five severity levels, file rotation, synchronous flush, and seamless bridging with xbase's internal xLog() mechanism.

Design Philosophy

  1. Format on Caller, Write on Loop — Log messages are formatted (snprintf) on the calling thread into a pre-allocated entry buffer, then enqueued via the lock-free MPSC queue. The event loop thread dequeues and writes to disk. This decouples I/O latency from application logic.

  2. Three Operating Modes — Different applications have different latency/throughput requirements:

    • Timer — Periodic flush (default 100ms). Best throughput, highest latency.
    • Notify — Pipe-based immediate notification. Lowest latency, highest overhead.
    • Mixed — Timer for normal messages, pipe for Error/Fatal. Best balance.
  3. Lock-Free Entry Pool — A global Treiber stack freelist recycles log entry structs across all threads, avoiding malloc/free on the hot path.

  4. Fatal = Synchronous + Abort — Fatal-level messages bypass the async queue entirely: they are written directly to the file and followed by abort(). This ensures the fatal message is never lost.

  5. xbase BridgexLoggerEnter() registers a callback with xbase's xLogSetCallback(), routing all internal xKit error messages through the async logger.

Architecture

graph TD
    subgraph "xLogger Internal"
        MPSC["MPSC Queue<br/>(head, tail)"]
        TIMER["xEventLoopTimer<br/>(periodic flush)"]
        PIPE["Pipe<br/>(notify flush)"]
        FLUSH_PIPE["Flush Request Pipe<br/>(sync flush)"]
        FREELIST["Entry Freelist<br/>(Treiber stack)"]
        FP["FILE *fp<br/>(log file or stderr)"]
    end

    subgraph "xbase Dependencies"
        EVENT["xEventLoop"]
        MPSC_LIB["xbase/mpsc.h"]
        ATOMIC_LIB["xbase/atomic.h"]
        LOG_LIB["xbase/log.h"]
    end

    TIMER --> EVENT
    PIPE --> EVENT
    FLUSH_PIPE --> EVENT
    MPSC --> MPSC_LIB
    FREELIST --> ATOMIC_LIB

    style MPSC fill:#f5a623,color:#fff
    style FREELIST fill:#4a90d9,color:#fff

Implementation Details

Three Operating Modes

graph LR
    subgraph "Timer Mode"
        T_ENQUEUE["Enqueue"] --> T_TIMER["Timer fires<br/>(every 100ms)"]
        T_TIMER --> T_FLUSH["Flush all entries"]
    end

    subgraph "Notify Mode"
        N_ENQUEUE["Enqueue"] --> N_PIPE["Write 1 byte to pipe"]
        N_PIPE --> N_LOOP["Pipe readable event"]
        N_LOOP --> N_FLUSH["Flush all entries"]
    end

    subgraph "Mixed Mode"
        M_ENQUEUE["Enqueue"]
        M_ENQUEUE -->|"Debug/Info/Warn"| M_TIMER["Timer fires"]
        M_ENQUEUE -->|"Error/Fatal"| M_PIPE["Write to pipe"]
        M_TIMER --> M_FLUSH["Flush all entries"]
        M_PIPE --> M_FLUSH
    end

    style T_FLUSH fill:#50b86c,color:#fff
    style N_FLUSH fill:#50b86c,color:#fff
    style M_FLUSH fill:#50b86c,color:#fff

| Mode | Flush Trigger | Latency | Overhead | Best For |
|---|---|---|---|---|
| Timer | Periodic timer (default 100ms) | Up to flush_interval_ms | Lowest (no per-message syscall) | High-throughput logging |
| Notify | Pipe write per message | ~Immediate | Highest (1 write() per message) | Low-latency debugging |
| Mixed | Timer + pipe for Error/Fatal | Low for errors, batched for info | Moderate | Production applications |

Log Entry Lifecycle

sequenceDiagram
    participant App as Application Thread
    participant Pool as Entry Freelist
    participant Queue as MPSC Queue
    participant L as Event Loop Thread
    participant File as Log File

    App->>Pool: entry_alloc()
    Pool-->>App: "xLogEntry_ (recycled or malloc'd)"
    App->>App: "snprintf(entry->buf, timestamp + level + message)"
    App->>Queue: xMpscPush(entry)
    Note over App: "Optional: write(pipe_wfd, 1) for Notify/Mixed"

    L->>Queue: "xMpscPop() (timer or pipe callback)"
    Queue-->>L: xLogEntry_
    L->>File: "fwrite(entry->buf)"
    L->>Pool: entry_free(entry)
    L->>File: fflush()

Log Entry Structure

struct xLogEntry_ {
    xMpsc           node;       // MPSC queue node
    xLogLevel       level;      // Severity level
    int             len;        // Formatted message length
    char            buf[XLOG_ENTRY_BUF_SIZE]; // Formatted message (512 bytes)
    struct xLogEntry_ *free_next; // Freelist link
};

Lock-Free Entry Freelist

The freelist uses a Treiber stack with atomic CAS:

  • Alloc: Pop from freelist head (CAS loop). Fallback to malloc() if empty.
  • Free: Push to freelist head (CAS loop). If count exceeds XLOG_FREELIST_SIZE, call free() instead.

The count check is intentionally racy (soft cap) to keep the fast path lean.

File Rotation

When the bytes written reach max_size and max_files > 1, rotation runs:

  1. Delete path.{max_files-1} (oldest)
  2. Cascade rename: path.{i-1}path.{i} for i = max_files-1 down to 2
  3. Rename pathpath.1
  4. Reopen path in append mode
app.log      → app.log.1
app.log.1    → app.log.2
app.log.2    → app.log.3
app.log.3    → (deleted if max_files=4)

Synchronous Flush

xLoggerFlush() writes a byte to a dedicated flush-request pipe, triggering logger_flush_req_cb on the event loop thread. The caller then busy-waits (polling xMpscEmpty() every 1ms, up to 1 second) until the queue is drained.

Log Format

2025-04-04 16:30:00.123 INFO  Application started
2025-04-04 16:30:00.456 WARN  Low memory: 1024 bytes remaining
2025-04-04 16:30:01.789 ERROR Connection refused

Format: YYYY-MM-DD HH:MM:SS.mmm LEVEL message\n

API Reference

Types

| Type | Description |
|---|---|
| xLogger | Opaque handle to an async logger |
| xLogLevel | Enum: Debug, Info, Warn, Error, Fatal |
| xLogMode | Enum: Timer, Notify, Mixed |
| xLoggerConf | Configuration struct for creating a logger |

xLoggerConf Fields

| Field | Type | Default | Description |
|---|---|---|---|
| loop | xEventLoop | (required) | Event loop for timer/pipe callbacks |
| path | const char * | NULL (stderr) | Log file path |
| mode | xLogMode | Timer | Operating mode |
| level | xLogLevel | Info | Minimum log level |
| max_size | size_t | 0 (no rotation) | Max file size before rotation |
| max_files | int | 0 (no rotation) | Total files to keep (including current) |
| flush_interval_ms | uint64_t | 100 | Timer/Mixed flush interval |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xLoggerCreate | xLogger xLoggerCreate(xLoggerConf conf) | Create a logger | Not thread-safe |
| xLoggerDestroy | void xLoggerDestroy(xLogger logger) | Flush remaining entries and destroy | Not thread-safe |
| xLoggerLog | void xLoggerLog(xLogger logger, xLogLevel level, const char *fmt, ...) | Write a log entry. Fatal is synchronous + abort | Thread-safe |
| xLoggerFlush | void xLoggerFlush(xLogger logger) | Synchronously flush all pending entries | Thread-safe |
| xLoggerEnter | void xLoggerEnter(xLogger logger) | Set as thread-local logger + bridge xbase log | Thread-local |
| xLoggerLeave | void xLoggerLeave(void) | Clear thread-local logger | Thread-local |
| xLoggerCurrent | xLogger xLoggerCurrent(void) | Get current thread's logger | Thread-local |

Convenience Macros

Using thread-local logger (set via xLoggerEnter):

| Macro | Expands To |
|---|---|
| XLOG_DEBUG(fmt, ...) | xLoggerLog(xLoggerCurrent(), xLogLevel_Debug, fmt, ...) |
| XLOG_INFO(fmt, ...) | xLoggerLog(xLoggerCurrent(), xLogLevel_Info, fmt, ...) |
| XLOG_WARN(fmt, ...) | xLoggerLog(xLoggerCurrent(), xLogLevel_Warn, fmt, ...) |
| XLOG_ERROR(fmt, ...) | xLoggerLog(xLoggerCurrent(), xLogLevel_Error, fmt, ...) |
| XLOG_FATAL(fmt, ...) | xLoggerLog(xLoggerCurrent(), xLogLevel_Fatal, fmt, ...) |

Explicit logger variants: XLOG_DEBUG_L(logger, fmt, ...), etc.

Usage Examples

Basic File Logging

#include <xbase/event.h>
#include <xlog/logger.h>

int main(void) {
    xEventLoop loop = xEventLoopCreate();

    xLoggerConf conf = {
        .loop  = loop,
        .path  = "app.log",
        .mode  = xLogMode_Timer,
        .level = xLogLevel_Info,
    };

    xLogger logger = xLoggerCreate(conf);
    xLoggerEnter(logger);

    XLOG_INFO("Server started on port %d", 8080);
    XLOG_DEBUG("This is filtered out (level < Info)");
    XLOG_WARN("Connection pool at %d%% capacity", 85);

    xEventLoopRun(loop);

    xLoggerLeave();
    xLoggerDestroy(logger);
    xEventLoopDestroy(loop);
    return 0;
}

File Rotation Example

xLoggerConf conf = {
    .loop      = loop,
    .path      = "/var/log/myapp.log",
    .mode      = xLogMode_Mixed,
    .level     = xLogLevel_Info,
    .max_size  = 50 * 1024 * 1024, // 50MB per file
    .max_files = 10,                // Keep 10 files (500MB total)
};

Multi-Threaded Logging

#include <pthread.h>
#include <xlog/logger.h>

static xLogger g_logger;

static void *worker(void *arg) {
    int id = *(int *)arg;
    xLoggerEnter(g_logger); // Each thread must enter

    for (int i = 0; i < 1000; i++) {
        XLOG_INFO("Worker %d: iteration %d", id, i);
    }

    xLoggerLeave();
    return NULL;
}

// In main():
// g_logger = xLoggerCreate(conf);
// pthread_create(&threads[i], NULL, worker, &ids[i]);

Synchronous Flush Before Exit

void graceful_shutdown(xLogger logger) {
    XLOG_INFO("Shutting down...");
    xLoggerFlush(logger); // Block until all entries are written
    xLoggerDestroy(logger);
}

Use Cases

  1. Application Logging — Primary use case: structured, async logging for server applications with file rotation and level filtering.

  2. xKit Internal Error Capture — Via xLoggerEnter(), all xKit internal errors (from xLog()) are automatically routed through the async logger.

  3. Debug Logging — Use xLogMode_Notify during development for immediate log output without timer delay.

Best Practices

  • Call xLoggerEnter() on every thread that uses XLOG_*() macros. Each thread needs its own thread-local context.
  • Use Mixed mode for production. It provides the best balance: batched writes for normal messages, immediate notification for errors.
  • Set appropriate rotation limits. Without rotation (max_size = 0), log files grow unbounded.
  • Call xLoggerFlush() before shutdown to ensure all pending messages are written.
  • Don't log in tight loops at Debug level without checking the level first. While the level filter is cheap, formatting still costs CPU.
  • Fatal messages are synchronous. XLOG_FATAL() writes directly and calls abort(). Don't rely on async delivery for fatal messages.

Comparison with Other Libraries

| Feature | xlog logger.h | spdlog | zlog | log4c |
|---|---|---|---|---|
| Language | C99 | C++11 | C | C |
| Async Model | MPSC queue + event loop | Dedicated thread + queue | Dedicated thread | Synchronous |
| Modes | Timer / Notify / Mixed | Async (thread pool) | Async (thread) | Sync only |
| Lock-Free | Yes (MPSC + Treiber stack) | Yes (MPMC queue) | No (mutex) | No (mutex) |
| Event Loop | Integrated (xEventLoop) | None (own thread) | None (own thread) | None |
| File Rotation | Size-based (cascade rename) | Size-based | Size/time-based | Size-based |
| Format | printf-style | fmt-style / printf | printf-style | printf-style |
| Thread-Local Context | Yes (xLoggerEnter) | No | Yes (MDC) | Yes (NDC) |
| Fatal Handling | Sync write + abort | Flush + abort | Configurable | Configurable |

Key Differentiator: xlog is unique in integrating with an event loop rather than spawning a dedicated logging thread. This means the same thread that handles network I/O also handles log flushing, reducing context switches and thread count. The three-mode design (Timer/Notify/Mixed) gives fine-grained control over the latency/throughput trade-off that most logging libraries don't offer.

Benchmark

End-to-end benchmarks for xKit, measuring real-world performance across complete scenarios.

All benchmarks run on Apple M3 Pro (12 cores, 36 GB), macOS 26.4, Clang 17, Release (-O2).

For micro-benchmark results, see the Benchmark section at the bottom of each module's documentation page.

Available Benchmarks

| Benchmark | Description | Result |
|---|---|---|
| HTTP Server | xKit single-threaded HTTP/1.1 server vs Go net/http | 152 K req/s, +15–60% faster across all scenarios |
| HTTP/2 Server | xKit single-threaded h2c server vs Go net/http + x/net/http2 | 576 K req/s, +15–405% faster across all scenarios |
| HTTPS Server | xKit single-threaded HTTPS server vs Go net/http + crypto/tls | 512 K req/s (HTTPS/2), TLS-bound parity on HTTPS/1.1 |

HTTP Server Benchmark

End-to-end HTTP/1.1 server benchmark comparing xKit (single-threaded event-loop) against Go net/http (goroutine-per-connection).

Test Environment

| Item | Value |
|---|---|
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| Load Generator | wrk — 4 threads, 10s duration |

Server Implementations

xKit (bench/http_bench_server.cpp)

Single-threaded event-loop HTTP/1.1 server built on xbase/event.h + xhttp/server.h. Uses kqueue on macOS, epoll on Linux. All I/O is handled in one thread — no thread pool, no goroutines.

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
./build/bench/http_bench_server 8080

Go (bench/http_bench_server.go)

Standard net/http server with default settings. Go's runtime spawns one goroutine per connection and uses its own epoll/kqueue poller internally.

go build -o build/bench/go_http_bench bench/http_bench_server.go
./build/bench/go_http_bench 8081

Routes

Both servers implement identical routes:

| Route | Method | Description |
|---|---|---|
| /ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
| /echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
| /echo | POST | Echoes request body — request body throughput test |

Benchmark Methodology

All benchmarks use wrk with the following defaults unless noted:

  • 4 threads (-t4)
  • 100 connections (-c100)
  • 10 seconds (-d10s)

POST benchmarks use Lua scripts to set the request body:

wrk.method = "POST"
wrk.headers["Content-Type"] = "application/octet-stream"
wrk.body = string.rep("x", BODY_SIZE)

Results

GET /ping — Minimal Response Latency

Tests raw request/response overhead with a 4-byte "pong" response. Varies connection count to measure scalability.

| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 151,935 | 128,639 | 315 μs | 365 μs | xKit +18% |
| 100 | 152,316 | 128,915 | 658 μs | 761 μs | xKit +18% |
| 200 | 151,007 | 128,162 | 1.33 ms | 1.55 ms | xKit +18% |
| 500 | 155,486 | 125,471 | 3.20 ms | 3.96 ms | xKit +24% |

Analysis:

  • xKit maintains ~152K req/s regardless of connection count, showing excellent scalability of the single-threaded event loop.
  • Go's throughput slightly degrades at 500 connections due to goroutine scheduling overhead.
  • xKit's advantage grows from +18% to +24% as connection count increases — the event loop's O(1) dispatch scales better than goroutine context switching.

GET /echo — Variable Response Size

Tests response serialization throughput with different payload sizes. Fixed at 100 connections.

| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 64 B | 150,592 | 127,432 | 666 μs | 771 μs | xKit +18% |
| 256 B | 146,487 | 126,907 | 682 μs | 774 μs | xKit +15% |
| 1 KiB | 144,831 | 125,729 | 689 μs | 785 μs | xKit +15% |
| 4 KiB | 141,511 | 91,886 | 707 μs | 1.08 ms | xKit +54% |

Analysis:

  • xKit throughput degrades gracefully from 151K to 142K req/s as response size grows from 64B to 4KB — only a 6% drop.
  • Go drops sharply at 4KB (92K req/s, -27% from 64B), likely due to bytes.Repeat allocation pressure and GC overhead.
  • xKit's largest advantage (+54%) appears at 4KB, where Go's per-request heap allocation becomes the bottleneck.

POST /echo — Request Body Throughput

Tests request body parsing and echo throughput. Fixed at 100 connections.

| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
|---|---|---|---|---|---|
| 1 KiB | 141,495 | 122,584 | 152.35 MB/s | 133.51 MB/s | xKit +15% |
| 4 KiB | 133,935 | 83,512 | 536.60 MB/s | 337.13 MB/s | xKit +60% |
| 16 KiB | 82,231 | 53,828 | 1.26 GB/s | 848.10 MB/s | xKit +53% |
| 64 KiB | 35,908 | 31,124 | 2.20 GB/s | 1.90 GB/s | xKit +15% |

Analysis:

  • xKit achieves 2.20 GB/s transfer rate at 64KB body size — impressive for a single-threaded server.
  • The largest advantage (+60%) appears at 4KB, consistent with the GET /echo pattern — Go's allocation overhead dominates at medium payload sizes.
  • At 64KB, the gap narrows to +15% as both servers become I/O bound (kernel socket buffer management dominates).

Summary

                    xKit vs Go net/http (Release build)
                    ====================================

  GET /ping:     xKit +18% ~ +24%   (consistent across all concurrency levels)
  GET /echo:     xKit +15% ~ +54%   (advantage grows with response size)
  POST /echo:    xKit +15% ~ +60%   (advantage peaks at medium body sizes)

  Peak throughput:  xKit 155K req/s (GET /ping, 500 connections)
  Peak transfer:    xKit 2.20 GB/s  (POST /echo, 64KB body)

Key Takeaways:

  1. xKit wins every scenario. A single-threaded C event loop outperforms Go's multi-goroutine runtime across all request types and payload sizes.
  2. Scalability. xKit's throughput is nearly flat from 50 to 500 connections. Go degrades under high connection counts due to goroutine scheduling overhead.
  3. Payload efficiency. xKit's advantage is most pronounced at medium payloads (1–4 KiB) where Go's per-request heap allocation and GC pressure become significant.
  4. Architecture matters. xKit's single-threaded design eliminates all synchronization overhead. Go pays for goroutine creation, scheduling, and garbage collection on every request.

Reproducing

# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel

# Build Go server
go build -o build/bench/go_http_bench bench/http_bench_server.go

# Run xKit benchmark
./build/bench/http_bench_server 8080 &
wrk -t4 -c100 -d10s http://127.0.0.1:8080/ping
wrk -t4 -c100 -d10s "http://127.0.0.1:8080/echo?size=64"
wrk -t4 -c100 -d10s "http://127.0.0.1:8080/echo?size=4096"

# POST with lua script
cat > /tmp/post.lua << 'EOF'
wrk.method = "POST"
wrk.headers["Content-Type"] = "application/octet-stream"
wrk.body = string.rep("x", 4096)
EOF
wrk -t4 -c100 -d10s -s /tmp/post.lua http://127.0.0.1:8080/echo

# Run Go benchmark (same wrk commands, different port)
./build/bench/go_http_bench 8081 &
wrk -t4 -c100 -d10s http://127.0.0.1:8081/ping

HTTP/2 Server Benchmark

End-to-end HTTP/2 (h2c, cleartext) server benchmark comparing xKit (single-threaded event-loop) against Go net/http + x/net/http2/h2c (goroutine-per-connection).

Test Environment

| Item | Value |
|---|---|
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| Load Generator | h2load (nghttp2 1.68.1) — 4 threads, 10s duration, 10 max concurrent streams per connection |

Server Implementations

xKit (bench/http_bench_server.cpp)

Single-threaded event-loop HTTP/2 server built on xbase/event.h + xhttp/server.h. Supports h2c (cleartext HTTP/2) via Prior Knowledge — the same binary as the HTTP/1.1 benchmark, since xKit auto-detects the protocol on the first bytes of each connection.

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
./build/bench/http_bench_server 8080

Go (bench/h2c_bench_server.go)

Standard net/http server wrapped with golang.org/x/net/http2/h2c.NewHandler() to support cleartext HTTP/2 via Prior Knowledge. Go's runtime spawns one goroutine per connection and uses its own epoll/kqueue poller internally.

cd bench && go build -o ../build/bench/go_h2c_bench h2c_bench_server.go
./build/bench/go_h2c_bench 8081

Routes

Both servers implement identical routes:

| Route | Method | Description |
|---|---|---|
| /ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
| /echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
| /echo | POST | Echoes request body — request body throughput test |

Benchmark Methodology

All benchmarks use h2load with the following defaults unless noted:

  • 4 threads (-t4)
  • 100 connections (-c100)
  • 10 max concurrent streams per connection (-m10)
  • 10 seconds (-D 10)

POST benchmarks use -d <file> to specify the request body.

Why h2load? Unlike wrk (HTTP/1.1 only), h2load is purpose-built for HTTP/2 benchmarking. It supports stream multiplexing (-m), h2c Prior Knowledge, and reports per-stream latency.

Results

GET /ping — Minimal Response Latency

Tests raw request/response overhead with a 4-byte "pong" response. Varies connection count to measure scalability under HTTP/2 multiplexing.

| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 50 | 576,249 | 141,655 | 863 μs | 3.51 ms | xKit +307% |
| 100 | 561,825 | 120,732 | 1.78 ms | 8.27 ms | xKit +365% |
| 200 | 555,800 | 110,143 | 3.59 ms | 18.10 ms | xKit +405% |
| 500 | 538,905 | 136,719 | 9.22 ms | 36.21 ms | xKit +294% |

Analysis:

  • xKit sustains ~560K req/s across all connection counts — a massive improvement over its HTTP/1.1 numbers (~152K) thanks to HTTP/2 stream multiplexing on fewer TCP connections.
  • Go's h2c throughput (~110–142K) is comparable to its HTTP/1.1 numbers, suggesting Go's HTTP/2 implementation doesn't benefit as much from multiplexing.
  • xKit's advantage ranges from +294% to +405% — far larger than the +18–24% gap seen in HTTP/1.1. The single-threaded event loop excels at handling multiplexed streams without context-switching overhead.
  • At 200 connections, xKit's advantage peaks at +405%. Go's throughput degrades more steeply under high connection counts due to goroutine scheduling and HTTP/2 flow control overhead.

GET /echo — Variable Response Size

Tests response serialization throughput with different payload sizes under HTTP/2 framing. Fixed at 100 connections.

| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
|---|---|---|---|---|---|
| 64 B | 518,176 | 123,386 | 1.92 ms | 8.08 ms | xKit +320% |
| 256 B | 511,276 | 116,267 | 1.95 ms | 8.60 ms | xKit +340% |
| 1 KiB | 493,405 | 115,267 | 2.03 ms | 8.64 ms | xKit +328% |
| 4 KiB | 383,507 | 107,457 | 2.59 ms | 9.23 ms | xKit +257% |

Analysis:

  • xKit throughput degrades gracefully from 518K to 384K req/s as response size grows from 64B to 4KB — a 26% drop, mostly due to HTTP/2 DATA frame serialization overhead.
  • Go stays relatively flat (~107–123K) but at a much lower baseline. The bytes.Repeat allocation + GC pressure is compounded by HTTP/2 framing overhead.
  • xKit's advantage is consistently +257% to +340% — HTTP/2's HPACK header compression and binary framing amplify xKit's architectural advantage over Go.

POST /echo — Request Body Throughput

Tests request body parsing and echo throughput under HTTP/2. Fixed at 100 connections.

| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
|---|---|---|---|---|---|
| 1 KiB | 401,047 | 119,739 | 399.45 MB/s | 119.82 MB/s | xKit +235% |
| 4 KiB | 195,221 | 90,585 | 766.61 MB/s | 356.84 MB/s | xKit +115% |
| 16 KiB | 57,304 | 41,313 | 896.83 MB/s | 648.24 MB/s | xKit +39% |
| 64 KiB | 19,040 | 16,557 | 1.16 GB/s | 1.01 GB/s | xKit +15% |

Analysis:

  • xKit achieves 1.16 GB/s transfer rate at 64KB body size — comparable to its HTTP/1.1 performance (2.20 GB/s), with the difference attributable to HTTP/2 flow control and framing overhead.
  • The advantage narrows from +235% (1KB) to +15% (64KB) as both servers become I/O bound. HTTP/2 flow control (default 64KB window) becomes the bottleneck at large payloads.
  • At small payloads (1KB), xKit's +235% advantage shows the efficiency of its nghttp2-based H2 implementation vs Go's x/net/http2.

HTTP/2 vs HTTP/1.1 Comparison

How does HTTP/2 compare to HTTP/1.1 for each server? (GET /ping, 100 connections)

| Server | HTTP/1.1 Req/s | HTTP/2 Req/s | Δ |
|---|---|---|---|
| xKit | 152,316 | 561,825 | +269% |
| Go | 128,915 | 120,732 | −6% |

Key Insight: xKit's single-threaded event loop benefits enormously from HTTP/2 multiplexing — handling multiple streams on fewer connections eliminates per-connection overhead. Go's goroutine-per-connection model doesn't gain from multiplexing because it already handles concurrency at the goroutine level; the added HTTP/2 framing overhead actually causes a slight regression.

Summary

                    xKit vs Go h2c (Release build, h2load -m10)
                    =============================================

  GET /ping:     xKit +294% ~ +405%   (massive advantage across all concurrency)
  GET /echo:     xKit +257% ~ +340%   (consistent across all response sizes)
  POST /echo:    xKit +15%  ~ +235%   (advantage narrows as payloads grow)

  Peak throughput:  xKit 576K req/s  (GET /ping, 50 connections)
  Peak transfer:    xKit 1.16 GB/s   (POST /echo, 64KB body)

Key Takeaways:

  1. HTTP/2 amplifies xKit's advantage. The gap widens from +18–24% (HTTP/1.1) to +294–405% (HTTP/2) on GET /ping. Stream multiplexing plays to the strengths of a single-threaded event loop.
  2. xKit scales with multiplexing. xKit's throughput jumps from 152K (HTTP/1.1) to 576K (HTTP/2) req/s — a 3.8× improvement. Go's throughput stays flat or slightly regresses.
  3. Payload efficiency. At small-to-medium payloads, xKit's nghttp2-based H2 implementation is dramatically faster. At large payloads (64KB), both servers converge as I/O and flow control dominate.
  4. Architecture matters even more for H2. HTTP/2's stream multiplexing, HPACK compression, and flow control add complexity that a lean C event loop handles more efficiently than Go's runtime.

Reproducing

# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel

# Build Go h2c server
cd bench && go build -o ../build/bench/go_h2c_bench h2c_bench_server.go && cd ..

# Install h2load (macOS)
brew install nghttp2

# Start servers
./build/bench/http_bench_server 8080 &
./build/bench/go_h2c_bench 8081 &

# GET /ping benchmark
h2load -t4 -c100 -m10 -D 10 http://127.0.0.1:8080/ping
h2load -t4 -c100 -m10 -D 10 http://127.0.0.1:8081/ping

# GET /echo benchmark
h2load -t4 -c100 -m10 -D 10 "http://127.0.0.1:8080/echo?size=1024"
h2load -t4 -c100 -m10 -D 10 "http://127.0.0.1:8081/echo?size=1024"

# POST /echo benchmark (create body file first)
dd if=/dev/zero bs=4096 count=1 | tr '\0' 'x' > /tmp/body_4k.bin
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin http://127.0.0.1:8080/echo
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin http://127.0.0.1:8081/echo

# Cleanup
pkill -f http_bench_server
pkill -f go_h2c_bench

HTTPS Server Benchmark

End-to-end HTTPS server benchmark comparing xKit (single-threaded event-loop, OpenSSL) against Go net/http + crypto/tls (goroutine-per-connection). Tests both HTTPS/1.1 (wrk) and HTTPS/2 (h2load with ALPN).

Test Environment

| Item | Value |
| --- | --- |
| CPU | Apple M3 Pro (12 cores) |
| Memory | 36 GB |
| OS | macOS 26.4 (Darwin) |
| Compiler | Apple Clang 17.0.0 |
| Build | Release (-O2) |
| TLS Backend | OpenSSL 3.6.1 (xKit), Go crypto/tls (Go) |
| Certificate | RSA 2048-bit self-signed, TLS 1.3 |
| Load Generator | wrk (HTTP/1.1 over TLS), h2load (HTTP/2 over TLS with ALPN) |

Server Implementations

xKit (bench/https_bench_server.cpp)

Single-threaded event-loop HTTPS server built on xbase/event.h + xhttp/server.h + OpenSSL. Uses xHttpServerListenTls(), which automatically sets ALPN to {"h2", "http/1.1"}, so the same server handles both HTTPS/1.1 and HTTPS/2 depending on what the client negotiates.

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel
openssl req -x509 -newkey rsa:2048 -keyout bench_key.pem -out bench_cert.pem \
  -days 365 -nodes -subj '/CN=localhost'
./build/bench/https_bench_server 8443 bench_cert.pem bench_key.pem

Go (bench/https_bench_server.go)

Standard net/http server with crypto/tls and x/net/http2.ConfigureServer(). Go's TLS implementation is in pure Go (crypto/tls), while xKit uses OpenSSL's C implementation. Both servers configure ALPN for h2 and http/1.1.

cd bench && go build -o ../build/bench/go_https_bench https_bench_server.go
./build/bench/go_https_bench 8444 bench_cert.pem bench_key.pem

Routes

Both servers implement identical routes:

| Route | Method | Description |
| --- | --- | --- |
| /ping | GET | Returns "pong" (4 bytes) — minimal response latency test |
| /echo?size=N | GET | Returns N bytes of 'x' — variable response size test |
| /echo | POST | Echoes request body — request body throughput test |

Results

HTTPS/1.1 — GET /ping (wrk, varying connections)

Tests HTTPS/1.1 performance where each connection maintains its own TLS session. wrk reuses connections (no per-request handshake), so this measures encrypted request/response throughput.

| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
| --- | --- | --- | --- | --- | --- |
| 50 | 125,147 | 125,076 | 395 μs | 372 μs | ≈ 0% |
| 100 | 124,593 | 128,277 | 0.86 ms | 764 μs | Go +3% |
| 200 | 122,837 | 127,075 | 1.88 ms | 1.57 ms | Go +3% |
| 500 | 111,397 | 122,498 | 5.25 ms | 4.06 ms | Go +10% |

Analysis:

  • Under HTTPS/1.1, xKit and Go are nearly identical at low connection counts (~125K req/s each). This is a dramatic contrast to plaintext HTTP/1.1 where xKit was +18–24% faster.
  • TLS encryption is the bottleneck, not the HTTP layer. OpenSSL's AES-GCM encryption on a single thread saturates at ~125K req/s regardless of the HTTP framework above it.
  • At 500 connections, Go pulls ahead by ~10% because Go's multi-threaded runtime can parallelize TLS encryption across all CPU cores, while xKit's single-threaded event loop is limited to one core for both TLS and HTTP processing.
  • xKit's latency is slightly higher at high connection counts (5.25 ms vs 4.06 ms at 500 connections) — the single thread must serialize all TLS encrypt/decrypt operations.

HTTPS/2 — GET /ping (h2load, varying connections)

Tests HTTPS/2 performance with TLS + ALPN negotiation. HTTP/2 multiplexing reduces the number of TLS sessions needed, which should benefit the single-threaded xKit.

| Connections | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
| --- | --- | --- | --- | --- | --- |
| 50 | 511,586 | 165,341 | 975 μs | 2.99 ms | xKit +209% |
| 100 | 508,685 | 144,024 | 1.96 ms | 6.88 ms | xKit +253% |
| 200 | 497,775 | 131,749 | 4.01 ms | 15.00 ms | xKit +278% |

Analysis:

  • With HTTPS/2, xKit regains its massive advantage: +209% to +278% over Go. HTTP/2 multiplexing means fewer TLS sessions are needed — multiple streams share one encrypted connection, so the TLS overhead is amortized.
  • xKit achieves ~510K req/s over HTTPS/2 — only ~10% less than its h2c (cleartext HTTP/2) performance of 562K. The TLS overhead is minimal when amortized across multiplexed streams.
  • Go's HTTPS/2 throughput (~131–165K) is comparable to its h2c numbers (~121–142K), suggesting Go's TLS overhead is also well-amortized but the HTTP/2 processing itself is the bottleneck.

HTTPS/2 — GET /echo (h2load, varying response size)

Tests response serialization + TLS encryption throughput with different payload sizes. Fixed at 100 connections.

| Response Size | xKit Req/s | Go Req/s | xKit Latency | Go Latency | Δ |
| --- | --- | --- | --- | --- | --- |
| 64 B | 470,607 | 146,727 | 2.11 ms | 6.74 ms | xKit +221% |
| 1 KiB | 388,828 | 140,926 | 2.56 ms | 6.99 ms | xKit +176% |
| 4 KiB | 227,414 | 118,595 | 4.38 ms | 8.22 ms | xKit +92% |

Analysis:

  • xKit's advantage narrows as response size grows (from +221% at 64B to +92% at 4KB) because TLS encryption of larger payloads becomes a bigger fraction of total work.
  • At 4KB responses, xKit still achieves 893 MB/s encrypted throughput vs Go's 466 MB/s.

HTTPS/2 — POST /echo (h2load, varying body size)

Tests request body parsing + TLS decryption/encryption throughput. Fixed at 100 connections.

| Body Size | xKit Req/s | Go Req/s | xKit Transfer/s | Go Transfer/s | Δ |
| --- | --- | --- | --- | --- | --- |
| 1 KiB | 291,086 | 146,916 | 289.93 MB/s | 147.01 MB/s | xKit +98% |
| 4 KiB | 128,229 | 104,892 | 503.54 MB/s | 413.20 MB/s | xKit +22% |
| 16 KiB | 38,975 | 37,391 | 609.97 MB/s | 586.70 MB/s | xKit +4% |
| 64 KiB | 10,278 | 14,994 | 643.30 MB/s | 939.77 MB/s | Go +46% |

Analysis:

  • At small payloads (1KB), xKit is +98% faster. At medium payloads (4KB), the gap narrows to +22%.
  • At 16KB, the two are nearly tied (+4%). At 64KB, Go wins by +46% — this is the first scenario where Go decisively beats xKit.
  • The 64KB crossover happens because: (1) TLS encryption of 64KB payloads is CPU-intensive and benefits from Go's multi-core parallelism, (2) the default HTTP/2 flow-control window (65,535 bytes, one byte short of 64KB) creates back-pressure that the single-threaded event loop handles less efficiently than Go's goroutine scheduler.

Protocol Comparison

How does TLS affect performance for each protocol? (GET /ping, 100 connections)

| Server | HTTP/1.1 | HTTPS/1.1 | Δ (TLS cost) |
| --- | --- | --- | --- |
| xKit | 152,316 | 124,593 | −18% |
| Go | 128,915 | 128,277 | −0.5% |

| Server | h2c | HTTPS/2 | Δ (TLS cost) |
| --- | --- | --- | --- |
| xKit | 561,825 | 508,685 | −9% |
| Go | 120,732 | 144,024 | +19% |

Key Insights:

  1. TLS costs xKit 18% on HTTP/1.1 because every connection requires its own TLS session, and all encryption runs on a single thread. Go's multi-core TLS is essentially free (−0.5%).
  2. TLS costs xKit only 9% on HTTP/2 because multiplexed streams share TLS sessions. This is why HTTPS/2 is xKit's sweet spot.
  3. Go actually gets faster with HTTPS/2 vs h2c (+19%) — likely because TLS session caching and ALPN negotiation provide a more optimized code path in Go's crypto/tls + x/net/http2 stack.

Summary

                    xKit vs Go HTTPS (Release build, OpenSSL 3.6.1)
                    =================================================

  HTTPS/1.1 (wrk):
    GET /ping:     Go ≈ xKit (0% to +10% Go advantage at high connection counts)
    GET /echo 1KB: Go +10%

  HTTPS/2 (h2load -m10):
    GET /ping:     xKit +209% ~ +278%
    GET /echo:     xKit +92%  ~ +221%
    POST /echo:    xKit +98%  (1KB) → Go +46% (64KB)

  Peak throughput:  xKit 512K req/s  (HTTPS/2 GET /ping, 50 connections)
  Peak transfer:    Go 940 MB/s      (HTTPS/2 POST /echo, 64KB body)

Key Takeaways:

  1. HTTPS/1.1 is TLS-bound. Single-threaded OpenSSL encryption caps xKit at ~125K req/s — the same as Go. The HTTP framework advantage disappears when TLS dominates.
  2. HTTPS/2 restores xKit's advantage. Stream multiplexing amortizes TLS overhead across streams, letting xKit's efficient event loop shine again (+209–278% on GET /ping).
  3. Large payloads favor Go. At 64KB POST bodies, Go's multi-core TLS parallelism wins by +46%. This is the only scenario where Go decisively beats xKit.
  4. Choose your protocol wisely. For latency-sensitive APIs with small payloads, HTTPS/2 + xKit is optimal. For bulk data transfer, Go's multi-core TLS is more efficient.

Reproducing

# Build xKit server
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DXK_BUILD_BENCHMARKS=ON
cmake --build build --parallel

# Build Go HTTPS server
cd bench && go build -o ../build/bench/go_https_bench https_bench_server.go && cd ..

# Generate self-signed certificate
openssl req -x509 -newkey rsa:2048 -keyout /tmp/bench_key.pem \
  -out /tmp/bench_cert.pem -days 365 -nodes -subj '/CN=localhost'

# Install tools (macOS)
brew install wrk nghttp2

# Start servers
./build/bench/https_bench_server 8443 /tmp/bench_cert.pem /tmp/bench_key.pem &
./build/bench/go_https_bench 8444 /tmp/bench_cert.pem /tmp/bench_key.pem &

# HTTPS/1.1 benchmark (wrk)
wrk -t4 -c100 -d10s https://127.0.0.1:8443/ping
wrk -t4 -c100 -d10s https://127.0.0.1:8444/ping

# HTTPS/2 benchmark (h2load)
h2load -t4 -c100 -m10 -D 10 https://127.0.0.1:8443/ping
h2load -t4 -c100 -m10 -D 10 https://127.0.0.1:8444/ping

# POST benchmark
dd if=/dev/zero bs=4096 count=1 | tr '\0' 'x' > /tmp/body_4k.bin
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin https://127.0.0.1:8443/echo
h2load -t4 -c100 -m10 -D 10 -d /tmp/body_4k.bin https://127.0.0.1:8444/echo

# Cleanup
pkill -f https_bench_server
pkill -f go_https_bench

TODO

Planning and feasibility analysis for future improvements.

Feasibility and Benefit Analysis of Removing the libcurl Dependency

1. Current Scope of libcurl Usage

libcurl is used only by the HTTP Client portion, in the following files:

| File | Dependency Level | Notes |
| --- | --- | --- |
| client.c | Core | The entire file is built around curl_multi / curl_easy |
| client.h | API layer | xHttpResponse exposes curl_code / curl_error |
| client_private.h | Core | CURL *easy, CURLM *multi, CURLcode, CURL_ERROR_SIZE |
| sse.c | Core | SSE streaming is built entirely on curl's write callback |
| xhttp/CMakeLists.txt | Build | Links Libcurl::Libcurl |
| CMakeLists.txt (top-level) | Build | Compilation of the entire xhttp module is gated on Libcurl_FOUND |

Parts that do not depend on curl (the bulk of the xhttp module):

  • HTTP Server (server.c, proto_h1.c, proto_h2.c) → uses llhttp + nghttp2
  • WebSocket Server (ws.c, ws_serve.c, ws_handshake_server.c)
  • WebSocket Client (ws_connect.c, ws_handshake_client.c) → plain sockets + xEventLoop
  • Transport layer (transport_*.c) → pure OpenSSL / mbedTLS
  • WS Frame / Deflate / Crypto

2. What libcurl Provides

libcurl carries the following responsibilities in the xhttp client:

graph TD
    A[Capabilities provided by libcurl] --> B[HTTP/1.1 protocol handling<br/>request serialization + response parsing]
    A --> C[HTTP/2 support<br/>HPACK, stream multiplexing, frame handling]
    A --> D[TLS handshake management<br/>certificate verification, ALPN negotiation]
    A --> E[Multi-Socket API<br/>non-blocking I/O integration]
    A --> F[Connection pooling / Keep-Alive<br/>DNS caching]
    A --> G[Chunked transfer<br/>Content-Encoding decompression]
    A --> H[Redirect following<br/>Cookie management]
    A --> I[Proxy support<br/>SOCKS / HTTP proxy]

3. Replacement Analysis

Removing libcurl means implementing the HTTP Client protocol stack ourselves:

| Component to Build | Complexity | Notes |
| --- | --- | --- |
| HTTP/1.1 request serialization | ⭐ Low | Hand-assemble GET /path HTTP/1.1\r\n... |
| HTTP/1.1 response parsing | ⭐⭐ Medium | Can reuse the existing llhttp (the server already uses it) |
| Chunked transfer decoding | ⭐⭐ Medium | llhttp handles this |
| TLS client handshake | ⭐⭐ Medium | The WS Client already has transport_tls_client_openssl/mbedtls; reusable |
| HTTP/2 client | ⭐⭐⭐⭐ High | Needs nghttp2's client session API (the server already uses nghttp2, but client mode differs) |
| Connection pool / Keep-Alive | ⭐⭐⭐ High | Connection reuse and idle timeouts must be managed by hand |
| Multi-socket event integration | ⭐⭐ Medium | xEventLoop exists, but the per-connection state machine must be managed by hand |
| Async DNS resolution | ⭐⭐⭐ High | curl bundles c-ares integration; building our own means an extra dependency or blocking |
| Redirects / Cookies / Proxy | ⭐⭐ Medium | Implement as needed |

4. Benefit Analysis

✅ Benefits

  1. Fewer external dependencies

    • The xhttp module currently requires libcurl (~600 KB shared library); removing it drops one system-level dependency
    • Friendlier for embedded and cross-compilation scenarios (cross-compiling libcurl is relatively involved)
  2. Unified TLS management

    • Today the HTTP Client's TLS is managed inside curl (CURLOPT_CAINFO and friends), disconnected from the xTlsCtx system used by the rest of xnet/xhttp
    • After removal, everything can share the xTlsCtx model, consistent with the TCP/WS Client and HTTP Server
  3. No more API leakage

    • curl_code / curl_error in xHttpResponse are curl-specific concepts; exposing them to users is a leaky abstraction
    • After removal, errors can be unified under xErrno
  4. Smaller binaries

    • Deployments that only use the server or WS no longer need to link curl
  5. Finer-grained control

    • Connection-pool policy, timeout behavior, buffer management, and so on become fully customizable

❌ Costs

  1. Substantial effort (estimated 2,000–3,000 new lines)

    • HTTP/1.1 client protocol stack: ~500 lines
    • HTTP/2 client (nghttp2 client session): ~800 lines
    • Connection pool + Keep-Alive management: ~500 lines
    • SSE re-integration: ~300 lines
    • DNS resolution: ~200 lines (or pull in c-ares)
    • Test rewrite: ~500 lines
  2. The HTTP/2 client is the hardest part

    • nghttp2's client API differs substantially from its server API; SETTINGS, WINDOW_UPDATE, stream priorities, and more must all be handled
    • curl does a great deal of edge-case handling around the nghttp2 client internally
  3. Losing curl's maturity

    • libcurl has been hardened over 25+ years and handles countless HTTP edge cases (malformed responses, every Transfer-Encoding variant, proxy authentication, and so on)
    • A home-grown implementation is unlikely to match that robustness in the short term
  4. Increased maintenance burden

    • HTTP has a long tail of edge cases; building our own means a long-term maintenance cost

5. Middle-Ground Options

If the goal is to reduce dependencies without a full rewrite, there are several incremental paths:

graph LR
    A[Current state<br/>curl required] --> B[Option 1: curl optional<br/>use curl when present<br/>built-in H1 otherwise]
    A --> C[Option 2: drop only the H2 client<br/>built-in H1 client<br/>H2 still via curl]
    A --> D[Option 3: full removal<br/>built-in H1 + H2 client]
    
    B --> E[Effort: ~800 lines<br/>Risk: low]
    C --> F[Effort: ~600 lines<br/>Risk: low]
    D --> G[Effort: ~2500 lines<br/>Risk: high]

Recommended: Option 1, making curl an optional dependency

  • Add a lightweight built-in HTTP/1.1 Client (based on the existing llhttp + transport_tls_client + xEventLoop)
  • With curl present, use curl (H2, connection pooling, and other advanced features)
  • Without curl, fall back to the built-in H1 Client (covers ~80% of use cases)
  • HTTP Server and WS Server/Client are completely unaffected (they never depended on curl in the first place)

This would:

  • Let the xhttp module build in curl-free environments (server + ws + basic client)
  • Keep curl as an enhancement option (H2 client, connection pooling, proxies, and so on)
  • Unify TLS management (the built-in client uses xTlsCtx)
  • Allow gradual migration with controlled risk
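
At the build-system level, Option 1 could look roughly like the sketch below. The option name XK_WITH_CURL, the XK_HAVE_CURL define, and the fallback file name are hypothetical; Libcurl::Libcurl, Libcurl_FOUND, and the source file names come from the dependency table above:

```cmake
# Hypothetical sketch: gate only the curl-backed client, not the whole module.
option(XK_WITH_CURL "Build the curl-backed HTTP client" ON)
find_package(Libcurl)

# server.c / proto_h1.c / proto_h2.c / ws_*.c build unconditionally;
# they never depended on curl.
if(XK_WITH_CURL AND Libcurl_FOUND)
  target_sources(xhttp PRIVATE client.c sse.c)
  target_link_libraries(xhttp PRIVATE Libcurl::Libcurl)
  target_compile_definitions(xhttp PRIVATE XK_HAVE_CURL=1)
else()
  target_sources(xhttp PRIVATE client_builtin_h1.c)  # hypothetical H1 fallback
endif()
```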

6. Conclusion

| Dimension | Full Removal | Optional Dependency (recommended) |
| --- | --- | --- |
| Effort | ~2,500 lines + test rewrite | ~800 lines |
| Risk | High (H2 client is complex) | Low (H1 only, reuses existing components) |
| Benefit | Zero external dependencies | Usable without curl, stronger with it |
| API change | Response needs a redesign | Add an abstraction layer, migrate gradually |
| Time | 2–3 weeks | 3–5 days |

Recommendation: start with Option 1 (curl optional), decoupling the HTTP Server / WS code from the curl dependency (in practice they are already decoupled; only at the CMake level is the entire xhttp module gated on curl). Then decide, based on actual demand, whether to go further and remove curl entirely.