map.h — Generic Key-Value Map

Introduction

map.h provides a generic associative container that stores opaque key-value pairs and supports multiple backend implementations selected at creation time. Users supply a hash function and an equality function; the map handles collision resolution, resizing, and iteration internally. Three backends are available: separate-chaining hash table, open-addressing hash table, and red-black tree.

Design Philosophy

  1. vtable-Driven Polymorphism — All backends share a common xMapVTable dispatch table. The public API (xMapSet, xMapGet, xMapDel, etc.) forwards calls through function pointers, so callers can switch backends by changing a single xMapType argument without touching any other code.

  2. Opaque Keys and Values — The map stores const void * keys and void * values. Hash and equality functions are user-supplied, making the map reusable for any key type (strings, integers, structs) without code generation or macros.

  3. Single-Allocation Construction — The hash and flat backends allocate the struct header and the initial bucket/slot array in one contiguous calloc call. This reduces allocation overhead and improves cache locality for small maps.

  4. No Key/Value Ownership — The map does not own the keys or values it stores. xMapDestroy() frees internal structures but NOT user data. This gives the caller full control over element lifecycle.

  5. Built-in Hash Helpers — Common hash/equality pairs for C strings (xMapStrHash / xMapStrEq) and integer keys (xMapIntHash / xMapIntEq) are provided out of the box, covering the two most frequent use cases.
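The opaque-key design (point 2) means any struct can serve as a key once matching hash and equality functions exist. A minimal sketch — Point, point_hash, and point_eq are illustrative names, not part of map.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical composite key: a 2-D grid coordinate. */
typedef struct { int32_t x, y; } Point;

/* Matches the xMapHashFunc shape: uint64_t (*)(const void *).
 * Mixes both fields so (1,2) and (2,1) hash differently. */
static uint64_t point_hash(const void *key) {
    const Point *p = key;
    uint64_t h = (uint64_t)(uint32_t)p->x * 1099511628211ULL; /* FNV prime as mixer */
    return h ^ (uint64_t)(uint32_t)p->y;
}

/* Matches the xMapEqFunc shape: field-wise comparison. */
static bool point_eq(const void *a, const void *b) {
    const Point *pa = a, *pb = b;
    return pa->x == pb->x && pa->y == pb->y;
}
```

A map over such keys would then be created as xMapCreate(xMapType_Hash, 0, point_hash, point_eq); per point 4, the caller must keep each Point alive while it is stored.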

Architecture

graph TD
    CREATE["xMapCreate(type, cap, hash, eq)"]
    HASH["xMapType_Hash<br/>Separate Chaining"]
    FLAT["xMapType_Flat<br/>Open Addressing"]
    TREE["xMapType_Tree<br/>Red-Black Tree"]

    CREATE -->|"type = Hash"| HASH
    CREATE -->|"type = Flat"| FLAT
    CREATE -->|"type = Tree"| TREE

    API["Public API<br/>Set / Get / Del / Len / Iterate"]

    HASH --> VT["xMapVTable dispatch"]
    FLAT --> VT
    TREE --> VT
    VT --> API

    style CREATE fill:#4a90d9,color:#fff
    style HASH fill:#f5a623,color:#fff
    style FLAT fill:#50b86c,color:#fff
    style TREE fill:#e74c3c,color:#fff
    style API fill:#4a90d9,color:#fff

Internal Dispatch

graph LR
    subgraph "xMapBase (common header)"
        VTABLE["vtable *"]
        HASHFN["hash()"]
        EQFN["eq()"]
    end

    subgraph "xMapVTable"
        SET["set()"]
        GET["get()"]
        DEL["del()"]
        LEN["len()"]
        ITER["iterate()"]
        DESTROY["destroy()"]
    end

    VTABLE --> SET
    VTABLE --> GET
    VTABLE --> DEL
    VTABLE --> LEN
    VTABLE --> ITER
    VTABLE --> DESTROY

Every backend struct embeds xMapBase as its first member. The public API casts the opaque xMap handle to xMapBase * to access the vtable, then dispatches to the backend-specific implementation.
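A condensed sketch of this pattern (names such as BaseSketch and hash_len are illustrative; the real struct layouts live in map.h's implementation, and only one operation is shown):

```c
#include <stddef.h>

/* Dispatch table: one function pointer per operation (only len shown). */
typedef struct { size_t (*len)(void *self); } VTableSketch;

/* Common header embedded as the FIRST member of every backend struct,
 * so a backend pointer can be viewed as a BaseSketch pointer. */
typedef struct { const VTableSketch *vt; } BaseSketch;

/* One hypothetical backend. */
typedef struct {
    BaseSketch base;   /* must be first for the casts below to be valid */
    size_t size;
} HashSketch;

static size_t hash_len(void *self) {
    return ((HashSketch *)self)->size;   /* recover the backend type */
}

static const VTableSketch hash_vt = { hash_len };

/* Public API shape: cast the opaque handle to the common header,
 * then forward through the function pointer. */
static size_t sketch_len(void *m) {
    return ((BaseSketch *)m)->vt->len(m);
}
```

Placing the header first makes the downcast in hash_len well-defined: a pointer to a struct also points to its first member.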

Backend Implementations

Hash (Separate Chaining)

┌─────────────────────────────────────────┐
│ xMapHash (single calloc)                │
│   base: { vtable, hash, eq }            │
│   buckets → ┌──┬──┬──┬──┬──┬──┐         │
│             │  │  │  │  │  │  │ ...     │
│             └──┴──┴──┴──┴──┴──┘         │
│   size, cap                             │
└─────────────────────────────────────────┘
         │
         ▼
   ┌─────────┐    ┌─────────┐
   │ Entry   │───▶│ Entry   │───▶ NULL
   │ key,val │    │ key,val │
   └─────────┘    └─────────┘
  • Collision resolution: Linked list per bucket.
  • Load factor threshold: 75% — triggers 2× resize with full rehash.
  • Memory layout: Initial buckets are allocated inline (contiguous with the struct). After the first resize, buckets are a separate allocation.
  • Best for: General-purpose use, pointer-heavy keys, high collision tolerance.
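The chained layout above boils down to an array of entry-list heads. A self-contained toy version (string keys, fixed bucket count, no resize; Entry, toy_set, and toy_get are illustrative names, not the library's internals):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Entry { const char *key; void *val; struct Entry *next; } Entry;

#define NBUCKETS 8   /* power of two, as the doc recommends for cap */

static uint64_t toy_hash(const char *s) {   /* FNV-1a, the xMapStrHash scheme */
    uint64_t h = 14695981039346656037ULL;
    for (; *s; s++) h = (h ^ (uint8_t)*s) * 1099511628211ULL;
    return h;
}

static void toy_set(Entry *buckets[NBUCKETS], const char *key, void *val) {
    size_t i = toy_hash(key) & (NBUCKETS - 1);   /* mask works: cap is 2^k */
    for (Entry *e = buckets[i]; e; e = e->next)
        if (strcmp(e->key, key) == 0) { e->val = val; return; }  /* update */
    Entry *e = malloc(sizeof *e);                /* new head of the chain */
    if (!e) return;                              /* real code reports xErrno_NoMemory */
    e->key = key; e->val = val; e->next = buckets[i];
    buckets[i] = e;
}

static void *toy_get(Entry *buckets[NBUCKETS], const char *key) {
    size_t i = toy_hash(key) & (NBUCKETS - 1);
    for (Entry *e = buckets[i]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->val;
    return NULL;
}
```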

Flat (Open Addressing, Linear Probing)

┌─────────────────────────────────────────┐
│ xMapFlat (single calloc)                │
│   base: { vtable, hash, eq }            │
│   slots → ┌───────┬───────┬───────┐     │
│           │ key   │ key   │ EMPTY │ ... │
│           │ val   │ val   │       │     │
│           │ OCCUP │ OCCUP │       │     │
│           └───────┴───────┴───────┘     │
│   size, cap                             │
└─────────────────────────────────────────┘
  • Collision resolution: Linear probing with tombstone markers for deletion.
  • Load factor threshold: 70% — triggers 2× resize (tombstones are discarded during rehash).
  • Slot states: EMPTY (never used), OCCUPIED (active entry), TOMBSTONE (deleted, probe continues).
  • Memory layout: Initial slots are allocated inline. After the first resize, slots are a separate allocation.
  • Best for: Small keys (integers, pointers), cache-friendly sequential access, iteration-heavy workloads.
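The probe loop and the three slot states can be sketched in a few lines (fixed capacity, integer keys, no resize; a production version would also re-check later slots for an existing key before reusing a tombstone — all names here are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

enum { EMPTY = 0, OCCUPIED, TOMBSTONE };   /* the three slot states */
typedef struct { uint64_t key; int val; int state; } ToySlot;
#define TOY_CAP 8                          /* power of two, so & works as mod */

static bool toy_flat_set(ToySlot t[TOY_CAP], uint64_t key, int val) {
    for (size_t n = 0, i = key & (TOY_CAP - 1); n < TOY_CAP;
         n++, i = (i + 1) & (TOY_CAP - 1)) {
        if (t[i].state != OCCUPIED || t[i].key == key) {  /* free slot or update */
            t[i].key = key; t[i].val = val; t[i].state = OCCUPIED;
            return true;
        }
    }
    return false;   /* full — the real backend resizes at 70% load instead */
}

static int *toy_flat_get(ToySlot t[TOY_CAP], uint64_t key) {
    for (size_t n = 0, i = key & (TOY_CAP - 1); n < TOY_CAP;
         n++, i = (i + 1) & (TOY_CAP - 1)) {
        if (t[i].state == EMPTY) return NULL;             /* never used: stop */
        if (t[i].state == OCCUPIED && t[i].key == key) return &t[i].val;
        /* TOMBSTONE or other key: probe the next slot */
    }
    return NULL;
}

static bool toy_flat_del(ToySlot t[TOY_CAP], uint64_t key) {
    for (size_t n = 0, i = key & (TOY_CAP - 1); n < TOY_CAP;
         n++, i = (i + 1) & (TOY_CAP - 1)) {
        if (t[i].state == EMPTY) return false;
        if (t[i].state == OCCUPIED && t[i].key == key) {
            t[i].state = TOMBSTONE;          /* later probes continue past it */
            return true;
        }
    }
    return false;
}
```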

Tree (Red-Black Tree)

         ┌───────────┐
         │  node(B)  │
         │ hash=500  │
         ├─────┬─────┤
         │     │     │
    ┌────▼───┐ ┌─────▼────┐
    │node(R) │ │ node(R)  │
    │hash=200│ │ hash=800 │
    └────────┘ └──────────┘
  • Ordering: Nodes are ordered by 64-bit hash value.
  • Hash collisions: When two different keys produce the same hash, the first key is stored in the tree node's primary slot; additional keys are chained in a singly-linked overflow list (xTreeOverflow).
  • Deletion optimization: When deleting a primary key that has overflow entries, the first overflow entry is promoted to primary — avoiding an expensive RB-tree fixup.
  • No pre-allocation: The cap parameter is ignored; nodes are allocated individually on insert.
  • Best for: Ordered iteration by hash value, worst-case O(log n) guarantees, workloads where hash table resizing pauses are unacceptable.
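The deletion optimization can be illustrated with a stripped-down node (ToyNode and ToyOverflow are illustrative stand-ins for the internal xTreeOverflow machinery; memory management is omitted):

```c
#include <stddef.h>

typedef struct ToyOverflow {
    const void *key; void *val;
    struct ToyOverflow *next;
} ToyOverflow;

typedef struct {
    const void *key; void *val;   /* primary slot */
    ToyOverflow *overflow;        /* same-hash chain */
} ToyNode;

/* Deleting the primary key: returns 1 if the node survives (first overflow
 * entry promoted to primary, no RB fixup needed), 0 if the caller must
 * unlink the node from the tree via the expensive fixup path. */
static int promote_or_unlink(ToyNode *n) {
    ToyOverflow *o = n->overflow;
    if (!o) return 0;             /* no overflow: node must leave the tree */
    n->key = o->key;              /* promote the first overflow entry */
    n->val = o->val;
    n->overflow = o->next;        /* real code would free the chain link */
    return 1;
}
```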

Operations and Complexity

| Operation | Hash (avg) | Hash (worst) | Flat (avg) | Flat (worst) | Tree |
|---|---|---|---|---|---|
| xMapSet | O(1) | O(n) | O(1) | O(n) | O(log n) |
| xMapGet | O(1) | O(n) | O(1) | O(n) | O(log n) |
| xMapDel | O(1) | O(n) | O(1) | O(n) | O(log n) |
| xMapLen | O(1) | O(1) | O(1) | O(1) | O(1) |
| xMapIterate | O(n + cap) | O(n + cap) | O(cap) | O(cap) | O(n) |
| xMapCreate | O(cap) | O(cap) | O(cap) | O(cap) | O(1) |
| xMapDestroy | O(n + cap) | O(n + cap) | O(cap) | O(cap) | O(n) |

Note: For Hash, iteration visits all buckets (including empty ones). For Flat, iteration visits all slots. Tree iteration is a pure in-order traversal visiting only occupied nodes.

API Reference

Types

| Type | Description |
|---|---|
| xMapType | Enum: xMapType_Hash (separate chaining), xMapType_Flat (open addressing), xMapType_Tree (red-black tree) |
| xMap | Opaque handle to a map |
| xMapHashFunc | uint64_t (*)(const void *key) — Returns a 64-bit hash for the given key |
| xMapEqFunc | bool (*)(const void *a, const void *b) — Returns true if two keys are equal |
| xMapIterFunc | bool (*)(const void *key, void *val, void *arg) — Iterator callback; return false to stop early |

Functions

| Function | Signature | Description | Thread Safety |
|---|---|---|---|
| xMapCreate | xMap xMapCreate(xMapType type, size_t cap, xMapHashFunc hash, xMapEqFunc eq) | Create a map with the specified backend. cap = 0 uses default (16). hash and eq are required. | Not thread-safe |
| xMapDestroy | void xMapDestroy(xMap m) | Free the map. Does NOT free user keys/values. NULL is a safe no-op. | Not thread-safe |
| xMapSet | xErrno xMapSet(xMap m, const void *key, void *val) | Insert or update a key-value pair. Returns xErrno_Ok or xErrno_NoMemory. | Not thread-safe |
| xMapGet | void *xMapGet(xMap m, const void *key) | Look up a value by key. Returns NULL if not found. | Not thread-safe |
| xMapDel | void *xMapDel(xMap m, const void *key) | Remove a key-value pair. Returns the removed value, or NULL. | Not thread-safe |
| xMapLen | size_t xMapLen(xMap m) | Return the number of entries. O(1). | Not thread-safe |
| xMapIterate | void xMapIterate(xMap m, xMapIterFunc fn, void *arg) | Iterate over all entries. Callback returns false to stop early. | Not thread-safe |

Built-in Hash / Equality Helpers

| Function | Description |
|---|---|
| xMapStrHash | FNV-1a 64-bit hash for NUL-terminated C strings |
| xMapStrEq | strcmp-based equality for C strings |
| xMapIntHash | Splitmix64 finalizer for integer keys cast to (void *) |
| xMapIntEq | Pointer-value equality for integer keys cast to (void *) |
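For reference, the two algorithms named above in their textbook form (the library's exact code may differ in details such as seeding):

```c
#include <stdint.h>

/* FNV-1a, 64-bit: the scheme behind xMapStrHash. */
static uint64_t fnv1a64(const char *s) {
    uint64_t h = 14695981039346656037ULL;        /* offset basis */
    while (*s) {
        h ^= (uint8_t)*s++;                      /* xor the byte first... */
        h *= 1099511628211ULL;                   /* ...then multiply by the FNV prime */
    }
    return h;
}

/* splitmix64 finalizer: the scheme behind xMapIntHash. Scrambles an
 * integer key so that sequential IDs spread across buckets. */
static uint64_t splitmix64_fin(uint64_t x) {
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}
```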

Usage Examples

String-Keyed Map

#include <stdio.h>
#include <xbase/map.h>

int main(void) {
    xMap m = xMapCreate(xMapType_Hash, 0, xMapStrHash, xMapStrEq);

    xMapSet(m, "alice", (void *)"engineer");
    xMapSet(m, "bob",   (void *)"designer");
    xMapSet(m, "carol", (void *)"manager");

    printf("alice = %s\n", (const char *)xMapGet(m, "alice"));
    printf("bob   = %s\n", (const char *)xMapGet(m, "bob"));

    // Update existing key
    xMapSet(m, "alice", (void *)"senior engineer");
    printf("alice = %s\n", (const char *)xMapGet(m, "alice"));

    // Delete
    xMapDel(m, "bob");
    printf("bob   = %s\n", xMapGet(m, "bob") ? "found" : "not found");
    printf("len   = %zu\n", xMapLen(m));

    xMapDestroy(m);
    return 0;
}

Integer-Keyed Map with Iteration

#include <stdint.h>
#include <stdio.h>
#include <xbase/map.h>

static bool print_entry(const void *key, void *val, void *arg) {
    (void)arg;
    printf("  key=%ld val=%ld\n", (long)(intptr_t)key, (long)(intptr_t)val);
    return true; // continue iteration
}

int main(void) {
    // Use flat map for cache-friendly integer lookups
    xMap m = xMapCreate(xMapType_Flat, 0, xMapIntHash, xMapIntEq);

    for (int i = 1; i <= 10; i++) {
        xMapSet(m, (const void *)(intptr_t)i,
                   (void *)(intptr_t)(i * i));
    }

    printf("Entries (%zu):\n", xMapLen(m));
    xMapIterate(m, print_entry, NULL);

    xMapDestroy(m);
    return 0;
}

Choosing a Backend

#include <xbase/map.h>

void example(void) {
    // General purpose — good default
    xMap hash_map = xMapCreate(xMapType_Hash, 0, xMapStrHash, xMapStrEq);

    // Cache-friendly for small integer keys
    xMap flat_map = xMapCreate(xMapType_Flat, 0, xMapIntHash, xMapIntEq);

    // Ordered iteration, O(log n) worst-case guarantees
    xMap tree_map = xMapCreate(xMapType_Tree, 0, xMapStrHash, xMapStrEq);

    // ... use them identically via xMapSet/xMapGet/xMapDel ...

    xMapDestroy(hash_map);
    xMapDestroy(flat_map);
    xMapDestroy(tree_map);
}

How to Choose a Backend

| Criteria | Hash | Flat | Tree |
|---|---|---|---|
| Average lookup | O(1) ✅ | O(1) ✅ | O(log n) |
| Worst-case lookup | O(n) | O(n) | O(log n) ✅ |
| Cache locality | Poor (pointer chasing) | Excellent ✅ | Poor (pointer chasing) |
| Iteration speed | Visits empty buckets | Visits empty slots | Visits only entries ✅ |
| Ordered iteration | No | No | Yes (by hash) ✅ |
| Resize pauses | Yes (rehash) | Yes (rehash) | No ✅ |
| Memory overhead | Entry nodes + bucket array | Slot array (inline) ✅ | Node + parent/child pointers |
| Deletion | Free entry node | Tombstone marker | RB fixup or overflow promotion |
| Best for | General purpose | Small keys, hot loops | Ordered access, latency-sensitive |

Rule of thumb: Start with xMapType_Hash. Switch to xMapType_Flat if profiling shows cache misses dominate. Use xMapType_Tree when you need ordered iteration or cannot tolerate resize pauses.

Use Cases

  1. Session Management — Store active sessions keyed by session ID (string). The hash backend provides O(1) average lookup for connection dispatch.

  2. Configuration Registry — Map string keys to configuration values. The tree backend provides ordered iteration for serialization.

  3. Object Caches — Cache computed results keyed by integer IDs. The flat backend's cache-friendly layout minimizes latency for hot-path lookups.

  4. Symbol Tables — Compilers and interpreters can use the map to store variable bindings, with string keys and pointer values.

Best Practices

  • Always provide both hash and eq. The map requires both functions; passing NULL for either causes xMapCreate to return NULL.
  • Use the built-in helpers when possible. xMapStrHash/xMapStrEq and xMapIntHash/xMapIntEq are well-tested and optimized.
  • Keys must remain valid while stored. The map stores key pointers, not copies. If you free a key while it's in the map, lookups will read freed memory.
  • Don't modify keys in-place. Changing a key's content after insertion will corrupt the map's internal structure (wrong bucket/slot/tree position).
  • Pre-size when the count is known. Pass a cap hint to xMapCreate to avoid early resizes. For hash and flat backends, capacity should be a power of 2.
  • Prefer xMapType_Hash as the default. It handles the widest range of workloads well. Only switch backends based on profiling data.

Comparison with Other Libraries

| Feature | xbase map.h | C++ std::unordered_map | Go map | GLib GHashTable | uthash |
|---|---|---|---|---|---|
| Language | C99 | C++ | Go | C | C (macros) |
| Key Type | void * (generic) | Template | comparable | gpointer | Struct field |
| Multiple Backends | Hash / Flat / Tree ✅ | Hash only | Hash only | Hash only | Hash only |
| Ordered Iteration | Tree backend ✅ | No (std::map for ordered) | No | No | No |
| Ownership | No (caller owns) | Yes (copies) | Yes (copies) | No | No |
| Thread Safety | Not thread-safe | Not thread-safe | Not thread-safe | Not thread-safe | Not thread-safe |
| Resize Strategy | 2× with rehash | Bucket-based rehash | Incremental | Bucket-based rehash | Bucket-based rehash |
| Intrusive | No | No | No | No | Yes (struct embedding) |

Key Differentiator: xbase's map provides three interchangeable backends behind a single API. Callers can tune the data structure to their workload (cache locality, ordered access, worst-case guarantees) without changing any code beyond the xMapType argument.

Benchmark

Environment: Apple Mac15,7 (12 cores), 36 GB RAM, macOS 26.x, Release build (-O2). Each result is the median of 3 repetitions (--benchmark_min_time=0.5s --benchmark_repetitions=3). Source: xbase/map_bench.cpp

The hash and tree backends allocate nodes through xSlab (see slab.md); the flat backend uses a single contiguous array and does no per-entry allocation.

Set (Insert)

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
| BM_Map_Set_Hash | 64 | 4,879 | 4,879 | 13.1 M items/s |
| BM_Map_Set_Hash | 512 | 9,027 | 9,027 | 56.7 M items/s |
| BM_Map_Set_Hash | 4,096 | 56,781 | 56,779 | 72.1 M items/s |
| BM_Map_Set_Hash | 32,768 | 713,860 | 713,810 | 45.9 M items/s |
| BM_Map_Set_Flat | 64 | 1,061 | 1,062 | 60.2 M items/s |
| BM_Map_Set_Flat | 512 | 5,507 | 5,508 | 93.0 M items/s |
| BM_Map_Set_Flat | 4,096 | 48,033 | 48,036 | 85.3 M items/s |
| BM_Map_Set_Flat | 32,768 | 689,267 | 689,275 | 47.5 M items/s |
| BM_Map_Set_Tree | 64 | 5,265 | 5,268 | 12.1 M items/s |
| BM_Map_Set_Tree | 512 | 11,232 | 11,233 | 45.6 M items/s |
| BM_Map_Set_Tree | 4,096 | 146,120 | 146,120 | 28.0 M items/s |
| BM_Map_Set_Tree | 32,768 | 3,154,728 | 3,154,598 | 10.4 M items/s |

Get (Lookup)

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
| BM_Map_Get_Hash | 64 | 214 | 214 | 298.7 M items/s |
| BM_Map_Get_Hash | 512 | 1,967 | 1,967 | 260.3 M items/s |
| BM_Map_Get_Hash | 4,096 | 20,192 | 20,187 | 202.9 M items/s |
| BM_Map_Get_Hash | 32,768 | 207,804 | 207,791 | 157.7 M items/s |
| BM_Map_Get_Flat | 64 | 243 | 243 | 263.8 M items/s |
| BM_Map_Get_Flat | 512 | 2,276 | 2,276 | 224.9 M items/s |
| BM_Map_Get_Flat | 4,096 | 22,258 | 22,256 | 184.0 M items/s |
| BM_Map_Get_Flat | 32,768 | 256,893 | 256,885 | 127.6 M items/s |
| BM_Map_Get_Tree | 64 | 438 | 438 | 146.1 M items/s |
| BM_Map_Get_Tree | 512 | 4,829 | 4,829 | 106.0 M items/s |
| BM_Map_Get_Tree | 4,096 | 60,687 | 60,687 | 67.5 M items/s |
| BM_Map_Get_Tree | 32,768 | 2,600,910 | 2,600,792 | 12.6 M items/s |

Del (Delete)

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
| BM_Map_Del_Hash | 64 | 1,247 | 1,250 | 51.2 M items/s |
| BM_Map_Del_Hash | 512 | 3,366 | 3,371 | 151.9 M items/s |
| BM_Map_Del_Hash | 4,096 | 23,818 | 23,814 | 172.0 M items/s |
| BM_Map_Del_Hash | 32,768 | 209,060 | 209,018 | 156.8 M items/s |
| BM_Map_Del_Flat | 64 | 1,153 | 1,155 | 55.4 M items/s |
| BM_Map_Del_Flat | 512 | 3,026 | 3,030 | 169.0 M items/s |
| BM_Map_Del_Flat | 4,096 | 21,236 | 21,243 | 192.8 M items/s |
| BM_Map_Del_Flat | 32,768 | 270,593 | 268,020 | 122.3 M items/s |
| BM_Map_Del_Tree | 64 | 1,788 | 1,791 | 35.7 M items/s |
| BM_Map_Del_Tree | 512 | 8,524 | 8,527 | 60.0 M items/s |
| BM_Map_Del_Tree | 4,096 | 146,494 | 145,907 | 28.1 M items/s |
| BM_Map_Del_Tree | 32,768 | 2,672,192 | 2,672,155 | 12.3 M items/s |

Iterate

| Benchmark | N | Time (ns) | CPU (ns) | Throughput |
|---|---|---|---|---|
| BM_Map_Iterate_Hash | 64 | 128 | 128 | 500.2 M items/s |
| BM_Map_Iterate_Hash | 512 | 1,030 | 1,030 | 497.3 M items/s |
| BM_Map_Iterate_Hash | 4,096 | 8,436 | 8,436 | 485.5 M items/s |
| BM_Map_Iterate_Hash | 32,768 | 169,785 | 169,780 | 193.0 M items/s |
| BM_Map_Iterate_Flat | 64 | 120 | 120 | 534.7 M items/s |
| BM_Map_Iterate_Flat | 512 | 973 | 973 | 526.0 M items/s |
| BM_Map_Iterate_Flat | 4,096 | 7,775 | 7,774 | 526.9 M items/s |
| BM_Map_Iterate_Flat | 32,768 | 113,315 | 113,308 | 289.2 M items/s |
| BM_Map_Iterate_Tree | 64 | 154 | 154 | 416.7 M items/s |
| BM_Map_Iterate_Tree | 512 | 1,235 | 1,235 | 414.4 M items/s |
| BM_Map_Iterate_Tree | 4,096 | 10,813 | 10,812 | 378.8 M items/s |
| BM_Map_Iterate_Tree | 32,768 | 178,903 | 178,901 | 183.2 M items/s |

Key Observations:

  • Flat is fastest for small maps. At N≤512, flat's contiguous array layout beats hash on both insert and iterate, and is roughly on par with hash on lookup and delete. It is the right choice when the whole table fits in a few cache lines.
  • Hash scales better at large N. At N=32K, hash sustains 157.7 M lookups/s vs flat's 127.6 M and tree's 12.6 M — separate-chaining avoids the probe-length blowup that hurts flat as load increases.
  • Tree pays for ordering. At N=32K, tree set throughput is 10.4 M items/s (~4.5× slower than flat's 47.5 M). Pick tree only when range scans or predictable worst-case latency matter; its iterate throughput remains strong at small N because the red-black walk stays cache-resident.
  • Iteration is the fastest operation on every backend. Flat peaks at ~535 M items/s (pure sequential scan), hash at ~500 M (bucket hop + chain walk), tree at ~415 M (in-order traversal). Prefer a single xMapIterate pass for bulk scans over repeated xMapGet calls.
  • Large-N drops are real. Between 4K and 32K entries, both flat and hash lose a substantial fraction of peak throughput (roughly 20–60% depending on the operation) — consistent with the working set crossing the L2-to-L3 cache boundary rather than any algorithmic issue.