Build Backend Guide

headerkit ships a PEP 517 build backend that generates bindings automatically during pip install or python -m build. When the .headerkit/ directory is committed to version control, the build works without libclang installed on the target machine.

Overview

The build backend wraps an inner backend (hatchling by default). Before the inner backend packages your project, headerkit reads [tool.headerkit] from pyproject.toml, runs generate_all() for every header listed in [tool.headerkit.headers], and writes the output files. The inner backend then includes those files in the wheel or sdist as usual.

graph LR
    A[pip install / python -m build] --> B[headerkit build backend]
    B --> C[generate_all from cache]
    C --> D[inner backend builds wheel/sdist]

Quick start

1. Add headerkit to your build requirements

[build-system]
requires = ["headerkit", "hatchling"]
build-backend = "headerkit.build_backend"

2. Configure headers and writers

[tool.headerkit]
backend = "libclang"
writers = ["cffi"]

[tool.headerkit.headers."include/mylib.h"]
defines = ["VERSION=2"]
include_dirs = ["/usr/local/include"]

3. Populate the cache

Run headerkit locally on a machine with libclang installed:

headerkit include/mylib.h -w cffi -o cffi:bindings/mylib.cdef.txt

This writes cache entries to .headerkit/.

4. Commit the cache

git add .headerkit/
git commit -m "cache: add headerkit cache"

5. Consumers install without libclang

Anyone who clones your repo (or installs from PyPI) gets bindings generated from cache:

pip install .          # reads from .headerkit/, no libclang needed
python -m build        # same for sdist/wheel builds

Full consumer pyproject.toml example

[build-system]
requires = ["headerkit", "hatchling"]
build-backend = "headerkit.build_backend"

[project]
name = "mylib-bindings"
version = "1.0.0"
requires-python = ">=3.10"

[tool.headerkit]
backend = "libclang"
writers = ["cffi", "ctypes"]
store_dir = ".headerkit"

[tool.headerkit.headers."include/mylib.h"]
defines = ["VERSION=2"]
include_dirs = ["/usr/local/include"]

[tool.headerkit.headers."include/mylib_utils.h"]
defines = ["VERSION=2", "UTILS_ONLY"]

Cross-compilation example

To target a specific architecture explicitly:

[build-system]
requires = ["headerkit", "hatchling"]
build-backend = "headerkit.build_backend"

[project]
name = "mylib-bindings"
version = "1.0.0"
requires-python = ">=3.10"

[tool.headerkit]
backend = "libclang"
writers = ["cffi"]
target = "aarch64-unknown-linux-gnu"

[tool.headerkit.headers."include/mylib.h"]
defines = ["VERSION=2"]
include_dirs = ["/usr/local/include"]

How it works

When pip or build invokes build_wheel() or build_sdist():

  1. The build frontend imports headerkit.build_backend as the PEP 517 backend.
  2. _run_generation() reads [tool.headerkit] from pyproject.toml.
  3. For each entry in [tool.headerkit.headers], it calls generate_all() with the configured backend, writers, defines, and include dirs.
  4. generate_all() checks .headerkit/ first. On a cache hit, it deserializes the stored IR and output without libclang. On a cache miss, it falls back to parsing with libclang.
  5. After generation completes, the inner backend (hatchling by default) runs its normal build_wheel() or build_sdist(), packaging the generated files into the distribution.
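
The steps above can be sketched as a thin PEP 517 hook that generates first and then delegates. This is a minimal illustration, not headerkit's actual implementation: make_build_wheel, generate, and load_backend are hypothetical names introduced here, and only the build_wheel signature comes from PEP 517.

```python
import importlib
from typing import Any, Callable, Optional

# The guide names hatchling as the default inner backend.
DEFAULT_INNER = "hatchling.build"


def make_build_wheel(
    generate: Callable[[], None],
    load_backend: Callable[[str], Any] = importlib.import_module,
) -> Callable[..., str]:
    """Return a PEP 517 build_wheel hook that generates, then delegates.

    `generate` stands in for the generation step (steps 2-4); it is a
    hypothetical callback, not a headerkit function.
    """

    def build_wheel(
        wheel_directory: str,
        config_settings: Optional[dict] = None,
        metadata_directory: Optional[str] = None,
    ) -> str:
        settings = config_settings or {}
        inner_name = settings.get("inner-backend", DEFAULT_INNER)
        # Write generated files before the inner backend collects sources.
        generate()
        # Step 5: hand over to the inner backend's own build_wheel hook.
        inner = load_backend(inner_name)
        return inner.build_wheel(wheel_directory, config_settings, metadata_directory)

    return build_wheel
```

Injecting load_backend keeps the sketch testable without hatchling installed; a real backend would import the module directly.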

Cross-compilation

By default, headerkit detects the target from the running Python interpreter (HOST_GNU_TYPE on POSIX, sysconfig.get_platform() on Windows). This gives the right answer for native builds and for cibuildwheel, which builds each architecture under emulation or with a per-architecture Python.

For explicit cross-compilation, set target in [tool.headerkit]:

[tool.headerkit]
backend = "libclang"
writers = ["cffi"]
target = "aarch64-unknown-linux-gnu"

Or set the HEADERKIT_TARGET environment variable in CI:

HEADERKIT_TARGET=aarch64-unknown-linux-gnu pip install .

When using cibuildwheel, auto-detection works without extra configuration. cibuildwheel runs each arch's build with the matching Python interpreter, so HOST_GNU_TYPE already reflects the correct target.
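
The detection and override rules in this section can be approximated with the standard library. The sketch below is an assumption-laden illustration: detect_native_target and resolve_target are hypothetical helpers, and the precedence order (config_settings, then HEADERKIT_TARGET, then pyproject.toml, then auto-detection) is a guess that this guide does not confirm.

```python
import os
import sysconfig
from typing import Mapping, Optional


def detect_native_target() -> str:
    """Best-effort target of the running interpreter."""
    # HOST_GNU_TYPE is typically set on POSIX builds of CPython and
    # missing on Windows, where get_platform() is used instead.
    host = sysconfig.get_config_var("HOST_GNU_TYPE")
    return host or sysconfig.get_platform()


def resolve_target(
    config_settings: Optional[Mapping[str, str]] = None,
    pyproject_target: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Pick a target triple from the override sources, else auto-detect."""
    # ASSUMED precedence: config_settings > env var > pyproject > auto-detect.
    settings = config_settings or {}
    return (
        settings.get("target")
        or environ.get("HEADERKIT_TARGET")
        or pyproject_target
        or detect_native_target()
    )
```

For example, resolve_target(None, "aarch64-unknown-linux-gnu", {}) returns the pyproject.toml value, while resolve_target(None, None, {}) falls back to whatever the interpreter reports.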

Configuration reference

Build system table

| Key | Description |
| --- | --- |
| build-backend | Set to "headerkit.build_backend" |
| requires | Must include "headerkit" and the inner backend (e.g., "hatchling") |

[tool.headerkit] keys

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| backend | string | "libclang" | Parser backend name |
| writers | list of strings | all registered | Writers to run for each header |
| include_dirs | list of strings | [] | Global include directories applied to all headers |
| defines | list of strings | [] | Global preprocessor defines applied to all headers |
| target | string | auto-detect | LLVM target triple for cross-compilation (e.g., aarch64-unknown-linux-gnu) |
| store_dir | string | ".headerkit" | Directory for cache storage |

[tool.headerkit.headers."path/to/header.h"] keys

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| defines | list of strings | [] | Per-header defines (merged with global defines) |
| include_dirs | list of strings | [] | Per-header include dirs (merged with global include dirs) |

[tool.headerkit.cache] keys

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| no_cache | bool | false | Disable all caching |
| no_ir_cache | bool | false | Disable IR cache only |
| no_output_cache | bool | false | Disable output cache only |

config_settings keys

Pass these via pip install --config-settings or python -m build -C:

| Key | Description |
| --- | --- |
| inner-backend | Override the inner backend module (default: hatchling.build) |
| no-cache | Set to "true" to disable all caching for this build |
| no-ir-cache | Set to "true" to disable IR cache for this build |
| no-output-cache | Set to "true" to disable output cache for this build |
| target | Override target triple for this build (e.g., aarch64-unknown-linux-gnu) |

Example:

pip install . --config-settings="no-cache=true"
pip install . --config-settings="inner-backend=flit_core.buildapi"
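
Build frontends pass config_settings to the backend as a dictionary of strings, so boolean options such as no-cache arrive as text. The following sketch shows one way the cache flags might be interpreted; parse_cache_flags is a hypothetical helper, and the case-insensitive comparison is an assumption rather than documented headerkit behavior.

```python
from typing import Dict, Mapping, Optional

# Flag names as listed in the config_settings table.
CACHE_FLAGS = ("no-cache", "no-ir-cache", "no-output-cache")


def parse_cache_flags(config_settings: Optional[Mapping[str, str]]) -> Dict[str, bool]:
    """Convert string-valued cache flags into booleans."""
    settings = config_settings or {}
    flags = {}
    for key in CACHE_FLAGS:
        raw = str(settings.get(key, "false"))
        # Only the literal "true" enables a flag; anything else is False.
        flags[key.replace("-", "_")] = raw.strip().lower() == "true"
    return flags
```

With pip install . --config-settings="no-cache=true", the backend would receive {"no-cache": "true"}, which this helper maps to no_cache=True and leaves the other two flags False.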

Overriding the inner backend

By default headerkit delegates to hatchling.build. To use a different inner backend:

  1. Add the inner backend to requires in [build-system].
  2. Pass inner-backend via config_settings:
python -m build -C inner-backend=flit_core.buildapi

Or set it permanently by adding both to your build requires:

[build-system]
requires = ["headerkit", "flit_core"]
build-backend = "headerkit.build_backend"

Cache miss behavior

When a header's cache entry is missing and libclang is not installed:

  • Wheel builds (build_wheel, build_editable): the build fails with a LibclangUnavailableError. Install libclang and re-run headerkit to populate the cache, then commit .headerkit/.
  • Sdist builds (build_sdist): generation failures are logged as warnings and the build continues. This allows sdist creation on machines without libclang, as long as the sdist consumer has libclang or a populated cache.
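
The difference between the two cases can be sketched as an error policy keyed on the PEP 517 hook name. This is an illustration only: run_generation is a hypothetical function, and the LibclangUnavailableError class below is a local stand-in for the error named above, not an import from headerkit.

```python
import logging
from typing import Callable

log = logging.getLogger("headerkit_sketch")


class LibclangUnavailableError(RuntimeError):
    """Local stand-in for the error named in this guide."""


def run_generation(generate: Callable[[], None], hook: str) -> bool:
    """Run generation; return True if bindings were produced.

    For build_sdist, a failure is logged and the build continues.
    For any other hook (build_wheel, build_editable), it propagates.
    """
    try:
        generate()
        return True
    except LibclangUnavailableError as exc:
        if hook == "build_sdist":
            log.warning("binding generation skipped: %s", exc)
            return False
        raise
```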

To fix a cache miss:

# Install libclang
headerkit install-libclang

# Re-generate and populate cache
headerkit include/mylib.h -w cffi -o cffi:bindings/mylib.cdef.txt

# Commit updated cache
git add .headerkit/
git commit -m "cache: update headerkit cache"

Multiple headers

Configure multiple headers with per-header defines and include dirs:

[tool.headerkit]
backend = "libclang"
writers = ["cffi", "ctypes"]
defines = ["SHARED_DEFINE"]
include_dirs = ["include/common"]

[tool.headerkit.headers."include/core.h"]
defines = ["CORE_API"]
include_dirs = ["include/core"]

[tool.headerkit.headers."include/utils.h"]
defines = ["UTILS_API", "DEBUG"]
include_dirs = ["include/utils"]

[tool.headerkit.headers."include/platform.h"]
# Uses only global defines and include_dirs

Each header's defines and include dirs are merged with the global values. In this example, include/core.h is parsed with defines ["SHARED_DEFINE", "CORE_API"] and include dirs ["include/common", "include/core"].
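
The merge rule can be written out in a few lines. The sketch below works on a plain dictionary shaped like the parsed [tool.headerkit] table; merged_settings is a hypothetical helper, and placing global values before per-header values follows the ordering shown in the example above.

```python
# Dictionary mirroring what a TOML parser would return for the
# [tool.headerkit] table in the example above (writers omitted).
HEADERKIT_CONFIG = {
    "defines": ["SHARED_DEFINE"],
    "include_dirs": ["include/common"],
    "headers": {
        "include/core.h": {"defines": ["CORE_API"], "include_dirs": ["include/core"]},
        "include/utils.h": {"defines": ["UTILS_API", "DEBUG"], "include_dirs": ["include/utils"]},
        "include/platform.h": {},  # global values only
    },
}


def merged_settings(hk: dict) -> dict:
    """Combine global and per-header defines and include dirs."""
    result = {}
    for path, per_header in hk.get("headers", {}).items():
        result[path] = {
            # Global entries come first, per-header entries are appended.
            "defines": hk.get("defines", []) + per_header.get("defines", []),
            "include_dirs": hk.get("include_dirs", []) + per_header.get("include_dirs", []),
        }
    return result


merged = merged_settings(HEADERKIT_CONFIG)
print(merged["include/core.h"])
# {'defines': ['SHARED_DEFINE', 'CORE_API'], 'include_dirs': ['include/common', 'include/core']}
```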

Populating the cache for all platforms

When your project targets multiple platforms via cibuildwheel, the cache needs entries for each target. The cache populate command generates these using Docker:

# Auto-detect platforms from cibuildwheel config
headerkit cache populate include/mylib.h -w cffi --cibuildwheel

# Or specify platforms explicitly
headerkit cache populate include/mylib.h -w cffi \
    --platform linux/amd64 --platform linux/arm64

The recommended workflow:

  1. Run headerkit cache populate --cibuildwheel on a machine with Docker.
  2. Docker generates cache entries for all target Linux platforms.
  3. Commit .headerkit/ to version control.
  4. CI builds use the cache; no libclang needed on any platform.

For macOS and Windows targets, run headerkit cache populate natively on those platforms or use platform-specific CI jobs to generate their cache entries.

See the Cache Strategy Guide for Docker setup, configuration, and limitations.

Troubleshooting

Stale cache

If bindings are outdated after modifying a header:

# Clear the cache and regenerate
headerkit cache clear --store-dir .headerkit
headerkit include/mylib.h -w cffi -o cffi:bindings/mylib.cdef.txt
git add .headerkit/
git commit -m "cache: regenerate after header changes"

The cache key includes header content, defines, and include dirs. If any of these change, the old entry becomes a miss and a new entry is created. Old entries remain until explicitly cleared.
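
To see why an edited header produces a miss rather than overwriting the old entry, consider a content-addressed key. The sketch below is an assumption about the general technique, not headerkit's actual key format: cache_key is a hypothetical function, and the choice of SHA-256 and JSON encoding is made here purely for illustration.

```python
import hashlib
import json
from typing import List


def cache_key(header_bytes: bytes, defines: List[str], include_dirs: List[str]) -> str:
    """Derive a key from the inputs the guide says affect the cache."""
    digest = hashlib.sha256()
    digest.update(header_bytes)
    # Separator so header content cannot run into the options payload.
    digest.update(b"\0")
    options = {"defines": defines, "include_dirs": include_dirs}
    digest.update(json.dumps(options, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


old = cache_key(b"int f(void);\n", ["VERSION=2"], ["/usr/local/include"])
new = cache_key(b"int f(int);\n", ["VERSION=2"], ["/usr/local/include"])
print(old != new)  # True: the edited header maps to a different entry
```

Because the old key is never reused, its entry stays on disk until the cache is cleared, which matches the behavior described above.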

libclang not found

ImportError: libclang not found

Install libclang with headerkit install-libclang or see the installation guide for platform-specific instructions.

Wrong inner backend

ModuleNotFoundError: No module named 'hatchling'

Add the inner backend to requires in [build-system]:

[build-system]
requires = ["headerkit", "hatchling"]

Build fails but cache exists

Verify the cache is committed and present in the build environment:

ls .headerkit/ir/
ls .headerkit/output/

If files are missing, the .headerkit/ directory may not be included in the sdist. Check your inner backend's include/exclude configuration to ensure .headerkit/ is packaged.
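
The same check can be scripted, for example as a pre-build step in CI. This is a small sketch using only the standard library; missing_cache_dirs is a hypothetical helper, and the ir and output subdirectory names are taken from the ls commands above.

```python
from pathlib import Path
from typing import List


def missing_cache_dirs(project_root: str, store_dir: str = ".headerkit") -> List[str]:
    """List cache subdirectories that are absent or empty."""
    root = Path(project_root) / store_dir
    missing = []
    for sub in ("ir", "output"):
        directory = root / sub
        # Treat an empty directory the same as a missing one.
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append((Path(store_dir) / sub).as_posix())
    return missing
```

An empty return value means both directories exist and contain at least one entry; it does not prove that the entries match the current headers.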

CI validation

Verify the committed cache matches current sources:

headerkit include/mylib.h -w cffi -o cffi:bindings/mylib.cdef.txt
git diff --exit-code .headerkit/ bindings/

A non-empty diff means the cache is stale and must be regenerated.