IR Types

The Intermediate Representation (IR) is the core data model of headerkit. Parser backends produce IR objects; writers consume them to generate output in various formats.

All IR types are Python dataclasses defined in the headerkit.ir module.

Container

The top-level object returned by all parser backends.

Header dataclass

Header(path, declarations=list(), included_headers=set())

Container for a parsed C/C++ header file.

This is the top-level result returned by all parser backends. It contains the file path and all extracted declarations.

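Example::

from headerkit.backends import get_backend
from headerkit.ir import Function

# Source text to parse (the original snippet left `code` undefined).
code = "int add(int a, int b);"

backend = get_backend()
header = backend.parse(code, "myheader.h")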

print(f"Parsed {len(header.declarations)} declarations from {header.path}")

for decl in header.declarations:
    if isinstance(decl, Function):
        print(f"  Function: {decl.name}")

Parameters:

Name Type Description Default
path str

Path to the original header file.

required
declarations list[Declaration]

List of extracted declarations (structs, functions, etc.).

list()
included_headers set[str]

Set of header file basenames included by this header (populated by libclang backend only).

set()

Type Expressions

Type expressions form a recursive tree structure representing C type syntax. For example, const char** becomes Pointer(Pointer(CType("char", ["const"]))).
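To make the recursive structure concrete, here is a minimal sketch of a tree walk that renders a type expression back to C-like text. The `CType` and `Pointer` classes below are simplified stand-ins that mirror the constructor signatures documented on this page, so the snippet runs without headerkit installed. `render_type` is a hypothetical helper and not part of headerkit.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


# Simplified stand-ins mirroring the signatures documented on this page.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: Union[CType, "Pointer"]
    qualifiers: list[str] = field(default_factory=list)


def render_type(t: Union[CType, Pointer]) -> str:
    """Hypothetical helper: render a type expression as C-like text."""
    if isinstance(t, CType):
        # Base case: qualifiers are written before the type name.
        return " ".join([*t.qualifiers, t.name])
    # Recursive case: render the pointee, then append '*' and any
    # qualifiers that apply to the pointer itself.
    text = render_type(t.pointee) + "*"
    if t.qualifiers:
        text += " " + " ".join(t.qualifiers)
    return text


tree = Pointer(Pointer(CType("char", ["const"])))
print(render_type(tree))  # const char**
```

Each `Pointer` node adds one level of indirection around whatever its `pointee` renders to, which is why the walk needs only one base case and one recursive case.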

CType dataclass

CType(name, qualifiers=list())

A C type expression representing a base type with optional qualifiers.

This is the fundamental building block for all type representations. Qualifiers like const, volatile, unsigned are stored separately from the type name for easier manipulation.

Parameters:

Name Type Description Default
name str

The base type name (e.g., "int", "long", "char").

required
qualifiers list[str]

Type qualifiers (e.g., ["const"], ["unsigned"]).

list()

Examples

Simple types::

int_type = CType("int")
unsigned_long = CType("long", ["unsigned"])
const_int = CType("int", ["const"])

Composite types with pointers::

from headerkit.ir import Pointer

# const char*
const_char_ptr = Pointer(CType("char", ["const"]))

Pointer dataclass

Pointer(pointee, qualifiers=list())

Pointer to another type.

Represents pointer types with optional qualifiers. Pointers can be nested to represent multi-level indirection (e.g., char**).

Parameters:

Name Type Description Default
pointee TypeExpr

The type being pointed to.

required
qualifiers list[str]

Qualifiers on the pointer itself (e.g., ["const"] for a const pointer, not a pointer to const).

list()

Examples

Basic pointer::

int_ptr = Pointer(CType("int"))  # int*

Pointer to const::

const_char_ptr = Pointer(CType("char", ["const"]))  # const char*

Double pointer::

char_ptr_ptr = Pointer(Pointer(CType("char")))  # char**

Const pointer (pointer itself is const)::

const_ptr = Pointer(CType("int"), ["const"])  # int* const

Array dataclass

Array(element_type, size=None)

Fixed-size or flexible array type.

Represents C array types, which can have a fixed numeric size, a symbolic size (macro or constant), or be flexible (incomplete).

Parameters:

Name Type Description Default
element_type TypeExpr

The type of array elements.

required
size Union[int, str] | None

Array size - an integer for fixed size, a string for symbolic/expression size (e.g., "MAX_SIZE"), or None for flexible/incomplete arrays.

None

Examples

Fixed-size array::

int_arr = Array(CType("int"), 10)

Flexible array (incomplete)::

flex_arr = Array(CType("char"), None)

Symbolic size::

buf = Array(CType("char"), "BUFFER_SIZE")

Multi-dimensional array::

matrix = Array(Array(CType("int"), 3), 3)
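Because multi-dimensional arrays are nested `Array` nodes, consumers usually need to unwind the nesting. The sketch below collects the dimensions of a nested array. It uses simplified stand-in classes rather than the real headerkit types, `array_dims` is a hypothetical helper, and it assumes the outermost `Array` corresponds to the first C dimension.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


# Simplified stand-ins mirroring the signatures documented on this page.
@dataclass
class CType:
    name: str


@dataclass
class Array:
    element_type: Union[CType, "Array"]
    size: Union[int, str, None] = None


def array_dims(t):
    """Hypothetical helper: return (dimensions, innermost element type).

    Dimensions are listed outermost first. Each entry is an int, a
    symbolic string, or None for a flexible array.
    """
    dims = []
    while isinstance(t, Array):
        dims.append(t.size)
        t = t.element_type
    return dims, t


# Assumed to correspond to `int grid[3][4]`.
grid = Array(Array(CType("int"), 4), 3)
dims, elem = array_dims(grid)
print(dims, elem.name)  # [3, 4] int
```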

FunctionPointer dataclass

FunctionPointer(return_type, parameters=list(), is_variadic=False, calling_convention=None)

Function pointer type.

Represents a pointer to a function with a specific signature. Used for callbacks, vtables, and function tables.

Parameters:

Name Type Description Default
return_type TypeExpr

The function's return type.

required
parameters list[Parameter]

List of function parameters.

list()
is_variadic bool

True if the function accepts variable arguments (ends with ...).

False
calling_convention str | None

The calling convention if non-default (e.g., "stdcall", "cdecl", "fastcall"). None for the platform default calling convention.

None

Examples

Simple function pointer::

void_fn = FunctionPointer(CType("int"), [])  # int (*)(void)

With parameters::

callback = FunctionPointer(
    CType("void"),
    [Parameter("data", Pointer(CType("void")))]
)  # void (*)(void* data)

Variadic function pointer::

printf_fn = FunctionPointer(
    CType("int"),
    [Parameter("fmt", Pointer(CType("char", ["const"])))],
    is_variadic=True
)  # int (*)(const char* fmt, ...)

Declarations

Declaration types represent the top-level constructs found in C/C++ headers.

Enum dataclass

Enum(name, values=list(), is_typedef=False, location=None)

Enumeration declaration.

Represents a C enum type with named constants. Enums may be named or anonymous (used in typedefs or inline).

Parameters:

Name Type Description Default
name str | None

The enum tag name, or None for anonymous enums.

required
values list[EnumValue]

List of enumeration constants.

list()
is_typedef bool

True if this enum came from a typedef declaration.

False
location SourceLocation | None

Source location for error reporting.

None

Examples

Named enum::

color = Enum("Color", [
    EnumValue("RED", 0),
    EnumValue("GREEN", 1),
    EnumValue("BLUE", 2),
])

Anonymous enum (typically used with typedef)::

anon = Enum(None, [EnumValue("FLAG_A", 1), EnumValue("FLAG_B", 2)])

EnumValue dataclass

EnumValue(name, value=None)

Single enumeration constant.

Represents one named constant within an enum definition.

Parameters:

Name Type Description Default
name str

The constant name.

required
value Union[int, str] | None

The constant's value - an integer for explicit values, a string for expressions (e.g., "FOO | BAR"), or None for auto-incremented values.

None

Examples

Explicit value::

red = EnumValue("RED", 0)

Auto-increment (implicit value)::

green = EnumValue("GREEN", None)  # follows previous value

Expression value::

mask = EnumValue("MASK", "FLAG_A | FLAG_B")
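The three kinds of `value` (integer, expression string, `None`) can be handled as in the following sketch, which applies C's auto-increment rule to a list of constants. `EnumValue` here is a stand-in with the documented signature, and `resolve_values` is a hypothetical helper. It leaves string expressions unresolved rather than attempting to evaluate them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


# Stand-in mirroring the signature documented on this page.
@dataclass
class EnumValue:
    name: str
    value: Union[int, str, None] = None


def resolve_values(values):
    """Hypothetical helper: map constant names to integers.

    Explicit ints are kept, None means "previous value + 1" (starting
    at 0), and string expressions resolve to None. A constant that
    follows an unresolved expression is also left as None.
    """
    resolved = {}
    next_value = 0  # the first implicit enumerator in C is 0
    for v in values:
        if isinstance(v.value, int):
            current = v.value
        elif v.value is None and next_value is not None:
            current = next_value
        else:
            current = None
        resolved[v.name] = current
        next_value = None if current is None else current + 1
    return resolved


values = [
    EnumValue("A"),
    EnumValue("B", 5),
    EnumValue("C"),
    EnumValue("D", "A | B"),
    EnumValue("E"),
]
print(resolve_values(values))
# {'A': 0, 'B': 5, 'C': 6, 'D': None, 'E': None}
```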

Struct dataclass

Struct(name, fields=list(), methods=list(), is_union=False, is_cppclass=False, is_typedef=False, is_packed=False, namespace=None, template_params=list(), cpp_name=None, notes=list(), inner_typedefs=dict(), location=None)

Struct or union declaration.

Represents a C struct or union type definition. Both use the same IR class with is_union distinguishing between them.

Parameters:

Name Type Description Default
name str | None

The struct/union tag name, or None for anonymous types.

required
fields list[Field]

List of member fields.

list()
methods list[Function]

List of methods (for C++ classes only).

list()
is_union bool

True for unions, False for structs.

False
is_cppclass bool

True for C++ classes (uses cppclass in Cython).

False
is_typedef bool

True if this came from a typedef declaration.

False
is_packed bool

True if the struct has __attribute__((packed)), which disables padding and alignment. Affects memory layout.

False
location SourceLocation | None

Source location for error reporting.

None

Examples

Simple struct::

point = Struct("Point", [
    Field("x", CType("int")),
    Field("y", CType("int")),
])

Union::

data = Struct("Data", [
    Field("i", CType("int")),
    Field("f", CType("float")),
], is_union=True)

C++ class with method::

widget = Struct("Widget", [
    Field("width", CType("int")),
], methods=[
    Function("resize", CType("void"), [
        Parameter("w", CType("int")),
        Parameter("h", CType("int")),
    ])
], is_cppclass=True)

Anonymous struct::

anon = Struct(None, [Field("value", CType("int"))])

Field dataclass

Field(name, type, bit_width=None, anonymous_struct=None)

Struct or union field declaration.

Represents a single field within a struct or union definition.

Parameters:

Name Type Description Default
name str

The field name.

required
type TypeExpr

The field's type expression.

required
bit_width int | None

C bitfield width in bits, or None for non-bitfield fields. For example, uint32_t flags : 4 has bit_width=4.

None
anonymous_struct Struct | None

When this field is an anonymous nested struct or union, holds the Struct IR node for the anonymous type. None for regular fields.

None

Examples

Simple field::

x_field = Field("x", CType("int"))  # int x

Pointer field::

data = Field("data", Pointer(CType("void")))  # void* data

Array field::

buffer = Field("buffer", Array(CType("char"), 256))  # char buffer[256]

Bitfield::

flags = Field("flags", CType("uint32_t"), bit_width=4)  # uint32_t flags : 4

Anonymous nested struct::

inner = Struct(None, [Field("x", CType("int"))], is_union=False)
field = Field("pos", CType("void"), anonymous_struct=inner)
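As a small illustration of how a writer might consume `bit_width`, the sketch below renders a field as a C member declaration. The classes are stand-ins trimmed to the fields this example reads (`anonymous_struct` is omitted), `render_field` is a hypothetical helper, and only plain `CType` fields are handled.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


# Stand-ins trimmed to what this example needs.
@dataclass
class CType:
    name: str


@dataclass
class Field:
    name: str
    type: CType
    bit_width: Optional[int] = None


def render_field(f: Field) -> str:
    """Hypothetical helper: render a field as a C member declaration."""
    decl = f"{f.type.name} {f.name}"
    if f.bit_width is not None:
        # Bitfields carry their width after a colon.
        decl += f" : {f.bit_width}"
    return decl + ";"


print(render_field(Field("flags", CType("uint32_t"), bit_width=4)))
# uint32_t flags : 4;
```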

Function dataclass

Function(name, return_type, parameters=list(), is_variadic=False, calling_convention=None, namespace=None, location=None)

Function declaration.

Represents a C function prototype or declaration. Does not include the function body (declarations only).

Parameters:

Name Type Description Default
name str

The function name.

required
return_type TypeExpr

The function's return type.

required
parameters list[Parameter]

List of function parameters.

list()
is_variadic bool

True if the function accepts variable arguments.

False
calling_convention str | None

The calling convention if non-default (e.g., "stdcall", "cdecl", "fastcall"). None for the platform default calling convention.

None
location SourceLocation | None

Source location for error reporting.

None

Examples

Simple function::

exit_fn = Function("exit", CType("void"), [
    Parameter("status", CType("int"))
])

With return value::

strlen_fn = Function("strlen", CType("size_t"), [
    Parameter("s", Pointer(CType("char", ["const"])))
])

Variadic function::

printf_fn = Function(
    "printf",
    CType("int"),
    [Parameter("fmt", Pointer(CType("char", ["const"])))],
    is_variadic=True
)
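To show how the pieces of a `Function` fit together, here is a sketch that renders a declaration back to a C prototype. All classes are stand-ins trimmed to the fields used here (no `calling_convention`, `namespace`, or `location`), and `render_type` and `render_prototype` are hypothetical helpers that ignore pointer qualifiers for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


# Stand-ins trimmed to what this example needs.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: object


@dataclass
class Parameter:
    name: Optional[str]
    type: object


@dataclass
class Function:
    name: str
    return_type: object
    parameters: list[Parameter] = field(default_factory=list)
    is_variadic: bool = False


def render_type(t) -> str:
    if isinstance(t, Pointer):
        return render_type(t.pointee) + "*"
    return " ".join([*t.qualifiers, t.name])


def render_prototype(fn: Function) -> str:
    """Hypothetical helper: render a Function as a C prototype."""
    params = []
    for p in fn.parameters:
        text = render_type(p.type)
        if p.name is not None:  # anonymous parameters have no name
            text += " " + p.name
        params.append(text)
    if fn.is_variadic:
        params.append("...")
    # An empty C parameter list is spelled "(void)".
    return f"{render_type(fn.return_type)} {fn.name}({', '.join(params) or 'void'});"


printf_fn = Function(
    "printf",
    CType("int"),
    [Parameter("fmt", Pointer(CType("char", ["const"])))],
    is_variadic=True,
)
print(render_prototype(printf_fn))  # int printf(const char* fmt, ...);
```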

Parameter dataclass

Parameter(name, type)

Function parameter declaration.

Represents a single parameter in a function signature. Parameters may be named or anonymous (common in prototypes).

Parameters:

Name Type Description Default
name str | None

Parameter name, or None for anonymous parameters.

required
type TypeExpr

The parameter's type expression.

required

Examples

Named parameter::

x_param = Parameter("x", CType("int"))  # int x

Anonymous parameter::

anon = Parameter(None, Pointer(CType("void")))  # void*

Complex type::

callback = Parameter("fn", FunctionPointer(CType("void"), []))

Typedef dataclass

Typedef(name, underlying_type, location=None)

Type alias declaration.

Represents a C typedef that creates an alias for another type. Common patterns include aliasing primitives, struct tags, and function pointer types.

Parameters:

Name Type Description Default
name str

The new type name being defined.

required
underlying_type TypeExpr

The type being aliased.

required
location SourceLocation | None

Source location for error reporting.

None

Examples

Simple alias::

size_t = Typedef("size_t", CType("long", ["unsigned"]))

Struct typedef::

point_t = Typedef("Point", CType("struct Point"))

Function pointer typedef::

callback_t = Typedef("Callback", FunctionPointer(
    CType("void"),
    [Parameter("data", Pointer(CType("void")))]
))
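Since a typedef can alias another typedef, tools that consume the IR often need to follow the chain to the final type. The following sketch does that with a name-to-`Typedef` dictionary. The classes are stand-ins (`location` is omitted), `resolve` is a hypothetical helper, and qualifiers on intermediate aliases are dropped for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Stand-ins trimmed to what this example needs.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Typedef:
    name: str
    underlying_type: object


def resolve(t, typedefs):
    """Hypothetical helper: follow typedef aliases until a non-alias type.

    `typedefs` maps alias names to Typedef nodes. The `seen` set guards
    against malformed, self-referential input.
    """
    seen = set()
    while isinstance(t, CType) and t.name in typedefs and t.name not in seen:
        seen.add(t.name)
        t = typedefs[t.name].underlying_type
    return t


typedefs = {
    td.name: td
    for td in [
        Typedef("u32", CType("int", ["unsigned"])),
        Typedef("handle_t", CType("u32")),
    ]
}
print(resolve(CType("handle_t"), typedefs))
# CType(name='int', qualifiers=['unsigned'])
```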

Variable dataclass

Variable(name, type, location=None)

Global variable declaration.

Represents a global or extern variable declaration. Does not include local variables (which are not exposed in header files).

Parameters:

Name Type Description Default
name str

The variable name.

required
type TypeExpr

The variable's type.

required
location SourceLocation | None

Source location for error reporting.

None

Examples

Extern variable::

errno_var = Variable("errno", CType("int"))

Const string::

version = Variable("version", Pointer(CType("char", ["const"])))

Array variable::

lookup_table = Variable("table", Array(CType("int"), 256))

Constant dataclass

Constant(name, value=None, type=None, is_macro=False, location=None)

Compile-time constant declaration.

Represents #define macros with constant values or const variable declarations. Only backends that support macro extraction (e.g., libclang) can populate macro constants.

Parameters:

Name Type Description Default
name str

The constant name.

required
value Union[int, float, str] | None

The constant's value - an integer, float, or string expression. None if the value cannot be determined.

None
type CType | None

For typed constants (const int), the C type. None for macros.

None
is_macro bool

True if this is a #define macro, False for const declarations.

False
location SourceLocation | None

Source location for error reporting.

None

Examples

Numeric macro::

size = Constant("SIZE", 100, is_macro=True)

Expression macro::

mask = Constant("MASK", "1 << 4", is_macro=True)

Typed const::

max_val = Constant("MAX_VALUE", 255, type=CType("int"))

String macro::

version = Constant("VERSION", '"1.0.0"', is_macro=True)
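Because `value` may be a number, an expression string, or `None`, a writer typically has to decide which constants it can emit directly. The sketch below keeps only constants whose value is already numeric. `Constant` and `CType` are stand-ins (`location` is omitted), and `numeric_constants` is a hypothetical helper.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


# Stand-ins trimmed to what this example needs.
@dataclass
class CType:
    name: str


@dataclass
class Constant:
    name: str
    value: Union[int, float, str, None] = None
    type: Optional[CType] = None
    is_macro: bool = False


def numeric_constants(constants):
    """Hypothetical helper: keep constants whose value is an int or float."""
    return {
        c.name: c.value
        for c in constants
        if isinstance(c.value, (int, float))
    }


constants = [
    Constant("SIZE", 100, is_macro=True),
    Constant("MASK", "1 << 4", is_macro=True),  # expression: skipped
    Constant("MAX_VALUE", 255, type=CType("int")),
    Constant("UNKNOWN"),  # value could not be determined: skipped
]
print(numeric_constants(constants))  # {'SIZE': 100, 'MAX_VALUE': 255}
```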

Union Types

These are typing.Union aliases used in type annotations throughout headerkit.

Declaration

Declaration = Union[Enum, Struct, Function, Typedef, Variable, Constant]

Any top-level declaration that can appear in a Header.
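Code that walks `Header.declarations` usually dispatches on the concrete class behind this union. A minimal sketch of that pattern follows. The classes are heavily trimmed stand-ins that keep only the fields read here (the real `Function`, `Typedef`, and `Variable` require more arguments), and `describe` is a hypothetical helper.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


# Heavily trimmed stand-ins; see the sections above for the full signatures.
@dataclass
class Struct:
    name: Optional[str]
    is_union: bool = False


@dataclass
class Function:
    name: str


@dataclass
class Typedef:
    name: str


@dataclass
class Variable:
    name: str


def describe(decl) -> str:
    """Hypothetical helper: one-line summary of a declaration."""
    if isinstance(decl, Struct):
        # Structs and unions share one IR class; is_union tells them apart.
        kind = "union" if decl.is_union else "struct"
        return f"{kind} {decl.name or '<anonymous>'}"
    if isinstance(decl, Function):
        return f"function {decl.name}"
    if isinstance(decl, Typedef):
        return f"typedef {decl.name}"
    # Fallback for the remaining declaration kinds.
    return f"{type(decl).__name__.lower()} {decl.name}"


declarations = [
    Struct("Point"),
    Struct(None, is_union=True),
    Function("strlen"),
    Variable("errno"),
]
for decl in declarations:
    print(describe(decl))
```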

TypeExpr

TypeExpr = Union[CType, Pointer, Array, FunctionPointer]

Any type expression that can appear in a declaration's type fields.

Source Location

SourceLocation dataclass

SourceLocation(file, line, column=None)

Location in source file for error reporting and filtering.

Used to track where declarations originated, enabling:

  • Better error messages during parsing
  • Filtering declarations by file (e.g., exclude system headers)
  • Source mapping for debugging

Example::

loc = SourceLocation("myheader.h", 42, 5)
print(f"Declaration at {loc.file}:{loc.line}")

Parameters:

Name Type Description Default
file str

Path to the source file.

required
line int

Line number (1-indexed).

required
column int | None

Column number (1-indexed), or None if unknown.

None
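The file-based filtering mentioned above could look like the following sketch, which drops declarations whose location falls under a system include directory. `SourceLocation` mirrors the documented signature, `Variable` is a trimmed stand-in, and both `exclude_system` and the `/usr/include/` prefix are assumptions made for illustration. Declarations without a location are kept.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceLocation:  # stand-in mirroring the documented signature
    file: str
    line: int
    column: Optional[int] = None


@dataclass
class Variable:  # trimmed stand-in: the documented `type` field is omitted
    name: str
    location: Optional[SourceLocation] = None


def exclude_system(declarations, system_prefixes=("/usr/include/",)):
    """Hypothetical helper: drop declarations located in system headers."""
    kept = []
    for decl in declarations:
        loc = decl.location
        if loc is not None and loc.file.startswith(tuple(system_prefixes)):
            continue  # declared in a system header
        kept.append(decl)
    return kept


declarations = [
    Variable("errno", SourceLocation("/usr/include/errno.h", 10)),
    Variable("version", SourceLocation("myheader.h", 42, 5)),
    Variable("unknown"),  # no location recorded
]
print([d.name for d in exclude_system(declarations)])
# ['version', 'unknown']
```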

Parser Backend Protocol

The parser backend protocol is defined alongside the IR types since backends produce IR directly. See also the Backends page for registry functions.

ParserBackend

Bases: Protocol

Protocol defining the interface for parser backends.

All parser backends must implement this protocol to be usable with headerkit. Backends are responsible for translating from their native AST format (pycparser, libclang, etc.) to the common Header IR format.

Available Backends

  • libclang - LLVM clang-based parser with C++ support

Example::

from headerkit.backends import get_backend

# Get default backend
backend = get_backend()

# Get specific backend
libclang = get_backend("libclang")

# Parse code
header = backend.parse("int foo(void);", "test.h")

name property

name

Human-readable name of this backend (e.g., "pycparser").

supports_macros property

supports_macros

Whether this backend can extract #define constants.

supports_cpp property

supports_cpp

Whether this backend can parse C++ code.

parse

parse(code, filename, include_dirs=None, extra_args=None, *, use_default_includes=True, recursive_includes=True, max_depth=10, project_prefixes=None)

Parse C/C++ code and return the IR representation.

Parameters:

Name Type Description Default
code str

Source code to parse.

required
filename str

Name of the source file. Used for error messages and #line directives. Does not need to exist on disk.

required
include_dirs list[str] | None

Directories to search for #include files. Only used by backends that handle preprocessing.

None
extra_args list[str] | None

Additional arguments for the preprocessor/compiler. Format is backend-specific.

None
use_default_includes bool

If True, add system include directories.

True
recursive_includes bool

If True, detect umbrella headers and recursively parse included project headers.

True
max_depth int

Maximum recursion depth for include processing.

10
project_prefixes tuple[str, ...] | None

Path prefixes to treat as project headers.

None

Returns:

Type Description
Header

Parsed header containing all extracted declarations.

Raises:

Type Description
RuntimeError

If parsing fails due to syntax errors.
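To illustrate what implementing the protocol involves, the sketch below defines a stand-in `ParserBackend` protocol with the members documented in this section and a stub class that satisfies it. Everything here is an assumption made for illustration: `NullBackend` is hypothetical, the stand-in protocol is marked `runtime_checkable` only so the check can be demonstrated (whether headerkit's own protocol is runtime-checkable is not stated here), and registering a backend with headerkit is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable


@dataclass
class Header:  # stand-in mirroring the documented signature
    path: str
    declarations: list = field(default_factory=list)
    included_headers: set = field(default_factory=set)


@runtime_checkable
class ParserBackend(Protocol):  # stand-in with the documented members
    @property
    def name(self) -> str: ...

    @property
    def supports_macros(self) -> bool: ...

    @property
    def supports_cpp(self) -> bool: ...

    def parse(
        self,
        code: str,
        filename: str,
        include_dirs: Optional[list[str]] = None,
        extra_args: Optional[list[str]] = None,
        *,
        use_default_includes: bool = True,
        recursive_includes: bool = True,
        max_depth: int = 10,
        project_prefixes: Optional[tuple[str, ...]] = None,
    ) -> Header: ...


class NullBackend:
    """Hypothetical backend that extracts nothing; it only shows the surface."""

    name = "null"
    supports_macros = False
    supports_cpp = False

    def parse(self, code, filename, include_dirs=None, extra_args=None, *,
              use_default_includes=True, recursive_includes=True,
              max_depth=10, project_prefixes=None):
        # A real backend would translate its native AST into IR here.
        return Header(filename)


backend = NullBackend()
header = backend.parse("int foo(void);", "test.h")
print(isinstance(backend, ParserBackend), header.path, len(header.declarations))
# True test.h 0
```

The stub satisfies the protocol structurally, without inheriting from it, which is the point of using a `Protocol` rather than a base class.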