IR Types
The Intermediate Representation (IR) is the core data model of headerkit. Parser backends produce IR objects; writers consume them to generate output in various formats.
All IR types are Python dataclasses defined in the headerkit.ir module.
Container
The top-level object returned by all parser backends.
Header (dataclass)
Container for a parsed C/C++ header file.
This is the top-level result returned by all parser backends. It contains the file path and all extracted declarations.
Example::

    from headerkit.backends import get_backend
    from headerkit.ir import Struct, Function

    # Sample input; any C source text works here.
    code = "int add(int a, int b);"

    backend = get_backend()
    header = backend.parse(code, "myheader.h")
    print(f"Parsed {len(header.declarations)} declarations from {header.path}")
    for decl in header.declarations:
        if isinstance(decl, Function):
            print(f"  Function: {decl.name}")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the original header file. | required |
| declarations | list[Declaration] | List of extracted declarations (structs, functions, etc.). | list() |
| included_headers | set[str] | Set of header file basenames included by this header (populated by libclang backend only). | set() |
Type Expressions
Type expressions form a recursive tree structure representing C type syntax.
For example, const char** becomes Pointer(Pointer(CType("char", ["const"]))).
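As a rough illustration of that recursive tree, the sketch below builds the const char** example and walks it back into C syntax. The CType and Pointer classes here are stand-ins that only mirror the documented fields (real code would import them from headerkit.ir), and render is a hypothetical helper, not part of headerkit.

```python
from dataclasses import dataclass, field
from typing import Union


# Stand-ins that mirror the documented fields; real code would import
# CType and Pointer from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


def render(t) -> str:
    """Hypothetical helper: turn a type tree back into C syntax."""
    if isinstance(t, Pointer):
        # Recurse into the pointee first, then append this pointer level.
        suffix = "".join(f" {q}" for q in t.qualifiers)
        return f"{render(t.pointee)}*{suffix}"
    return " ".join([*t.qualifiers, t.name])


const_char_pp = Pointer(Pointer(CType("char", ["const"])))
print(render(const_char_pp))  # const char**
```

Each Pointer wraps exactly one inner type, so the nesting depth of the tree equals the number of asterisks in the C declaration.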
CType (dataclass)
A C type expression representing a base type with optional qualifiers.
This is the fundamental building block for all type representations.
Qualifiers like const, volatile, unsigned are stored separately
from the type name for easier manipulation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The base type name (e.g., …). | required |
| qualifiers | list[str] | Type qualifiers (e.g., …). | list() |
Examples: simple types; composite types with pointers.
Pointer (dataclass)
Pointer to another type.
Represents pointer types with optional qualifiers. Pointers can be
nested to represent multi-level indirection (e.g., char**).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pointee | TypeExpr | The type being pointed to. | required |
| qualifiers | list[str] | Qualifiers on the pointer itself (e.g., …). | list() |
Examples: basic pointer; pointer to const; double pointer; const pointer (pointer itself is const).
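The difference between a pointer to const data and a const pointer comes down to which node carries the qualifier. The sketch below covers a basic pointer, a pointer to const, a double pointer, and a const pointer. It uses stand-in classes that mirror the documented fields, and depth and base are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Union


# Stand-ins that mirror the documented fields; real code would import
# CType and Pointer from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


def depth(t) -> int:
    """Hypothetical helper: number of pointer levels above the base type."""
    return 1 + depth(t.pointee) if isinstance(t, Pointer) else 0


def base(t) -> CType:
    """Hypothetical helper: the innermost CType of a pointer chain."""
    return base(t.pointee) if isinstance(t, Pointer) else t


cases = {
    "int *p": Pointer(CType("int")),
    "const char *s": Pointer(CType("char", ["const"])),  # data is const
    "char **argv": Pointer(Pointer(CType("char"))),
    "char *const p": Pointer(CType("char"), ["const"]),  # pointer is const
}

for c_decl, t in cases.items():
    print(
        f"{c_decl:15} depth={depth(t)} "
        f"pointer_const={'const' in t.qualifiers} "
        f"data_const={'const' in base(t).qualifiers}"
    )
```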
Array (dataclass)
Fixed-size or flexible array type.
Represents C array types, which can have a fixed numeric size, a symbolic size (macro or constant), or be flexible (incomplete).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| element_type | TypeExpr | The type of array elements. | required |
| size | Union[int, str] \| None | Array size: an integer for fixed size, a string for symbolic/expression size (e.g., …). | None |
Examples: fixed-size array; flexible array (incomplete); symbolic size; multi-dimensional array.
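A minimal sketch of the three size forms plus a nested array follows. The classes are stand-ins for the documented fields, MAX_LEN is a made-up macro name, dims is a hypothetical helper, and the nesting order for the two-dimensional case (outermost dimension on the outside) is an assumption about how a backend would build the tree.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; real code would import
# CType and Array from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Array:
    element_type: "Union[CType, Array]"
    size: Optional[Union[int, str]] = None


def dims(t) -> list:
    """Hypothetical helper: sizes from the outermost dimension inward."""
    out = []
    while isinstance(t, Array):
        out.append(t.size)
        t = t.element_type
    return out


fixed = Array(CType("int"), 16)             # int buf[16]
flexible = Array(CType("char"))             # char data[]
symbolic = Array(CType("char"), "MAX_LEN")  # char name[MAX_LEN] (made-up macro)
matrix = Array(Array(CType("int"), 4), 3)   # int m[3][4] (assumed nesting order)

for label, arr in [("fixed", fixed), ("flexible", flexible),
                   ("symbolic", symbolic), ("matrix", matrix)]:
    print(label, dims(arr))
```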
FunctionPointer (dataclass)
Function pointer type.
Represents a pointer to a function with a specific signature. Used for callbacks, vtables, and function tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| return_type | TypeExpr | The function's return type. | required |
| parameters | list[Parameter] | List of function parameters. | list() |
| is_variadic | bool | True if the function accepts variable arguments (ends with …). | False |
| calling_convention | str \| None | The calling convention if non-default (e.g., …). | None |
Examples: simple function pointer; with parameters; variadic function pointer.
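The following sketch shows how a callback signature might be assembled from these fields, covering a parameterless pointer, a comparator with two parameters, and a variadic logger. All classes are stand-ins mirroring the documented fields, and type_str and signature are hypothetical helpers rather than headerkit functions.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; real code would import
# these from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Parameter:
    name: Optional[str]
    type: "Union[CType, Pointer]"


@dataclass
class FunctionPointer:
    return_type: "Union[CType, Pointer]"
    parameters: list[Parameter] = field(default_factory=list)
    is_variadic: bool = False
    calling_convention: Optional[str] = None


def type_str(t) -> str:
    """Hypothetical helper: render a CType or Pointer as C text."""
    if isinstance(t, Pointer):
        return type_str(t.pointee) + "*"
    return " ".join([*t.qualifiers, t.name])


def signature(fp: FunctionPointer) -> str:
    """Hypothetical helper: render an unnamed function pointer type."""
    parts = [type_str(p.type) for p in fp.parameters]
    if fp.is_variadic:
        parts.append("...")
    return f"{type_str(fp.return_type)} (*)({', '.join(parts) or 'void'})"


simple = FunctionPointer(CType("void"))
compare = FunctionPointer(
    CType("int"),
    [
        Parameter("a", Pointer(CType("void", ["const"]))),
        Parameter("b", Pointer(CType("void", ["const"]))),
    ],
)
logger = FunctionPointer(
    CType("int"),
    [Parameter("fmt", Pointer(CType("char", ["const"])))],
    is_variadic=True,
)

for fp in (simple, compare, logger):
    print(signature(fp))
```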
Declarations
Declaration types represent the top-level constructs found in C/C++ headers.
Enum (dataclass)
Enumeration declaration.
Represents a C enum type with named constants. Enums may be named or anonymous (used in typedefs or inline).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str \| None | The enum tag name, or None for anonymous enums. | required |
| values | list[EnumValue] | List of enumeration constants. | list() |
| is_typedef | bool | True if this enum came from a typedef declaration. | False |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: named enum; anonymous enum (typically used with typedef).
EnumValue (dataclass)
Single enumeration constant.
Represents one named constant within an enum definition.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The constant name. | required |
| value | Union[int, str] \| None | The constant's value: an integer for explicit values, a string for expressions (e.g., …). | None |
Examples: explicit value; auto-increment (implicit value); expression value.
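Because implicit values are stored as None, a consumer that needs concrete numbers has to apply C's numbering rule itself (the first constant is 0, each implicit constant is the previous one plus 1). The sketch below does that with stand-in classes that mirror the documented fields. The resolve helper is hypothetical, and it deliberately leaves expression strings unresolved instead of trying to evaluate them.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; real code would import
# Enum and EnumValue from headerkit.ir instead.
@dataclass
class EnumValue:
    name: str
    value: Optional[Union[int, str]] = None


@dataclass
class Enum:
    name: Optional[str]
    values: list[EnumValue] = field(default_factory=list)
    is_typedef: bool = False


def resolve(enum: Enum) -> dict:
    """Hypothetical helper: map constant names to integers where possible.

    Expression strings, and implicit values that follow them, map to None.
    """
    resolved = {}
    prev = -1  # so that a leading implicit value becomes 0
    for ev in enum.values:
        if isinstance(ev.value, int):
            cur = ev.value                              # explicit value
        elif ev.value is None:
            cur = None if prev is None else prev + 1    # auto-increment
        else:
            cur = None                                  # expression string
        resolved[ev.name] = cur
        prev = cur
    return resolved


color = Enum(
    "Color",
    [
        EnumValue("RED", 0),
        EnumValue("GREEN"),
        EnumValue("BLUE", 10),
        EnumValue("ALPHA"),
        EnumValue("MASK", "1 << 4"),
        EnumValue("AFTER_MASK"),
    ],
)
print(resolve(color))
```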
Struct (dataclass)
Struct(name, fields=list(), methods=list(), is_union=False, is_cppclass=False, is_typedef=False, is_packed=False, namespace=None, template_params=list(), cpp_name=None, notes=list(), inner_typedefs=dict(), location=None)
Struct or union declaration.
Represents a C struct or union type definition. Both use the same
IR class with is_union distinguishing between them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str \| None | The struct/union tag name, or None for anonymous types. | required |
| fields | list[Field] | List of member fields. | list() |
| methods | list[Function] | List of methods (for C++ classes only). | list() |
| is_union | bool | True for unions, False for structs. | False |
| is_cppclass | bool | True for C++ classes (uses …). | False |
| is_typedef | bool | True if this came from a typedef declaration. | False |
| is_packed | bool | True if the struct has …. | False |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: simple struct; union; C++ class with method; anonymous struct.
Field (dataclass)
Struct or union field declaration.
Represents a single field within a struct or union definition.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The field name. | required |
| type | TypeExpr | The field's type expression. | required |
| bit_width | int \| None | C bitfield width in bits, or None for non-bitfield fields. For example, …. | None |
| anonymous_struct | Struct \| None | When this field is an anonymous nested struct or union, holds the …. | None |
Examples: simple field; pointer field; array field; bitfield; anonymous nested struct.
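To make the struct/union split and the bitfield width concrete, here is a small sketch with a struct that has two bitfields and a pointer field, plus a union. The classes are reduced stand-ins covering only some of the documented fields, and keyword and bitfield_bits are hypothetical helpers. Anonymous nested structs are left out because the excerpt does not say what name such a field carries.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Reduced stand-ins for the documented classes; real code would import
# them from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Field:
    name: str
    type: "Union[CType, Pointer]"
    bit_width: Optional[int] = None


@dataclass
class Struct:
    name: Optional[str]
    fields: list[Field] = field(default_factory=list)
    is_union: bool = False


def keyword(s: Struct) -> str:
    """Hypothetical helper: the C keyword this declaration would use."""
    return "union" if s.is_union else "struct"


def bitfield_bits(s: Struct) -> int:
    """Hypothetical helper: total declared width of all bitfields."""
    return sum(f.bit_width for f in s.fields if f.bit_width is not None)


# struct Packet { unsigned int flags : 4; unsigned int kind : 4;
#                 const char *payload; };
packet = Struct(
    "Packet",
    [
        Field("flags", CType("int", ["unsigned"]), bit_width=4),
        Field("kind", CType("int", ["unsigned"]), bit_width=4),
        Field("payload", Pointer(CType("char", ["const"]))),
    ],
)

# union Value { int i; float f; };
value = Struct(
    "Value",
    [Field("i", CType("int")), Field("f", CType("float"))],
    is_union=True,
)

for s in (packet, value):
    print(keyword(s), s.name, [f.name for f in s.fields], bitfield_bits(s))
```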
Function (dataclass)
Function(name, return_type, parameters=list(), is_variadic=False, calling_convention=None, namespace=None, location=None)
Function declaration.
Represents a C function prototype or declaration. Does not include the function body (declarations only).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The function name. | required |
| return_type | TypeExpr | The function's return type. | required |
| parameters | list[Parameter] | List of function parameters. | list() |
| is_variadic | bool | True if the function accepts variable arguments. | False |
| calling_convention | str \| None | The calling convention if non-default (e.g., …). | None |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: simple function; with return value; variadic function.
Parameter (dataclass)
Function parameter declaration.
Represents a single parameter in a function signature. Parameters may be named or anonymous (common in prototypes).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str \| None | Parameter name, or None for anonymous parameters. | required |
| type | TypeExpr | The parameter's type expression. | required |
Examples: named parameter; anonymous parameter; complex type.
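Function and Parameter together carry enough information to re-emit a prototype. The sketch below does so for named, anonymous, and variadic parameter lists. The classes are reduced stand-ins for the documented fields, and prototype and type_str are hypothetical helpers, not headerkit writers.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Reduced stand-ins for the documented classes; real code would import
# them from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Parameter:
    name: Optional[str]
    type: "Union[CType, Pointer]"


@dataclass
class Function:
    name: str
    return_type: "Union[CType, Pointer]"
    parameters: list[Parameter] = field(default_factory=list)
    is_variadic: bool = False


def type_str(t) -> str:
    """Hypothetical helper: render a CType or Pointer as C text."""
    if isinstance(t, Pointer):
        return type_str(t.pointee) + "*"
    return " ".join([*t.qualifiers, t.name])


def prototype(fn: Function) -> str:
    """Hypothetical helper: render a declaration as a C prototype."""
    parts = []
    for p in fn.parameters:
        text = type_str(p.type)
        # Anonymous parameters (name is None) render as the bare type.
        parts.append(f"{text} {p.name}" if p.name else text)
    if fn.is_variadic:
        parts.append("...")
    return f"{type_str(fn.return_type)} {fn.name}({', '.join(parts) or 'void'});"


init = Function("init", CType("void"))
add = Function("add", CType("int"),
               [Parameter("a", CType("int")), Parameter("b", CType("int"))])
sqrt = Function("sqrt", CType("double"), [Parameter(None, CType("double"))])
printf = Function("printf", CType("int"),
                  [Parameter("fmt", Pointer(CType("char", ["const"])))],
                  is_variadic=True)

for fn in (init, add, sqrt, printf):
    print(prototype(fn))
```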
Typedef (dataclass)
Type alias declaration.
Represents a C typedef that creates an alias for another type. Common patterns include aliasing primitives, struct tags, and function pointer types.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The new type name being defined. | required |
| underlying_type | TypeExpr | The type being aliased. | required |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: simple alias; struct typedef; function pointer typedef.
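One common task with typedefs is following a chain of aliases down to the type it ultimately names. The sketch below assumes that a reference to another alias shows up as a CType whose name matches an earlier Typedef; that is an assumption about backend output, not something this page states. The classes are stand-ins and resolve is a hypothetical helper that ignores qualifiers on intermediate aliases for brevity.

```python
from dataclasses import dataclass, field
from typing import Union


# Stand-ins that mirror the documented fields; real code would import
# them from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "Union[CType, Pointer]"
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Typedef:
    name: str
    underlying_type: "Union[CType, Pointer]"


def resolve(t, table: dict, max_hops: int = 32):
    """Hypothetical helper: follow alias names until a non-alias type remains."""
    for _ in range(max_hops):
        if isinstance(t, CType) and t.name in table:
            t = table[t.name].underlying_type
        else:
            return t
    raise ValueError("typedef chain too long (possible cycle)")


typedefs = [
    Typedef("u32", CType("int", ["unsigned"])),         # typedef unsigned int u32;
    Typedef("handle_t", CType("u32")),                  # typedef u32 handle_t;
    Typedef("name_t", Pointer(CType("char", ["const"]))),  # typedef const char *name_t;
]
table = {td.name: td for td in typedefs}

print(resolve(CType("handle_t"), table))
print(resolve(CType("name_t"), table))
```

The hop limit is there only to keep a malformed, self-referencing table from looping forever.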
Variable (dataclass)
Global variable declaration.
Represents a global or extern variable declaration. Does not include local variables (which are not exposed in header files).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The variable name. | required |
| type | TypeExpr | The variable's type. | required |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: extern variable; const string; array variable.
Constant (dataclass)
Compile-time constant declaration.
Represents #define macros with constant values or const
variable declarations. Only backends that support macro extraction
(e.g., libclang) can populate macro constants.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The constant name. | required |
| value | Union[int, float, str] \| None | The constant's value: an integer, float, or string expression. None if the value cannot be determined. | None |
| type | CType \| None | For typed constants (…). | None |
| is_macro | bool | True if this is a …. | False |
| location | SourceLocation \| None | Source location for error reporting. | None |
Examples: numeric macro; expression macro; typed const; string macro.
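Since value can be a number, an expression string, or None, a writer usually has to branch on it. The sketch below turns a few constants into Python assignment lines and comments out anything that is not a plain number. The classes are stand-ins for the documented fields, the constant names are made up, to_python is a hypothetical helper, and representing const int as CType("int", ["const"]) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; real code would import
# them from headerkit.ir instead.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Constant:
    name: str
    value: Optional[Union[int, float, str]] = None
    type: Optional[CType] = None
    is_macro: bool = False


def to_python(constants: list) -> list:
    """Hypothetical helper: emit one line of Python per constant."""
    lines = []
    for c in constants:
        if isinstance(c.value, (int, float)):
            lines.append(f"{c.name} = {c.value!r}")
        else:
            # Expression strings and unknown values are not evaluated here.
            lines.append(f"# {c.name}: not a plain number ({c.value!r})")
    return lines


constants = [
    Constant("MAX_SIZE", 1024, is_macro=True),                # #define MAX_SIZE 1024
    Constant("FLAGS", "(FLAG_A | FLAG_B)", is_macro=True),    # expression macro
    Constant("LIMIT", 8, type=CType("int", ["const"])),       # const int LIMIT = 8;
    Constant("UNKNOWN", None, is_macro=True),                 # value not determined
]

print("\n".join(to_python(constants)))
```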
Union Types
These are typing.Union aliases used in type annotations throughout headerkit.
Declaration
Any top-level declaration that can appear in a Header.
TypeExpr
Any type expression that can appear in a declaration's type fields.
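Consumers of a union alias typically dispatch on the concrete class with isinstance. The sketch below assumes Declaration covers the six declaration classes described on this page (Enum, Struct, Function, Typedef, Variable, Constant); the actual membership is not listed here, so treat that as a guess. The classes are name-only stand-ins and summarize is a hypothetical helper.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional, Union


# Name-only stand-ins; the real classes in headerkit.ir carry more fields.
@dataclass
class Enum:
    name: Optional[str]


@dataclass
class Struct:
    name: Optional[str]


@dataclass
class Function:
    name: str


@dataclass
class Typedef:
    name: str


@dataclass
class Variable:
    name: str


@dataclass
class Constant:
    name: str


# Assumed membership of the alias; not confirmed by this page.
Declaration = Union[Enum, Struct, Function, Typedef, Variable, Constant]


@dataclass
class Header:
    path: str
    declarations: list[Declaration] = field(default_factory=list)


def summarize(header: Header) -> Counter:
    """Hypothetical helper: count declarations by concrete class name."""
    return Counter(type(d).__name__ for d in header.declarations)


header = Header(
    "demo.h",
    [Struct("Point"), Function("move"), Function("draw"), Constant("MAX")],
)

print(summarize(header))
function_names = [d.name for d in header.declarations if isinstance(d, Function)]
print(function_names)
```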
Source Location
SourceLocation (dataclass)
Location in source file for error reporting and filtering.
Used to track where declarations originated, enabling:
- Better error messages during parsing
- Filtering declarations by file (e.g., exclude system headers)
- Source mapping for debugging
Example::

    from headerkit.ir import SourceLocation

    loc = SourceLocation("myheader.h", 42, 5)
    print(f"Declaration at {loc.file}:{loc.line}")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | str | Path to the source file. | required |
| line | int | Line number (1-indexed). | required |
| column | int \| None | Column number (1-indexed), or None if unknown. | None |
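Filtering by file, such as excluding system headers, could look roughly like the sketch below. The classes are stand-ins for the documented fields, the /usr/include prefix is only an illustrative choice, and from_project is a hypothetical helper that keeps declarations with no recorded location.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins that mirror the documented fields; real code would import
# SourceLocation and Function from headerkit.ir instead.
@dataclass
class SourceLocation:
    file: str
    line: int
    column: Optional[int] = None


@dataclass
class Function:
    name: str
    location: Optional[SourceLocation] = None


def from_project(decls, system_prefixes=("/usr/include",)):
    """Hypothetical helper: drop declarations located under a system prefix."""
    kept = []
    for d in decls:
        loc = d.location
        # str.startswith accepts a tuple of prefixes.
        if loc is not None and loc.file.startswith(system_prefixes):
            continue
        kept.append(d)
    return kept


decls = [
    Function("my_init", SourceLocation("myheader.h", 12, 1)),
    Function("printf", SourceLocation("/usr/include/stdio.h", 332)),
    Function("mystery"),  # no location recorded
]

print([d.name for d in from_project(decls)])
```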
Parser Backend Protocol
The parser backend protocol is defined alongside the IR types since backends produce IR directly. See also the Backends page for registry functions.
ParserBackend
Bases: Protocol
Protocol defining the interface for parser backends.
All parser backends must implement this protocol to be usable with headerkit.
Backends are responsible for translating from their native AST format
(pycparser, libclang, etc.) to the common Header IR format.
Available Backends
- libclang: LLVM clang-based parser with C++ support
Example::

    from headerkit.backends import get_backend

    # Get default backend
    backend = get_backend()

    # Get specific backend
    libclang = get_backend("libclang")

    # Parse code
    header = backend.parse("int foo(void);", "test.h")
parse
parse(code, filename, include_dirs=None, extra_args=None, *, use_default_includes=True, recursive_includes=True, max_depth=10, project_prefixes=None)
Parse C/C++ code and return the IR representation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| code | str | Source code to parse. | required |
| filename | str | Name of the source file. Used for error messages and …. | required |
| include_dirs | list[str] \| None | Directories to search for …. | None |
| extra_args | list[str] \| None | Additional arguments for the preprocessor/compiler. Format is backend-specific. | None |
| use_default_includes | bool | If True, add system include directories. | True |
| recursive_includes | bool | If True, detect umbrella headers and recursively parse included project headers. | True |
| max_depth | int | Maximum recursion depth for include processing. | 10 |
| project_prefixes | tuple[str, ...] \| None | Path prefixes to treat as project headers. | None |
Returns:
| Type | Description |
|---|---|
| Header | Parsed header containing all extracted declarations. |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If parsing fails due to syntax errors. |
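Because ParserBackend is a Protocol, a class only needs a matching parse method to qualify; it does not have to inherit from anything. The sketch below restates the documented signature on a stand-in protocol and pairs it with a toy backend that extracts nothing. Header here is a stand-in, EmptyBackend is hypothetical, and marking the protocol runtime_checkable is an addition made for the demonstration that the real protocol may not have.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable


# Stand-in for headerkit.ir.Header with the documented fields.
@dataclass
class Header:
    path: str
    declarations: list = field(default_factory=list)
    included_headers: set = field(default_factory=set)


# Stand-in protocol restating the documented parse signature.
# runtime_checkable is added here only so isinstance works below.
@runtime_checkable
class ParserBackend(Protocol):
    def parse(
        self,
        code: str,
        filename: str,
        include_dirs: Optional[list] = None,
        extra_args: Optional[list] = None,
        *,
        use_default_includes: bool = True,
        recursive_includes: bool = True,
        max_depth: int = 10,
        project_prefixes: Optional[tuple] = None,
    ) -> Header: ...


class EmptyBackend:
    """Hypothetical toy backend: accepts any input and extracts nothing."""

    def parse(
        self,
        code,
        filename,
        include_dirs=None,
        extra_args=None,
        *,
        use_default_includes=True,
        recursive_includes=True,
        max_depth=10,
        project_prefixes=None,
    ):
        return Header(path=filename)


backend: ParserBackend = EmptyBackend()
header = backend.parse("int foo(void);", "test.h", max_depth=2)
print(isinstance(backend, ParserBackend), header.path, len(header.declarations))
```

Note that a runtime isinstance check against a protocol only confirms that a parse attribute exists; it does not verify the parameter list, so a static type checker remains the better guard.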