7.5.3. Accessing the IR
You can access the information stored in the IR by using the object-oriented IR API. It can be used after importing the module bauhaus.ir (see Listing Importing the ir part of bauhaus).
from bauhaus import ir
There are two basic types in the object-oriented IR API: Graph and Node. Both are implemented using Python classes and have methods through which you can access the information stored in the IR. Additionally, there are three further types that will be explained as well: SLocs, Tokens and Bitfields.
7.5.3.1. Graph
Obtaining a Graph
class Graph(arg)
Constructor of a Python IR Graph object. If arg is a string then the filename given by the string is assumed to be an IR file and loaded. An exception is raised if the file is not a valid IR file. If arg is another IR Graph object then a flat copy of the IR graph is generated. In both cases the constructor returns an IR Graph object.
Access to the Graph’s Nodes
The root method returns an IR Node object representing the root node of the part given by its argument irpart.
The instance method nodes_of_type takes as input an argument classes, a type object representing
the IR class, and yields a set containing IR Node
objects of all IR nodes of the given class, including its subclasses, from within the
Graph.
classes can be a single type object or a union of type objects, e.g., ir.Logical.Type | ir.Logical.Routine.
(For backwards compatibility, classes can also be an arbitrary iterable container of type objects.)
The optional keyword-only argument predicate, a boolean function
that takes a Node as argument, can be used as a filter: only nodes for which predicate
returns True are yielded.
Note: The nodes_of_type method has an additional legacy overload that takes as input an integer IR part and a string classname. In this overload, classname can also be an iterable of class-name strings belonging to the given part.
Similarly all_nodes returns all nodes of the specified IR part which satisfy the given predicate.
The method get_node returns a specific IR node by its number.
The method no_node returns a NoNode of the given node_class, see NoNodes (null references).
Global Properties of a Graph
Basepath
The field Basepath returns the basepath of the graph.
Fields Version and Revision return version information for the IR file.
These fields provide information about the source language of the analyzed project. In mixed C and C++ setups, this will return C++.
Is_Library
The field Is_Library returns whether the IR file was linked as library.
Example Usage
The following script loads an IR file, accesses the root node of the physical part of the IR, and retrieves all nodes of class “Namespace” from the logical part of the IR:
from bauhaus import ir
my_graph = ir.Graph("my_file.ir")
the_root = my_graph.root(ir.Physical)
namespace_nodes = my_graph.nodes_of_type(ir.Logical.Namespace)
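The keyword-only predicate argument described above can be combined with nodes_of_type as in the following minimal sketch. The helper name named_nodes is hypothetical, and the sketch assumes only what the text states: that nodes_of_type accepts a type object (or union) plus predicate=... and that nodes expose a Name field.

```python
from typing import Any, List


def named_nodes(graph: Any, classes: Any) -> List[Any]:
    """Return the nodes of the given class(es) that have a non-empty Name.

    `named_nodes` is a hypothetical helper, not part of the IR API.
    """
    # `predicate` is keyword-only; it is called with each candidate node
    # and only nodes for which it returns True are kept.
    return list(graph.nodes_of_type(classes, predicate=lambda n: bool(n.Name)))


# Intended use (requires the bauhaus package and an IR file):
#   from bauhaus import ir
#   graph = ir.Graph("my_file.ir")
#   hits = named_nodes(graph, ir.Logical.Type | ir.Logical.Routine)
```

Passing the filter as predicate keeps the selection logic next to the query instead of in a separate loop.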
7.5.3.2. Node
Information on a Node
By starting from the root IR node or by querying IR nodes of a specific type from the IR graph you can examine and analyze the contents of an IR graph. The IR node class has the following methods:
The instance method type returns the string representation of the IR class type of the node.
The method part returns the IR part containing the node, the method number returns the number of the node.
The instance method is_of_type is a predicate that tests whether the node is of the
given type (or a subtype thereof). The argument typename can be a string that
specifies the class type to check or it can be an iterable of strings.
For new code, we encourage using a Python instance check instead, e.g.,
isinstance(node, ir.Logical.Namespace) [1].
The instance method graph returns the graph the node belongs to.
Accessing Fields
The instance method fields returns a list containing the names of all fields of the node.
The instance method field accesses a specific field of an IR node and returns the field’s content.
The field name is supplied as fieldname and the return type of the method depends on the field type. For node references you get an IR node object, for boolean fields you get a boolean, for integer fields you get an integer, for list fields you get a Python tuple of IR node objects, for set fields you get a frozenset of IR node objects, for SLocs you get a SLoc object, and for Tokens you get a Token object. In all other cases you get a string representation of the field value. If the return value is None, then the field of the IR node was not set.
Fields can also be directly accessed as Python attributes: you can access a field fieldname of a node n by simply writing n.fieldname.
As an example, to output the names of all routines in an IR graph you could use the following code:
# Copyright (C) 2025 Axivion GmbH
# Copyright (C) 2025 The Qt Company GmbH, a subsidiary of The Qt Group
# https://www.qt.io
from bauhaus import ir

my_graph = ir.Graph('my_file.ir')
for node in my_graph.nodes_of_type(ir.Logical.Routine):
    value = node.Name
    if value:
        print(value)
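The generic fields and field methods can be used together to inspect a node without knowing its class in advance. The following is a minimal sketch under the assumptions stated above (fields returns the field names, field returns None for unset fields). The helper name set_fields is hypothetical.

```python
from typing import Any, Dict


def set_fields(node: Any) -> Dict[str, Any]:
    """Map each field name of `node` to its value, skipping unset fields.

    `set_fields` is a hypothetical helper, not part of the IR API.
    """
    result = {}
    for name in node.fields():
        value = node.field(name)
        # Per the description above, None means the field was not set.
        # Compare against None explicitly so that False and 0 are kept.
        if value is not None:
            result[name] = value
    return result


# Intended use (requires the bauhaus package and an IR file):
#   from bauhaus import ir
#   graph = ir.Graph("my_file.ir")
#   print(set_fields(graph.root(ir.Physical)))
```

The explicit comparison with None matters because a boolean field that is set to False would otherwise be dropped.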
The available fields are listed (by class) in chapters Logical and Physical of IR Interfaces and Classes. Additionally, the following computed fields are available for nodes in ir.Logical:
Declarations
Returns the non-defining declaration nodes in ir.Physical, if any, corresponding to the node.
Definitions
Returns the defining declaration nodes in ir.Physical, if any, corresponding to the node.
Decls_And_Defs
Returns all declaration nodes in ir.Physical, if any, corresponding to the node.
Full_Name
Returns the fully-qualified name of the node.
For nodes in ir.Physical there are the following additional, computed fields:
SLoc, End_SLoc
Returns an instance of type SLoc (cf. SLoc) which provides information about the source position of a node. Only nodes implementing interface End_Information provide the field End_SLoc which provides information about the end of the source range covered by the node.
Token
Activates the scanner to provide the token at the position mentioned in field SLoc.
Artificial
Returns True for compiler-generated nodes (e.g., implicit conversions).
Parenthesized
Returns True for expressions enclosed in parentheses.
These fields are only available for nodes implementing interface Operation_Container (see IR Interfaces and Classes). They provide access to the immediately preceding / following node in the same container (e.g., for a statement, the previous / next statement).
For nodes in both ir.Logical and ir.Physical there is a field Name to access a symbol’s name where applicable. The name can be empty, and an empty string will also be returned for nodes that do not represent named things (e.g., expressions).
Checking Field Types
The predicate is_simple returns True if the given field stores exactly one value, is_list respectively is_set return True if the field stores a list respectively a set of values.
A field f of a node n can also store references to other nodes: they represent edges from n to the child node(s) stored in f. The predicate is_syntactic evaluates to True if the edges given by the field are syntactic, and is_semantic evaluates to True if they are semantic. If the edges connect nodes in different parts of the IR (i. e. they are cross edges), the predicate is_cross returns True. The predicate is_attribute returns True if the given field stores a value that is not a node reference.
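The predicates above can be combined into a short summary of a field. This is a sketch only: it assumes that each predicate takes the field name as its single argument, which the text implies but does not show, and the helper name describe_field is hypothetical.

```python
from typing import Any


def describe_field(node: Any, fieldname: str) -> str:
    """Summarise how a field stores its value and what kind of edge it forms.

    Assumption: every predicate is called as node.is_xxx(fieldname).
    """
    # Storage shape: one value, a list, or a set.
    if node.is_list(fieldname):
        shape = "list"
    elif node.is_set(fieldname):
        shape = "set"
    elif node.is_simple(fieldname):
        shape = "simple"
    else:
        shape = "unknown"

    # Attribute fields hold plain values; all others reference nodes.
    if node.is_attribute(fieldname):
        kind = "attribute"
    elif node.is_syntactic(fieldname):
        kind = "syntactic edge"
    elif node.is_semantic(fieldname):
        kind = "semantic edge"
    else:
        kind = "unknown"

    # Cross edges connect nodes in different parts of the IR.
    if kind != "attribute" and node.is_cross(fieldname):
        kind += " (cross)"
    return f"{fieldname}: {shape}, {kind}"


# Intended use with a real node:
#   for name in node.fields():
#       print(describe_field(node, name))
```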
Accessing the node’s neighborhood
Besides the class-specific fields for navigating through the IR, a node also provides the following generic methods:
Generators over nodes immediately reachable by a syntactic or semantic edge, respectively. neighbors() returns a generator over all these direct neighbors.
is_descendant_of tests whether the node is a transitive child under the given other node. descendants_of_type returns a generator over transitive child nodes of the given type which also fulfill the predicate, if given. The argument classes can be a type object or an iterable of type objects. In the latter case a transitive child is included if it has one of these types (and fulfills the predicate, if given). In addition, classes can also be a simple string or an iterable of strings [1].
enclosing_of_type(classes[, stop_at])
Walks up the abstract syntax tree and returns the first node of the given type(s). If a node of some type mentioned as stop_at is found first, the search stops and returns None. classes has the same signature as in descendants_of_type [1].
enclosing_file()
For a node in part Physical, returns the enclosing file node if present.
in_ast_before(other)
For a node in part Physical, uses the prepared AST numbers to check whether this node comes earlier than other.
nesting_path()
For a node in part Physical, returns the nesting path for the block nesting, consisting of enclosing Statement_Sequence and File_Contents nodes.
skip_typedefs([skip_qualifiers])
If this is a type node in part Logical, look through typedefs and optionally through qualifiers as well.
bottom_type([recursive])
If this is a type node in part Logical, skips typedefs, qualifiers, pointers and arrays. If argument recursive is True, goes down at arrays and pointers repeatedly.
If this is a node in part Physical, skips downwards/upwards in the tree structure while the node is one implementing Decorator_Interface or a Qualified_Type_Definition.
NoNodes (null references)
Some methods that return nodes and some fields that point to other nodes may
return a NoNode, i.e., a special kind of typed null reference. For example,
calling node.enclosing_of_type(ir.Logical.Function) will return a node of
type Logical.Function, if node is enclosed by a function.
Otherwise a NoNode[Logical.Function] is returned.
A NoNode has a significant advantage over Python’s built-in None: its
non-attribute fields are defined and return NoNode (or empty containers) of
appropriate node class themselves. This allows chained field access without the
need for intermediate null-reference checking:
def fancy_check(node: ir.Logical.Composite_Type):
    for member in node.Base_Template.Static_Fields:
        ...
If the composite type has no base template, node.Base_Template returns NoNode[Composite_Type],
which allows accessing the Static_Fields field, returning an empty NodeTuple([]).
Thus no additional if is required to check if node.Base_Template points to a valid node.
The generic type ir.NoNode can be used in type annotations and
ir.OptionalNode[ir.Logical.Some_Node_Class] is a shorthand for
ir.Logical.Some_Node_Class | ir.NoNode[ir.Logical.Some_Node_Class].
For a node: ir.Node, bool(node) is False if and only if node is a NoNode, otherwise it is True.
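Because a NoNode is falsy, a plain truth test is enough to tell a real result from a null reference. The following minimal sketch applies this to enclosing_of_type. The helper name enclosing_name and its default argument are hypothetical.

```python
from typing import Any


def enclosing_name(node: Any, node_class: Any, default: str = "<none>") -> str:
    """Return the Name of the nearest enclosing node of `node_class`.

    `enclosing_name` is a hypothetical helper, not part of the IR API.
    """
    enclosing = node.enclosing_of_type(node_class)
    # A NoNode evaluates to False, so no `is None` check is needed.
    # The same test also covers a plain None result.
    return enclosing.Name if enclosing else default


# Intended use (requires the bauhaus package); Function is the class
# named in the NoNode description above:
#   from bauhaus import ir
#   print(enclosing_name(some_node, ir.Logical.Function))
```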
Compilation units containing the node
enclosing_units()
For a node in part Physical, returns an iterator of nodes in part Physical for units containing the node.
before(other)
For a node in part Physical, returns True if there is a unit in which this node comes before other.
before_in_unit(other, unit)
For a node in part Physical, returns True if this node comes before other in the given compilation unit.
Special predefined Python methods
The instance method __bool__ returns True if and only if the IR node is not a null reference.
Instance methods __eq__ and __ne__ check two nodes for equality and inequality, respectively. Two nodes are equal whenever the node they represent in the IR graph is the very same node.
Instance methods __str__ and __repr__ output a readable representation and an internal representation of the node. While you may use them for debugging and pure output purposes, do not rely on their exact format as it may change in the future.
Instance method __hash__ returns a perfect hash value for the represented IR node. This is mainly used to be able to store IR nodes in sets and dicts.
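Since nodes are hashable and compare equal exactly when they represent the same IR node, they can serve as dictionary keys. The following sketch counts how often each node is referenced through a simple node-reference field. The helper name count_targets is hypothetical.

```python
from collections import Counter
from typing import Any, Iterable


def count_targets(nodes: Iterable[Any], fieldname: str) -> Counter:
    """Count how many of `nodes` reference each target via `fieldname`.

    Assumes `fieldname` is a simple node-reference field.
    `count_targets` is a hypothetical helper, not part of the IR API.
    """
    counts: Counter = Counter()
    for node in nodes:
        target = node.field(fieldname)
        # Skip unset fields and null references (both are falsy).
        if target:
            # Works because nodes define __hash__ and __eq__.
            counts[target] += 1
    return counts


# Intended use (requires the bauhaus package and an IR file):
#   from bauhaus import ir
#   graph = ir.Graph("my_file.ir")
#   per_parent = count_targets(graph.all_nodes(ir.Logical), "Parent")
```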
Example Usage
The following example uses attribute-style field access and isinstance to find the routine definition that encloses a given IR node (e.g., it yields the function a statement is part of):
# Copyright (C) 2025 Axivion GmbH
# Copyright (C) 2025 The Qt Company GmbH, a subsidiary of The Qt Group
# https://www.qt.io
from bauhaus import ir


def enclosing_definition(
    node: ir.Logical.Logical_IR_Root,
) -> ir.OptionalNode[ir.Logical.Routine]:
    parent = node.Parent
    if not parent:
        return node.graph().no_node(ir.Logical.Routine)
    if isinstance(parent, ir.Logical.Routine):
        return parent
    return enclosing_definition(parent)


my_graph = ir.Graph('my_file.ir')
nodes = my_graph.all_nodes(ir.Logical)
for n in nodes:
    print('Definition of %r: %r' % (n, enclosing_definition(n)))
In order to test whether an if-statement has an else-part you can write:
# Copyright (C) 2025 Axivion GmbH
# Copyright (C) 2025 The Qt Company GmbH, a subsidiary of The Qt Group
# https://www.qt.io
from bauhaus import ir

my_graph = ir.Graph('my_file.ir')
for node in my_graph.nodes_of_type(ir.Physical.If_Statement):
    if not node.Else_Branch:
        print('%s: if without else' % node.SLoc)
    else:
        print('%s: if with else' % node.SLoc)
7.5.3.3. SLoc
Each node in the physical part of the IR has a source location (called “SLoc”) associated with it. If the Node has a source correspondence in the source code, then the source location contains the filename, line and column number of the original file. If the node is the result of a macro invocation, then the relevant macro invocation tree is accessible from the “SLoc” as well. If the Node has no source correspondence because it is a compiler-generated artefact, then the SLoc object will evaluate to False and string representations will be “<NoSLoc>”.
SLocs offer many of the methods that Node offers; this allows processing both nodes and SLocs in a uniform way.
Information on a SLoc
Accessing Fields
SLocs offer the following semantic fields: File, Contents, Macro_Invocation, and Parent, and the following attribute fields: Full_Name, Used_Name, Line, Column, and Token.
The access to SLoc fields is exactly the same as for IR nodes:
The instance method fields returns a list containing the names of all fields of the SLoc.
The instance method field accesses a specific field of a SLoc and returns the field’s content. SLoc fields can also be accessed directly as Python attributes by writing s.fieldname for a SLoc s.
Checking Field Types
These instance methods are the same as for an IR node; however, the methods is_list, is_set, and is_cross always return False for a SLoc.
Special predefined Python methods
These instance methods are essentially the same as for an IR node. In addition, SLocs define further rich comparison methods, which give SLocs a lexicographical ordering (which nodes do not have).
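The ordering on SLocs makes it possible to sort physical nodes by source position. The following minimal sketch does so, skipping nodes whose SLoc evaluates to False because the text does not say how such SLocs compare. The helper name sort_by_location is hypothetical.

```python
from typing import Any, Iterable, List


def sort_by_location(nodes: Iterable[Any]) -> List[Any]:
    """Return the nodes that have a source location, ordered by SLoc.

    `sort_by_location` is a hypothetical helper, not part of the IR API.
    """
    # Compiler-generated nodes have a falsy SLoc; leave them out.
    located = [n for n in nodes if n.SLoc]
    # sorted() relies on the rich comparison methods defined by SLoc.
    return sorted(located, key=lambda n: n.SLoc)


# Intended use (requires the bauhaus package and an IR file):
#   from bauhaus import ir
#   graph = ir.Graph("my_file.ir")
#   for node in sort_by_location(graph.nodes_of_type(ir.Physical.If_Statement)):
#       print(node.SLoc)
```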
7.5.3.4. Token
In addition to the direct interface to the IR, situations can arise where the IR does not model the source code closely enough, so relying on the IR nodes does not suffice; one has to inspect the tokens of the source file directly.
Each non-artificial Node or SLoc has a Token associated with it if the source file of the respective Node or SLoc can be found, otherwise the Token is None.
For more information regarding the class Token please refer to Scanner Scripting.
7.5.3.5. Bitfields
The Bitfield class is used to facilitate access to bitfield attributes. A bitfield represents a data structure with a fixed set of named boolean flags. Bitfields have a type which specifies the number and names of flags that they store, like Link_Flags or Type_Qualifiers.
Similar to SLocs, Bitfields have a lot of methods in common with the class Node to allow handling all three kinds of objects in a uniform way.
Information on a Bitfield
The instance method is_of_type is a predicate that tests whether the bitfield is of the given type. The argument typename can be a string or an iterable of strings.
Accessing Fields
In addition to the field Parent, which stores the IR node that has the bitfield as an attribute, the flags of the bitfield can be accessed by using their names as field names.
The instance method fields returns a list containing the names of all fields of the bitfield, namely Parent and the names of all flags. In contrast to class names and other field names in the IR scripting, the flags are written in lower-case.
The instance method field accesses a specific field of a bitfield and returns the value of the flag as a boolean. Again, you can access a field fieldname of a bitfield b by simply writing b.fieldname.
Checking Field Types
These instance methods are the same as for an IR node; however, the methods is_list, is_set, is_syntactic, and is_cross always return False for a Bitfield.
Special predefined Python methods
These instance methods are the same as for an IR node as well. Please note, however, that equality on bitfields is not defined by the values of the flags but by the identity of the bitfield itself.
For an overview of all Bitfield types used in the IR see IR Interfaces and Classes.
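Because fields lists Parent together with all flag names and field returns each flag as a boolean, the flags that are currently set can be collected generically. The following is a minimal sketch with a hypothetical helper name, set_flags.

```python
from typing import Any, List


def set_flags(bitfield: Any) -> List[str]:
    """Return the names of all flags of `bitfield` that are True.

    `set_flags` is a hypothetical helper, not part of the IR API.
    """
    return [
        name
        for name in bitfield.fields()
        # Parent refers to the owning IR node and is not a flag.
        if name != "Parent" and bitfield.field(name)
    ]


# Intended use with a bitfield obtained from a node field:
#   print(", ".join(set_flags(some_bitfield)))
```

Comparing two such lists is one way to compare bitfields by value, given that equality on the bitfields themselves is identity-based.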
7.5.3.6. Enumerations
Enumerations are another kind of attribute type used in the IR. Each enumeration E represents a finite set e_1, … , e_n of possible values for fields of type E. These values are given as predefined constants in the IR scripting API. For example there are different calling conventions for routines in C and C++, like stdcall and cdecl. Therefore there exists a field Calling_Convention in the class Routine from the logical part of the IR, which can take one of the following values of the enumeration type Calling_Convention_Kind in the IR:
ir.calling_default,
ir.calling_cdecl,
ir.calling_fastcall,
ir.calling_stdcall,
ir.calling_thiscall
For an overview of all enumerations and their corresponding constants see IR Interfaces and Classes.
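An enumeration field is typically compared against one of the predefined constants. The following sketch filters routines by calling convention. It assumes that enumeration values can be compared with ==, which the text does not state explicitly, and the helper name routines_with_convention is hypothetical.

```python
from typing import Any, Iterable, List


def routines_with_convention(routines: Iterable[Any], convention: Any) -> List[Any]:
    """Return the routines whose Calling_Convention equals `convention`.

    Assumption: enumeration constants support comparison with ==.
    """
    return [r for r in routines if r.Calling_Convention == convention]


# Intended use (requires the bauhaus package and an IR file):
#   from bauhaus import ir
#   graph = ir.Graph("my_file.ir")
#   routines = graph.nodes_of_type(ir.Logical.Routine)
#   for r in routines_with_convention(routines, ir.calling_stdcall):
#       print(r.Name)
```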
7.5.3.7. Convenience Methods
To facilitate the creation of rules and analyses, the IR API offers more member methods for the classes Graph, Node, SLoc, Bitfield, and Token than those documented here. A detailed description can be obtained by using the Python online help; just enter one of the following commands:
help(ir.Graph)
help(ir.Node)
help(ir.SLoc)
help(ir.Bitfield)
help(scanner.Token)
in rfgscript after importing the module ir or scanner.