32. Abstract Syntax Trees

Python 2.5

New in version 2.5.

The _ast module helps Python applications to process trees of the Python abstract syntax grammar. The Python compiler currently provides read-only access to such trees, meaning that applications can only create a tree for a given piece of Python source code; generating byte code from a (potentially modified) tree is not supported. The abstract syntax itself might change with each Python release; this module helps to find out programmatically what the current grammar looks like.

An abstract syntax tree can be generated by passing _ast.PyCF_ONLY_AST as a flag to the built-in compile function. The result is a tree of objects whose classes all inherit from _ast.AST.

The actual classes are derived from the Parser/Python.asdl file, which is reproduced below. There is one class defined for each left-hand side symbol in the abstract grammar (for example, _ast.stmt or _ast.expr). In addition, there is one class defined for each constructor on the right-hand side; these classes inherit from the classes for the left-hand side trees. For example, _ast.BinOp inherits from _ast.expr. For production rules with alternatives (aka "sums"), the left-hand side class is abstract: only instances of specific constructor nodes are ever created.

Each concrete class has an attribute _fields which gives the names of all child nodes.
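For instance, _fields can be used to enumerate the children of a node without knowing its class in advance. The sketch below does this for a single BinOp node; the variable names are illustrative.

```python
import _ast

tree = compile("a + b", "<example>", "eval", _ast.PyCF_ONLY_AST)
binop = tree.body

# _fields lists the child names in the order the grammar declares them.
field_names = tuple(_ast.BinOp._fields)

# Pair each child name with the class name of the node stored under it.
children = [(name, getattr(binop, name).__class__.__name__)
            for name in field_names]
```

For BinOp, the grammar declares the children left, op and right, so children pairs those names with the classes Name, Add and Name for this expression.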

Each instance of a concrete class has one attribute for each child node, of the type defined in the grammar. For example, _ast.BinOp instances have an attribute left of type _ast.expr. Instances of _ast.expr and _ast.stmt subclasses also have lineno and col_offset attributes. The lineno is the line number in the source text (1-indexed, so the first line is line 1), and the col_offset is the UTF-8 byte offset of the first token that generated the node. The UTF-8 offset is recorded because the parser uses UTF-8 internally.

If a child attribute is marked as optional in the grammar (using a question mark), its value might be None. If an attribute can have zero or more values (marked with an asterisk), the values are represented as a Python list.

