.. _ast:

Abstract Syntax Trees
=====================

.. module:: ast
   :synopsis: Abstract Syntax Tree classes and manipulation.

.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Georg Brandl <georg@python.org>

.. versionadded:: 2.5
   The low-level ``_ast`` module containing only the node classes.

.. versionadded:: 2.6
   The high-level ``ast`` module containing all helpers.

The :mod:`ast` module helps Python applications to process trees of the Python
abstract syntax grammar.  The abstract syntax itself might change with each
Python release; this module helps to find out programmatically what the current
grammar looks like.

An abstract syntax tree can be generated by passing :data:`_ast.PyCF_ONLY_AST`
as a flag to the :func:`compile` builtin function, or using the :func:`parse`
helper provided in this module.  The result will be a tree of objects whose
classes all inherit from :class:`ast.AST`.

A modified abstract syntax tree can be compiled into a Python code object using
the built-in :func:`compile` function.

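As a minimal sketch of the round trip described above (the source string and
the ``<example>`` filename are placeholders chosen for illustration), a tree
can be obtained either way and then compiled and executed::

   import ast

   source = "answer = 6 * 7"

   # The helper and compile() with the flag should give equivalent trees.
   tree = ast.parse(source)
   same_tree = compile(source, "<example>", "exec", ast.PyCF_ONLY_AST)
   assert ast.dump(tree) == ast.dump(same_tree)

   # Every node in the tree is an instance of ast.AST.
   assert all(isinstance(node, ast.AST) for node in ast.walk(tree))

   # The tree can be compiled into a code object and run.
   namespace = {}
   exec(compile(tree, "<example>", "exec"), namespace)
   print(namespace["answer"])   # 42
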
Node classes
------------

.. class:: AST

   This is the base of all AST node classes.  The actual node classes are
   derived from the :file:`Parser/Python.asdl` file, which is reproduced
   :ref:`below <abstract-grammar>`.  They are defined in the :mod:`_ast` C
   module and re-exported in :mod:`ast`.

   There is one class defined for each left-hand side symbol in the abstract
   grammar (for example, :class:`ast.stmt` or :class:`ast.expr`).  In addition,
   there is one class defined for each constructor on the right-hand side; these
   classes inherit from the classes for the left-hand side trees.  For example,
   :class:`ast.BinOp` inherits from :class:`ast.expr`.  For production rules
   with alternatives (aka "sums"), the left-hand side class is abstract: only
   instances of specific constructor nodes are ever created.

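   The class relationships can be checked directly.  The following is only an
   illustrative sketch; the expression ``1 + 2`` is an arbitrary input::

      import ast

      # BinOp is a constructor of the "expr" rule, so it derives from ast.expr.
      assert issubclass(ast.BinOp, ast.expr)
      assert issubclass(ast.expr, ast.AST)

      # Parsing in 'eval' mode gives an Expression whose body is a BinOp node.
      body = ast.parse("1 + 2", mode="eval").body
      print(type(body).__name__)          # BinOp
      print(isinstance(body, ast.expr))   # True
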
   .. attribute:: _fields

      Each concrete class has an attribute :attr:`_fields` which gives the names
      of all child nodes.

      Each instance of a concrete class has one attribute for each child node,
      of the type as defined in the grammar.  For example, :class:`ast.BinOp`
      instances have an attribute :attr:`left` of type :class:`ast.expr`.

      If these attributes are marked as optional in the grammar (using a
      question mark), the value might be ``None``.  If the attributes can have
      zero-or-more values (marked with an asterisk), the values are represented
      as Python lists.  All possible attributes must be present and have valid
      values when compiling an AST with :func:`compile`.

   .. attribute:: lineno
                  col_offset

      Instances of :class:`ast.expr` and :class:`ast.stmt` subclasses have
      :attr:`lineno` and :attr:`col_offset` attributes.  The :attr:`lineno` is
      the line number of source text (1-indexed so the first line is line 1) and
      the :attr:`col_offset` is the UTF-8 byte offset of the first token that
      generated the node.  The UTF-8 offset is recorded because the parser uses
      UTF-8 internally.

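   A short sketch of these attributes on a parsed tree (the statement
   ``x = -y`` is just a sample input)::

      import ast

      tree = ast.parse("x = -y\n")
      assign = tree.body[0]     # the Assign statement
      unary = assign.value      # the UnaryOp expression "-y"

      print(ast.UnaryOp._fields)                  # ('op', 'operand')
      print(isinstance(assign.targets, list))     # True: "expr* targets"
      print((unary.lineno, unary.col_offset))     # (1, 4)
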
   The constructor of a class :class:`ast.T` parses its arguments as follows:

   * If there are positional arguments, there must be as many as there are items
     in :attr:`T._fields`; they will be assigned as attributes of these names.
   * If there are keyword arguments, they will set the attributes of the same
     names to the given values.

   For example, to create and populate an :class:`ast.UnaryOp` node, you could
   use ::

      node = ast.UnaryOp()
      node.op = ast.USub()
      node.operand = ast.Num()
      node.operand.n = 5
      node.operand.lineno = 0
      node.operand.col_offset = 0
      node.lineno = 0
      node.col_offset = 0

   or the more compact ::

      node = ast.UnaryOp(ast.USub(), ast.Num(5, lineno=0, col_offset=0),
                         lineno=0, col_offset=0)

   .. versionadded:: 2.6
      The constructor as explained above was added.  In Python 2.5, nodes had
      to be created by calling the class constructor without arguments and
      setting the attributes afterwards.

.. _abstract-grammar:

Abstract Grammar
----------------

The module defines a string constant ``__version__`` which is the decimal
Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

.. literalinclude:: ../../Parser/Python.asdl

:mod:`ast` Helpers
------------------

.. versionadded:: 2.6

Apart from the node classes, the :mod:`ast` module defines these utility
functions and classes for traversing abstract syntax trees:

.. function:: parse(expr, filename='<unknown>', mode='exec')

   Parse *expr*, a string of Python source code, into an AST node.  Equivalent
   to ``compile(expr, filename, mode, PyCF_ONLY_AST)``.

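   A brief sketch of the *mode* and *filename* arguments; ``settings.py`` is a
   made-up name used only to show where the filename surfaces::

      import ast

      module = ast.parse("x = 1\ny = x + 1\n")        # default mode='exec'
      expression = ast.parse("x + 1", mode="eval")
      print(type(module).__name__)        # Module
      print(type(expression).__name__)    # Expression

      # The filename is only used for error reporting.
      reported = None
      try:
          ast.parse("x = = 1", filename="settings.py")
      except SyntaxError as exc:
          reported = exc.filename
      print(reported)                     # settings.py
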
.. function:: literal_eval(node_or_string)

   Safely evaluate an expression node or a string containing a Python
   expression.  The string or node provided may only consist of the following
   Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
   and ``None``.

   This can be used for safely evaluating strings containing Python expressions
   from untrusted sources without the need to parse the values oneself.

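   For illustration, assuming a small configuration-like string as input,
   literals are evaluated while anything else is rejected rather than run::

      import ast

      config = ast.literal_eval("{'name': 'spam', 'sizes': [1, 2, 3], 'ok': True}")
      print(config["sizes"])     # [1, 2, 3]

      # A function call is not a literal, so it raises instead of executing.
      rejected = False
      try:
          ast.literal_eval("__import__('os').getcwd()")
      except ValueError:
          rejected = True
      print(rejected)            # True
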
.. function:: get_docstring(node, clean=True)

   Return the docstring of the given *node* (which must be a
   :class:`FunctionDef`, :class:`ClassDef` or :class:`Module` node), or ``None``
   if it has no docstring.  If *clean* is true, clean up the docstring's
   indentation with :func:`inspect.cleandoc`.

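   A small sketch, using a hypothetical ``greet`` function as input, of how the
   docstring is extracted and its indentation cleaned::

      import ast

      source = '''
      def greet():
          """Say hello.

          This continuation line is indented in the source.
          """
      '''.replace("\n      ", "\n")   # strip the example's own indentation

      func = ast.parse(source).body[0]
      doc = ast.get_docstring(func)
      print(doc)

      # A module without a docstring yields None.
      print(ast.get_docstring(ast.parse("x = 1")))
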
.. function:: fix_missing_locations(node)

   When you compile a node tree with :func:`compile`, the compiler expects
   :attr:`lineno` and :attr:`col_offset` attributes for every node that supports
   them.  This is rather tedious to fill in for generated nodes, so this helper
   adds these attributes where they are not already set, by setting them to
   the values of the parent node.  It works recursively starting at *node*.

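   As a sketch, a hand-built expression tree (here simply the name ``x``, an
   arbitrary choice) can be given locations and then compiled::

      import ast

      # Nodes created by hand carry no location information yet.
      tree = ast.Expression(body=ast.Name(id="x", ctx=ast.Load()))
      tree = ast.fix_missing_locations(tree)

      print((tree.body.lineno, tree.body.col_offset))   # (1, 0)
      result = eval(compile(tree, "<generated>", "eval"), {"x": 3})
      print(result)                                     # 3
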
.. function:: increment_lineno(node, n=1)

   Increment the line number of each node in the tree starting at *node* by *n*.
   This is useful to "move code" to a different location in a file.

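   For example, a sketch that shifts a two-line snippet down by ten lines (the
   offset is arbitrary)::

      import ast

      tree = ast.parse("a = 1\nb = 2\n")
      ast.increment_lineno(tree, n=10)
      linenos = [stmt.lineno for stmt in tree.body]
      print(linenos)   # [11, 12]
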
.. function:: copy_location(new_node, old_node)

   Copy source location (:attr:`lineno` and :attr:`col_offset`) from *old_node*
   to *new_node* if possible, and return *new_node*.

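   A minimal sketch, assuming a replacement :class:`Name` node should take over
   the position of an existing one::

      import ast

      old = ast.parse("x = spam\n").body[0].value     # Name at line 1, column 4
      new = ast.copy_location(ast.Name(id="eggs", ctx=ast.Load()), old)
      print((new.lineno, new.col_offset))             # (1, 4)
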
.. function:: iter_fields(node)

   Yield a tuple of ``(fieldname, value)`` for each field in ``node._fields``
   that is present on *node*.

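   As an illustration on a :class:`BinOp` node parsed from the sample
   expression ``a + b``::

      import ast

      node = ast.parse("a + b", mode="eval").body
      for name, value in ast.iter_fields(node):
          print(name + ": " + type(value).__name__)
      # left: Name
      # op: Add
      # right: Name
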
.. function:: iter_child_nodes(node)

   Yield all direct child nodes of *node*, that is, all fields that are nodes
   and all items of fields that are lists of nodes.

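   A sketch showing that only direct children are produced, not deeper
   descendants (the two statements are arbitrary sample input)::

      import ast

      tree = ast.parse("x = 1\nx += 2\n")
      children = [type(child).__name__ for child in ast.iter_child_nodes(tree)]
      print(children)   # ['Assign', 'AugAssign']
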
.. function:: walk(node)

   Recursively yield *node* itself and all of its descendant nodes, in no
   specified order.  This is useful if you only want to modify nodes in place
   and don't care about the context.

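   For instance, a sketch that collects every name used in a statement,
   sorting the result because the traversal order is unspecified::

      import ast

      tree = ast.parse("total = price * quantity + tax\n")
      names = sorted(node.id for node in ast.walk(tree)
                     if isinstance(node, ast.Name))
      print(names)   # ['price', 'quantity', 'tax', 'total']
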
.. class:: NodeVisitor()

   A node visitor base class that walks the abstract syntax tree and calls a
   visitor function for every node found.  This function may return a value
   which is forwarded by the :meth:`visit` method.

   This class is meant to be subclassed, with the subclass adding visitor
   methods.

   .. method:: visit(node)

      Visit a node.  The default implementation calls the method called
      :samp:`self.visit_{classname}` where *classname* is the name of the node
      class, or :meth:`generic_visit` if that method doesn't exist.

   .. method:: generic_visit(node)

      This visitor calls :meth:`visit` on all children of the node.

      Note that child nodes of nodes that have a custom visitor method won't be
      visited unless the visitor calls :meth:`generic_visit` or visits them
      itself.

   Don't use the :class:`NodeVisitor` if you want to apply changes to nodes
   during traversal.  For this a special visitor exists
   (:class:`NodeTransformer`) that allows modifications.

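   A possible subclass, shown only as a sketch (``CallCounter`` is a
   hypothetical name, not part of the module), counts calls by function name
   and uses :meth:`generic_visit` to reach nested calls::

      import ast

      class CallCounter(ast.NodeVisitor):
          """Hypothetical visitor that counts calls to plainly named functions."""

          def __init__(self):
              self.calls = {}

          def visit_Call(self, node):
              if isinstance(node.func, ast.Name):
                  name = node.func.id
                  self.calls[name] = self.calls.get(name, 0) + 1
              # Without this, calls nested in the arguments would be skipped.
              self.generic_visit(node)

      counter = CallCounter()
      counter.visit(ast.parse("f(g(1), g(2))\n"))
      print(sorted(counter.calls.items()))   # [('f', 1), ('g', 2)]
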
.. class:: NodeTransformer()

   A :class:`NodeVisitor` subclass that walks the abstract syntax tree and
   allows modification of nodes.

   The :class:`NodeTransformer` will walk the AST and use the return value of
   the visitor methods to replace or remove the old node.  If the return value
   of the visitor method is ``None``, the node will be removed from its
   location, otherwise it is replaced with the return value.  The return value
   may be the original node, in which case no replacement takes place.

   Here is an example transformer that rewrites all occurrences of name lookups
   (``foo``) to ``data['foo']``::

      # The example refers to these names unqualified.
      from ast import (NodeTransformer, copy_location, Subscript, Name, Load,
                       Index, Str)

      class RewriteName(NodeTransformer):

          def visit_Name(self, node):
              return copy_location(Subscript(
                  value=Name(id='data', ctx=Load()),
                  slice=Index(value=Str(s=node.id)),
                  ctx=node.ctx
              ), node)

   Keep in mind that if the node you're operating on has child nodes you must
   either transform the child nodes yourself or call the :meth:`generic_visit`
   method for the node first.

   For nodes that were part of a collection of statements (that applies to all
   statement nodes), the visitor may also return a list of nodes rather than
   just a single node.

   Usually you use the transformer like this::

      node = YourTransformer().visit(node)

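   A complete round trip might look like the following sketch, in which
   ``RenameVariable`` is a hypothetical transformer written for this example;
   it parses a statement, rewrites it, and runs the result::

      import ast

      class RenameVariable(ast.NodeTransformer):
          """Hypothetical transformer that renames every use of one variable."""

          def __init__(self, old, new):
              self.old = old
              self.new = new

          def visit_Name(self, node):
              if node.id == self.old:
                  # The returned node replaces the original one in the tree.
                  renamed = ast.Name(id=self.new, ctx=node.ctx)
                  return ast.copy_location(renamed, node)
              return node   # returning the original node leaves it in place

      tree = ast.parse("result = value * 2\n")
      tree = RenameVariable("value", "amount").visit(tree)
      ast.fix_missing_locations(tree)

      namespace = {"amount": 21}
      exec(compile(tree, "<rewritten>", "exec"), namespace)
      print(namespace["result"])   # 42
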
.. function:: dump(node, annotate_fields=True, include_attributes=False)

   Return a formatted dump of the tree in *node*.  This is mainly useful for
   debugging purposes.  The returned string will show the names and the values
   for fields.  This makes the code impossible to evaluate, so if evaluation is
   wanted *annotate_fields* must be set to ``False``.  Attributes such as line
   numbers and column offsets are not dumped by default.  To include them, set
   *include_attributes* to ``True``.
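
   The effect of the two flags can be sketched as follows.  The exact text of
   the dump depends on the Python version, so no output is shown here::

      import ast

      node = ast.parse("x + 1", mode="eval")

      annotated = ast.dump(node)                          # field names included
      positional = ast.dump(node, annotate_fields=False)  # values only
      located = ast.dump(node, include_attributes=True)   # adds lineno etc.

      print(annotated)
      print(positional)
      print(located)
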