symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/tokenize.rst
       
:mod:`tokenize` --- Tokenizer for Python source
===============================================

.. module:: tokenize
   :synopsis: Lexical scanner for Python source code.
.. moduleauthor:: Ka Ping Yee
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


The :mod:`tokenize` module provides a lexical scanner for Python source code,
implemented in Python.  The scanner in this module returns comments as tokens as
well, making it useful for implementing "pretty-printers," including colorizers
for on-screen displays.

The primary entry point is a :term:`generator`:

.. function:: generate_tokens(readline)

   The :func:`generate_tokens` generator requires one argument, *readline*,
   which must be a callable object which provides the same interface as the
   :meth:`readline` method of built-in file objects (see section
   :ref:`bltin-file-objects`).  Each call to the function should return one line
   of input as a string.

   The generator produces 5-tuples with these members: the token type; the token
   string; a 2-tuple ``(srow, scol)`` of ints specifying the row and column
   where the token begins in the source; a 2-tuple ``(erow, ecol)`` of ints
   specifying the row and column where the token ends in the source; and the
   line on which the token was found.  The line passed (the last tuple item) is
   the *logical* line; continuation lines are included.

   .. versionadded:: 2.2
       
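For instance, a minimal usage sketch along these lines prints every token of a
short snippet held in a ``StringIO`` object (the snippet and variable names are
purely illustrative)::

   from StringIO import StringIO
   from tokenize import generate_tokens

   source = "x = 3.14  # a float\n"
   readline = StringIO(source).readline
   for toktype, tokstring, start, end, line in generate_tokens(readline):
       # Each iteration yields the 5-tuple described above.
       print toktype, repr(tokstring), start, end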
       
An older entry point is retained for backward compatibility:


.. function:: tokenize(readline[, tokeneater])

   The :func:`tokenize` function accepts two parameters: one representing the input
   stream, and one providing an output mechanism for :func:`tokenize`.

   The first parameter, *readline*, must be a callable object which provides the
   same interface as the :meth:`readline` method of built-in file objects (see
   section :ref:`bltin-file-objects`).  Each call to the function should return one
   line of input as a string.  Alternatively, *readline* may be a callable object
   that signals completion by raising :exc:`StopIteration`.

   .. versionchanged:: 2.5
      Added :exc:`StopIteration` support.

   The second parameter, *tokeneater*, must also be a callable object.  It is
   called once for each token, with five arguments, corresponding to the tuples
   generated by :func:`generate_tokens`.
       
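One possible *tokeneater* callback, sketched here for illustration only, simply
echoes every token it receives::

   from StringIO import StringIO
   from token import tok_name
   from tokenize import tokenize

   def print_token(toktype, tokstring, start, end, line):
       # Called once per token, with the same five values that
       # generate_tokens() would place in its 5-tuples.
       print tok_name.get(toktype, toktype), repr(tokstring)

   tokenize(StringIO("x = 1\n").readline, print_token)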
       
All constants from the :mod:`token` module are also exported from
:mod:`tokenize`, as are two additional token type values that might be passed to
the *tokeneater* function by :func:`tokenize`:


.. data:: COMMENT

   Token value used to indicate a comment.


.. data:: NL

   Token value used to indicate a non-terminating newline.  The NEWLINE token
   indicates the end of a logical line of Python code; NL tokens are generated when
   a logical line of code is continued over multiple physical lines.
       
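The distinction can be seen with a statement that spans two physical lines.  In
this illustrative sketch, the line break inside the parentheses is reported as an
NL token, while only the end of the complete statement produces NEWLINE::

   from StringIO import StringIO
   from tokenize import generate_tokens, NEWLINE, NL

   source = "total = (1 +\n         2)\n"
   for toktype, tokstring, _, _, _ in generate_tokens(StringIO(source).readline):
       if toktype == NL:
           print "NL:", repr(tokstring)       # non-terminating newline
       elif toktype == NEWLINE:
           print "NEWLINE:", repr(tokstring)  # end of the logical line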
       
Another function is provided to reverse the tokenization process.  This is useful
for creating tools that tokenize a script, modify the token stream, and write
back the modified script.


.. function:: untokenize(iterable)

   Converts tokens back into Python source code.  The *iterable* must return
   sequences with at least two elements, the token type and the token string.  Any
   additional sequence elements are ignored.

   The reconstructed script is returned as a single string.  The result is
   guaranteed to tokenize back to match the input so that the conversion is
   lossless and round-trips are assured.  The guarantee applies only to the token
   type and token string, as the spacing between tokens (column positions) may
   change.

   .. versionadded:: 2.5
       
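A minimal round-trip sketch (illustrative only; the guarantee covers token types
and strings, not spacing) might look like this::

   from StringIO import StringIO
   from tokenize import generate_tokens, untokenize

   source = "x = 3.14\n"
   # Keep only the token type and token string from each 5-tuple;
   # these two elements are sufficient for untokenize().
   pairs = [tok[:2] for tok in generate_tokens(StringIO(source).readline)]
   rebuilt = untokenize(pairs)
   # 'rebuilt' tokenizes back to the same (type, string) sequence as
   # 'source', although the spacing between tokens may differ.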
       
Example of a script re-writer that transforms float literals into Decimal
objects::

   from StringIO import StringIO
   from tokenize import generate_tokens, untokenize, NUMBER, STRING, NAME, OP

   def decistmt(s):
       """Substitute Decimals for floats in a string of statements.

       >>> from decimal import Decimal
       >>> s = 'print +21.3e-5*-.1234/81.7'
       >>> decistmt(s)
       "print +Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7')"

       >>> exec(s)
       -3.21716034272e-007
       >>> exec(decistmt(s))
       -3.217160342717258261933904529E-7

       """
       result = []
       g = generate_tokens(StringIO(s).readline)   # tokenize the string
       for toknum, tokval, _, _, _ in g:
           if toknum == NUMBER and '.' in tokval:  # replace NUMBER tokens
               result.extend([
                   (NAME, 'Decimal'),
                   (OP, '('),
                   (STRING, repr(tokval)),
                   (OP, ')')
               ])
           else:
               result.append((toknum, tokval))
       return untokenize(result)