
:mod:`shlex` --- Simple lexical analysis
========================================

.. module:: shlex
   :synopsis: Simple lexical analysis for Unix shell-like languages.
.. moduleauthor:: Eric S. Raymond <esr@snark.thyrsus.com>
.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>
.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>


.. versionadded:: 1.5.2
       
The :class:`shlex` class makes it easy to write lexical analyzers for simple
syntaxes resembling that of the Unix shell.  This will often be useful for
writing minilanguages (for example, in run control files for Python
applications) or for parsing quoted strings.
       
.. note::

   The :mod:`shlex` module currently does not support Unicode input.

The :mod:`shlex` module defines the following functions:

       
.. function:: split(s[, comments[, posix]])

   Split the string *s* using shell-like syntax. If *comments* is :const:`False`
   (the default), the parsing of comments in the given string will be disabled
   (setting the :attr:`commenters` member of the :class:`shlex` instance to the
   empty string).  This function operates in POSIX mode by default, but uses
   non-POSIX mode if the *posix* argument is false.

   .. versionadded:: 2.3

   .. versionchanged:: 2.6
      Added the *posix* parameter.

   .. note::

      Since the :func:`split` function instantiates a :class:`shlex` instance, passing
      ``None`` for *s* will read the string to split from standard input.

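
As a rough illustration of the behaviour described for :func:`split`, the
following sketch splits a made-up command line three ways.  The sample strings
and variable names are invented for the example.

```python
import shlex

line = 'cp "my file.txt" backup/ # keep a copy'

# Default: POSIX mode with comment parsing disabled, so '#' is an ordinary token.
default_tokens = shlex.split(line)

# comments=True drops everything from '#' to the end of the line.
without_comment = shlex.split(line, comments=True)

# posix=False selects non-POSIX mode, which keeps the quote characters.
non_posix = shlex.split('cp "my file.txt" backup/', posix=False)

print(default_tokens)
print(without_comment)
print(non_posix)
```

In POSIX mode the quoted file name comes back as the single token
``my file.txt``; in non-POSIX mode the surrounding double quotes remain part of
the token.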
       
The :mod:`shlex` module defines the following class:


.. class:: shlex([instream[, infile[, posix]]])

   A :class:`shlex` instance or subclass instance is a lexical analyzer object.
   The initialization argument, if present, specifies where to read characters
   from. It must be a file-/stream-like object with :meth:`read` and
   :meth:`readline` methods, or a string (strings are accepted since Python 2.3).
   If no argument is given, input will be taken from ``sys.stdin``.  The second
   optional argument is a filename string, which sets the initial value of the
   :attr:`infile` member.  If the *instream* argument is omitted or equal to
   ``sys.stdin``, this second argument defaults to "stdin".  The *posix* argument
   was introduced in Python 2.3, and defines the operational mode.  When *posix* is
   not true (default), the :class:`shlex` instance will operate in compatibility
   mode.  When operating in POSIX mode, :class:`shlex` will try to be as close as
   possible to the POSIX shell parsing rules.

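
The two operational modes can be compared with a short sketch like the one
below.  The helper ``tokens()``, the input string and the file name
``settings.rc`` are assumptions made for the example; none of them is part of
the module.

```python
import shlex

def tokens(lexer):
    """Collect tokens until the lexer returns its end-of-file marker."""
    result = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof:
            break
        result.append(token)
    return result

# Compatibility (non-POSIX) mode is the default; quotes stay in the token.
compat = tokens(shlex.shlex('name = "Eric Raymond"'))

# POSIX mode strips the quotes.  'settings.rc' is a hypothetical file name
# that only sets the infile member.
posix_lexer = shlex.shlex('name = "Eric Raymond"', 'settings.rc', posix=True)
posix = tokens(posix_lexer)

print(compat)
print(posix)
```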
       
.. seealso::

   Module :mod:`ConfigParser`
      Parser for configuration files similar to the Windows :file:`.ini` files.

       
.. _shlex-objects:

shlex Objects
-------------

A :class:`shlex` instance has the following methods:

       
.. method:: shlex.get_token()

   Return a token.  If tokens have been stacked using :meth:`push_token`, pop a
   token off the stack.  Otherwise, read one from the input stream.  If reading
   encounters an immediate end-of-file, :attr:`self.eof` is returned (the empty
   string (``''``) in non-POSIX mode, and ``None`` in POSIX mode).


.. method:: shlex.push_token(str)

   Push the argument onto the token stack.

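
A minimal sketch of how :meth:`push_token` and :meth:`get_token` interact,
using an invented input string in the default non-POSIX mode:

```python
import shlex

lexer = shlex.shlex('beta gamma')
lexer.push_token('alpha')      # stacked tokens are returned first

first = lexer.get_token()      # popped from the token stack
second = lexer.get_token()     # read from the input string
third = lexer.get_token()
at_eof = lexer.get_token()     # end of input: the lexer's eof marker

print([first, second, third, at_eof])
```

Pushing a token back in this way is one possible approach to one-token
lookahead: read a token, inspect it, and push it back if it is not needed yet.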
       
.. method:: shlex.read_token()

   Read a raw token.  Ignore the pushback stack, and do not interpret source
   requests.  (This is not ordinarily a useful entry point, and is documented here
   only for the sake of completeness.)

       
.. method:: shlex.sourcehook(filename)

   When :class:`shlex` detects a source request (see :attr:`source` below) this
   method is given the following token as argument, and expected to return a tuple
   consisting of a filename and an open file-like object.

   Normally, this method first strips any quotes off the argument.  If the result
   is an absolute pathname, or there was no previous source request in effect, or
   the previous source was a stream (such as ``sys.stdin``), the result is left
   alone.  Otherwise, if the result is a relative pathname, the directory part of
   the name of the file immediately before it on the source inclusion stack is
   prepended (this behavior is like the way the C preprocessor handles ``#include
   "file.h"``).

   The result of the manipulations is treated as a filename, and returned as the
   first component of the tuple, with :func:`open` called on it to yield the second
   component. (Note: this is the reverse of the order of arguments in instance
   initialization!)

   This hook is exposed so that you can use it to implement directory search paths,
   addition of file extensions, and other namespace hacks. There is no
   corresponding 'close' hook, but a shlex instance will call the :meth:`close`
   method of the sourced input stream when it returns EOF.

   For more explicit control of source stacking, use the :meth:`push_source` and
   :meth:`pop_source` methods.

       
.. method:: shlex.push_source(stream[, filename])

   Push an input source stream onto the input stack.  If the filename argument is
   specified it will later be available for use in error messages.  This is the
   same method used internally by the :meth:`sourcehook` method.

   .. versionadded:: 2.1


.. method:: shlex.pop_source()

   Pop the last-pushed input source from the input stack. This is the same method
   used internally when the lexer reaches EOF on a stacked input stream.

   .. versionadded:: 2.1

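
The following sketch shows explicit source stacking with :meth:`push_source`.
The stream contents and the label ``'inner'`` are made up for the example, and
the import assumes a Python version where :class:`io.StringIO` accepts ordinary
string literals.

```python
import shlex
from io import StringIO

lexer = shlex.shlex('outer1 outer2')
seen = [lexer.get_token()]           # first token of the original input

# Stack a second stream; 'inner' is a hypothetical name for error messages.
lexer.push_source(StringIO('inner1 inner2'), 'inner')
name_while_stacked = lexer.infile

# The stacked stream is read to EOF, popped, and the original input resumes.
while True:
    token = lexer.get_token()
    if token == lexer.eof:
        break
    seen.append(token)

print(seen)
```

No explicit call to :meth:`pop_source` is needed here, because the lexer pops
the stacked stream itself when that stream reaches EOF.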
       
.. method:: shlex.error_leader([file[, line]])

   This method generates an error message leader in the format of a Unix C compiler
   error label; the format is ``'"%s", line %d: '``, where the ``%s`` is replaced
   with the name of the current source file and the ``%d`` with the current input
   line number (the optional arguments can be used to override these).

   This convenience is provided to encourage :mod:`shlex` users to generate error
   messages in the standard, parseable format understood by Emacs and other Unix
   tools.

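
A small sketch of :meth:`error_leader`, assuming a hypothetical file name
``demo.rc`` and an invented message text:

```python
import shlex

# 'demo.rc' is a made-up file name used only to label the message.
lexer = shlex.shlex('answer = 42', 'demo.rc')

at_start = lexer.error_leader()                  # current file and line
overridden = lexer.error_leader('other.rc', 7)   # explicit file and line

print(at_start + 'unexpected token')
print(overridden + 'unexpected token')
```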
       
Instances of :class:`shlex` subclasses have some public instance variables which
either control lexical analysis or can be used for debugging:


.. attribute:: shlex.commenters

   The string of characters that are recognized as comment beginners. All
   characters from the comment beginner to end of line are ignored. Includes just
   ``'#'`` by default.


.. attribute:: shlex.wordchars

   The string of characters that will accumulate into multi-character tokens.  By
   default, includes all ASCII alphanumerics and underscore.


.. attribute:: shlex.whitespace

   Characters that will be considered whitespace and skipped.  Whitespace bounds
   tokens.  By default, includes space, tab, linefeed and carriage-return.

       
.. attribute:: shlex.escape

   Characters that will be considered as escape. This will be only used in POSIX
   mode, and includes just ``'\'`` by default.

   .. versionadded:: 2.3


.. attribute:: shlex.quotes

   Characters that will be considered string quotes.  The token accumulates until
   the same quote is encountered again (thus, different quote types protect each
   other as in the shell).  By default, includes ASCII single and double quotes.


.. attribute:: shlex.escapedquotes

   Characters in :attr:`quotes` that will interpret escape characters defined in
   :attr:`escape`.  This is only used in POSIX mode, and includes just ``'"'`` by
   default.

   .. versionadded:: 2.3

       
.. attribute:: shlex.whitespace_split

   If ``True``, tokens will only be split on whitespace. This is useful, for
   example, for parsing command lines with :class:`shlex`, getting tokens in a
   similar way to shell arguments.

   .. versionadded:: 2.3

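
The effect of :attr:`commenters`, :attr:`wordchars` and
:attr:`whitespace_split` can be tried out with a sketch such as the following.
The helper ``tokens()`` and the sample text are assumptions made for the
example.

```python
import shlex

def tokens(lexer):
    """Collect tokens until the lexer returns its end-of-file marker."""
    result = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof:
            break
        result.append(token)
    return result

text = 'ls -l /tmp/x.txt ; list it'

# Defaults: '-', '/', '.' and ';' are not word characters, so each one
# comes back as a single-character token.
default_result = tokens(shlex.shlex(text))

# Treat ';' as the comment character and let '-', '.' and '/' join words.
tuned = shlex.shlex(text)
tuned.commenters = ';'
tuned.wordchars += '-./'
tuned_result = tokens(tuned)

# Split on whitespace only, as a shell does for its arguments.
by_space = shlex.shlex(text)
by_space.whitespace_split = True
by_space_result = tokens(by_space)

print(default_result)
print(tuned_result)
print(by_space_result)
```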
       
.. attribute:: shlex.infile

   The name of the current input file, as initially set at class instantiation time
   or stacked by later source requests.  It may be useful to examine this when
   constructing error messages.


.. attribute:: shlex.instream

   The input stream from which this :class:`shlex` instance is reading characters.

       
.. attribute:: shlex.source

   This member is ``None`` by default.  If you assign a string to it, that string
   will be recognized as a lexical-level inclusion request similar to the
   ``source`` keyword in various shells.  That is, the immediately following token
   will be opened as a filename and input taken from that stream until EOF, at which
   point the :meth:`close` method of that stream will be called and the input
   source will again become the original input stream. Source requests may be
   stacked any number of levels deep.

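
A source request can be sketched as follows.  The example writes a temporary
file so that it is self-contained; the file name ``extra.rc``, its contents and
the choice of ``'source'`` as the keyword are all assumptions made for the
illustration.

```python
import os
import shlex
import tempfile

# Create a small file to include; 'extra.rc' is a made-up name.
directory = tempfile.mkdtemp()
included = os.path.join(directory, 'extra.rc')
with open(included, 'w') as handle:
    handle.write('inner1 inner2\n')

# The path is quoted because path separators and '.' are not word
# characters by default.
lexer = shlex.shlex('before source "%s" after' % included)
lexer.source = 'source'    # treat the token 'source' as an inclusion request

seen = []
while True:
    token = lexer.get_token()
    if token == lexer.eof:
        break
    seen.append(token)

# The lexer has already closed the included file, so it can be removed.
os.remove(included)
os.rmdir(directory)
print(seen)
```

The tokens of the included file appear between ``before`` and ``after``; the
keyword and the file name themselves are consumed by the lexer.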
       
.. attribute:: shlex.debug

   If this member is numeric and ``1`` or more, a :class:`shlex` instance will
   print verbose progress output on its behavior.  If you need to use this, you can
   read the module source code to learn the details.


.. attribute:: shlex.lineno

   Source line number (count of newlines seen so far plus one).


.. attribute:: shlex.token

   The token buffer.  It may be useful to examine this when catching exceptions.


.. attribute:: shlex.eof

   Token used to determine end of file. This will be set to the empty string
   (``''``) in non-POSIX mode, and to ``None`` in POSIX mode.

   .. versionadded:: 2.3

       
.. _shlex-parsing-rules:

Parsing Rules
-------------

When operating in non-POSIX mode, :class:`shlex` will try to obey the
following rules.

* Quote characters are not recognized within words (``Do"Not"Separate`` is
  parsed as the single word ``Do"Not"Separate``);

* Escape characters are not recognized;

* Enclosing characters in quotes preserve the literal value of all characters
  within the quotes;

* Closing quotes separate words (``"Do"Separate`` is parsed as ``"Do"`` and
  ``Separate``);

* If :attr:`whitespace_split` is ``False``, any character not declared to be a
  word character, whitespace, or a quote will be returned as a single-character
  token. If it is ``True``, :class:`shlex` will only split words on whitespace;

* EOF is signaled with an empty string (``''``);

* It's not possible to parse empty strings, even if quoted.

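
The non-POSIX rules above can be checked with a sketch like this one.  The
helper ``tokens()`` and the inputs other than the two taken from the list are
invented for the example.

```python
import shlex

def tokens(text, whitespace_split=False):
    """Tokenize text in non-POSIX mode until the eof marker ('') appears."""
    lexer = shlex.shlex(text)          # posix defaults to false
    lexer.whitespace_split = whitespace_split
    result = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof:
            break
        result.append(token)
    return result

inside_word = tokens('Do"Not"Separate')    # quotes inside a word are kept
closing_quote = tokens('"Do"Separate')     # a closing quote ends the token
literal = tokens('"x # y"')                # quoted text is taken literally
no_escape = tokens(r'a\b')                 # the backslash is an ordinary character
single_chars = tokens('a=b')               # '=' is a single-character token
only_spaces = tokens('a=b', whitespace_split=True)

for result in (inside_word, closing_quote, literal,
               no_escape, single_chars, only_spaces):
    print(result)
```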
       
When operating in POSIX mode, :class:`shlex` will try to obey the following
parsing rules.

* Quotes are stripped out, and do not separate words (``"Do"Not"Separate"`` is
  parsed as the single word ``DoNotSeparate``);

* Non-quoted escape characters (e.g. ``'\'``) preserve the literal value of the
  next character that follows;

* Enclosing characters in quotes which are not part of :attr:`escapedquotes`
  (e.g. ``"'"``) preserve the literal value of all characters within the quotes;

* Enclosing characters in quotes which are part of :attr:`escapedquotes` (e.g.
  ``'"'``) preserve the literal value of all characters within the quotes, with
  the exception of the characters mentioned in :attr:`escape`. The escape
  characters retain their special meaning only when followed by the quote in use, or
  the escape character itself. Otherwise the escape character will be considered a
  normal character;

* EOF is signaled with a :const:`None` value;

* Quoted empty strings (``''``) are allowed.

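
The POSIX rules can be exercised in the same way.  Again, the helper
``tokens()`` and most of the sample inputs are assumptions made for the
illustration.

```python
import shlex

def tokens(text):
    """Tokenize text in POSIX mode until the eof marker (None) appears."""
    lexer = shlex.shlex(text, posix=True)
    result = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof:
            break
        result.append(token)
    return result

joined = tokens('"Do"Not"Separate"')   # quotes are stripped, the word stays whole
escaped_space = tokens(r'a\ b')        # a backslash protects the next character
single_quoted = tokens(r"'a\b'")       # no escapes inside single quotes
escaped_quote = tokens(r'"a\"b"')      # \" inside double quotes yields "
kept_backslash = tokens(r'"a\nb"')     # other escapes keep the backslash
empty_string = tokens("a '' b")        # a quoted empty string is a token

for result in (joined, escaped_space, single_quoted,
               escaped_quote, kept_backslash, empty_string):
    print(result)
```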