symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/pyexpat.rst
changeset 1 2fb8b9db1c86
       
     1 
       
     2 :mod:`xml.parsers.expat` --- Fast XML parsing using Expat
       
     3 =========================================================
       
     4 
       
     5 .. module:: xml.parsers.expat
       
     6    :synopsis: An interface to the Expat non-validating XML parser.
       
     7 .. moduleauthor:: Paul Prescod <paul@prescod.net>
       
     8 
       
     9 
       
    10 .. Markup notes:
       
    11 
       
    12    Many of the attributes of the XMLParser objects are callbacks.  Since
       
    13    signature information must be presented, these are described using the method
       
    14    directive.  Since they are attributes which are set by client code, in-text
       
    15    references to these attributes should be marked using the :member: role.
       
    16 
       
    17 .. versionadded:: 2.0
       
    18 
       
    19 .. index:: single: Expat
       
    20 
       
    21 The :mod:`xml.parsers.expat` module is a Python interface to the Expat
       
    22 non-validating XML parser. The module provides a single extension type,
       
    23 :class:`xmlparser`, that represents the current state of an XML parser.  After
       
    24 an :class:`xmlparser` object has been created, various attributes of the object
       
    25 can be set to handler functions.  When an XML document is then fed to the
       
    26 parser, the handler functions are called for the character data and markup in
       
    27 the XML document.
       
    28 
       
    29 .. index:: module: pyexpat
       
    30 
       
    31 This module uses the :mod:`pyexpat` module to provide access to the Expat
       
    32 parser.  Direct use of the :mod:`pyexpat` module is deprecated.
       
    33 
       
    34 This module provides one exception and one type object:
       
    35 
       
    36 
       
    37 .. exception:: ExpatError
       
    38 
       
    39    The exception raised when Expat reports an error.  See section
       
    40    :ref:`expaterror-objects` for more information on interpreting Expat errors.
       
    41 
       
    42 
       
    43 .. exception:: error
       
    44 
       
    45    Alias for :exc:`ExpatError`.
       
    46 
       
    47 
       
    48 .. data:: XMLParserType
       
    49 
       
    50    The type of the return values from the :func:`ParserCreate` function.
       
    51 
       
    52 The :mod:`xml.parsers.expat` module contains two functions:
       
    53 
       
    54 
       
    55 .. function:: ErrorString(errno)
       
    56 
       
    57    Returns an explanatory string for a given error number *errno*.
       
    58 
       
    59 
       
    60 .. function:: ParserCreate([encoding[, namespace_separator]])
       
    61 
       
    62    Creates and returns a new :class:`xmlparser` object.  *encoding*, if specified,
       
    63    must be a string naming the encoding used by the XML data.  Expat doesn't
       
    64    support as many encodings as Python does, and its repertoire of encodings can't
       
    65    be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII.  If
       
    66    *encoding* [1]_ is given it will override the implicit or explicit encoding of the
       
    67    document.
       
    68 
       
    69    Expat can optionally do XML namespace processing for you, enabled by providing a
       
    70    value for *namespace_separator*.  The value must be a one-character string; a
       
    71    :exc:`ValueError` will be raised if the string has an illegal length (``None``
       
    72    is considered the same as omission).  When namespace processing is enabled,
       
    73    element type names and attribute names that belong to a namespace will be
       
    74    expanded.  The element name passed to the element handlers
       
    75    :attr:`StartElementHandler` and :attr:`EndElementHandler` will be the
       
    76    concatenation of the namespace URI, the namespace separator character, and the
       
    77    local part of the name.  If the namespace separator is a zero byte (``chr(0)``)
       
    78    then the namespace URI and the local part will be concatenated without any
       
    79    separator.
       
    80 
       
    81    For example, if *namespace_separator* is set to a space character (``' '``) and
       
    82    the following document is parsed::
       
    83 
       
    84       <?xml version="1.0"?>
       
    85       <root xmlns    = "http://default-namespace.org/"
       
    86             xmlns:py = "http://www.python.org/ns/">
       
    87         <py:elem1 />
       
    88         <elem2 xmlns="" />
       
    89       </root>
       
    90 
       
    91    :attr:`StartElementHandler` will receive the following strings for each
       
    92    element::
       
    93 
       
    94       http://default-namespace.org/ root
       
    95       http://www.python.org/ns/ elem1
       
    96       elem2
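The expansion described above can be reproduced with a short script.  This is
a minimal sketch rather than part of the module: ``DOCUMENT``, ``names`` and
``record_start`` are illustrative names, and the code is written so that it
should also run on Python 3.

```python
import xml.parsers.expat

DOCUMENT = """<?xml version="1.0"?>
<root xmlns    = "http://default-namespace.org/"
      xmlns:py = "http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>"""

names = []

def record_start(name, attrs):
    # With namespace processing enabled, *name* arrives already expanded.
    names.append(name)

# Second positional argument is namespace_separator (a single space here).
parser = xml.parsers.expat.ParserCreate(None, ' ')
parser.StartElementHandler = record_start
parser.Parse(DOCUMENT, True)

for name in names:
    print(name)
```

Because ``elem2`` resets the default namespace with ``xmlns=""``, its name is
reported without a URI or separator.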
       
    97 
       
    98 
       
    99 .. seealso::
       
   100 
       
   101    `The Expat XML Parser <http://www.libexpat.org/>`_
       
   102       Home page of the Expat project.
       
   103 
       
   104 
       
   105 .. _xmlparser-objects:
       
   106 
       
   107 XMLParser Objects
       
   108 -----------------
       
   109 
       
   110 :class:`xmlparser` objects have the following methods:
       
   111 
       
   112 
       
   113 .. method:: xmlparser.Parse(data[, isfinal])
       
   114 
       
   115    Parses the contents of the string *data*, calling the appropriate handler
       
   116    functions to process the parsed data.  *isfinal* must be true on the final call
       
   117    to this method.  *data* can be the empty string at any time.
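As an illustration of incremental parsing with :meth:`Parse`, the sketch below
feeds a document in arbitrary pieces and finishes with an empty string and a
true *isfinal*.  The chunk boundaries and the ``events`` list are assumptions
made for the example; results are only inspected after the final call.

```python
import xml.parsers.expat

events = []
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: events.append(name)

# Chunks may split the document anywhere, even in the middle of a tag.
chunks = ['<doc><it', 'em/><item', '/></doc>']
for chunk in chunks:
    parser.Parse(chunk, False)

# An empty string with isfinal set tells Expat the document is complete.
parser.Parse('', True)
print(events)
```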
       
   118 
       
   119 
       
   120 .. method:: xmlparser.ParseFile(file)
       
   121 
       
   122    Parse XML data reading from the object *file*.  *file* only needs to provide
       
   123    the ``read(nbytes)`` method, returning the empty string when there's no more
       
   124    data.
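A minimal sketch of :meth:`ParseFile`, assuming an in-memory ``io.BytesIO``
object stands in for a real file (any object with a suitable ``read(nbytes)``
method would do; ``seen`` and ``source`` are illustrative names):

```python
import io
import xml.parsers.expat

seen = []
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: seen.append(name)

# The parser repeatedly calls source.read(nbytes) until it returns no data.
source = io.BytesIO(b'<doc><item/><item/></doc>')
parser.ParseFile(source)
print(seen)
```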
       
   125 
       
   126 
       
   127 .. method:: xmlparser.SetBase(base)
       
   128 
       
   129    Sets the base to be used for resolving relative URIs in system identifiers in
       
   130    declarations.  Resolving relative identifiers is left to the application: this
       
   131    value will be passed through as the *base* argument to the
       
   132    :func:`ExternalEntityRefHandler`, :func:`NotationDeclHandler`, and
       
   133    :func:`UnparsedEntityDeclHandler` functions.
       
   134 
       
   135 
       
   136 .. method:: xmlparser.GetBase()
       
   137 
       
   138    Returns a string containing the base set by a previous call to :meth:`SetBase`,
       
   139    or ``None`` if :meth:`SetBase` hasn't been called.
       
   140 
       
   141 
       
   142 .. method:: xmlparser.GetInputContext()
       
   143 
       
   144    Returns the input data that generated the current event as a string. The data is
       
   145    in the encoding of the entity which contains the text. When called while an
       
   146    event handler is not active, the return value is ``None``.
       
   147 
       
   148    .. versionadded:: 2.1
       
   149 
       
   150 
       
   151 .. method:: xmlparser.ExternalEntityParserCreate(context[, encoding])
       
   152 
       
   153    Create a "child" parser which can be used to parse an external parsed entity
       
   154    referred to by content parsed by the parent parser.  The *context* parameter
       
   155    should be the string passed to the :meth:`ExternalEntityRefHandler` handler
       
   156    function, described below. The child parser is created with the
       
   157    :attr:`ordered_attributes`, :attr:`returns_unicode` and
       
   158    :attr:`specified_attributes` set to the values of this parser.
       
   159 
       
   160 
       
   161 .. method:: xmlparser.UseForeignDTD([flag])
       
   162 
       
   163    Calling this with a true value for *flag* (the default) will cause Expat to call
       
   164    the :attr:`ExternalEntityRefHandler` with :const:`None` for all arguments to
       
   165    allow an alternate DTD to be loaded.  If the document does not contain a
       
   166    document type declaration, the :attr:`ExternalEntityRefHandler` will still be
       
   167    called, but the :attr:`StartDoctypeDeclHandler` and
       
   168    :attr:`EndDoctypeDeclHandler` will not be called.
       
   169 
       
   170    Passing a false value for *flag* will cancel a previous call that passed a true
       
   171    value, but otherwise has no effect.
       
   172 
       
   173    This method can only be called before the :meth:`Parse` or :meth:`ParseFile`
       
   174    methods are called; calling it after either of those has been called causes
       
   175    :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
       
   176    :const:`errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING`.
       
   177 
       
   178    .. versionadded:: 2.3
       
   179 
       
   180 :class:`xmlparser` objects have the following attributes:
       
   181 
       
   182 
       
   183 .. attribute:: xmlparser.buffer_size
       
   184 
       
   185    The size of the buffer used when :attr:`buffer_text` is true.  
       
   186    A new buffer size can be set by assigning a new integer value 
       
   187    to this attribute.  
       
   188    When the size is changed, the buffer will be flushed.
       
   189 
       
   190    .. versionadded:: 2.3
       
   191 
       
   192    .. versionchanged:: 2.6
       
   193       The buffer size can now be changed.
       
   194 
       
   195 .. attribute:: xmlparser.buffer_text
       
   196 
       
   197    Setting this to true causes the :class:`xmlparser` object to buffer textual
       
   198    content returned by Expat to avoid multiple calls to the
       
   199    :meth:`CharacterDataHandler` callback whenever possible.  This can improve
       
   200    performance substantially since Expat normally breaks character data into chunks
       
   201    at every line ending.  This attribute is false by default, and may be changed at
       
   202    any time.
       
   203 
       
   204    .. versionadded:: 2.3
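The effect of :attr:`buffer_text` can be observed by parsing the same text
with and without buffering.  This is a sketch with an illustrative helper,
``collect``; how many chunks the unbuffered parser delivers depends on Expat,
so the example only relies on the buffered case producing a single call.

```python
import xml.parsers.expat

def collect(buffered):
    """Return the list of strings passed to CharacterDataHandler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffered
    parser.CharacterDataHandler = chunks.append
    parser.Parse('<a>line one\nline two</a>', True)
    return chunks

unbuffered = collect(False)   # typically several chunks
buffered = collect(True)      # one chunk holding the whole text
print(len(unbuffered), len(buffered))
```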
       
   205 
       
   206 
       
   207 .. attribute:: xmlparser.buffer_used
       
   208 
       
   209    If :attr:`buffer_text` is enabled, the number of bytes stored in the buffer.
       
   210    These bytes represent UTF-8 encoded text.  This attribute has no meaningful
       
   211    interpretation when :attr:`buffer_text` is false.
       
   212 
       
   213    .. versionadded:: 2.3
       
   214 
       
   215 
       
   216 .. attribute:: xmlparser.ordered_attributes
       
   217 
       
   218    Setting this attribute to a non-zero integer causes the attributes to be
       
   219    reported as a list rather than a dictionary.  The attributes are presented in
       
   220    the order found in the document text.  For each attribute, two list entries are
       
   221    presented: the attribute name and the attribute value.  (Older versions of this
       
   222    module also used this format.)  By default, this attribute is false; it may be
       
   223    changed at any time.
       
   224 
       
   225    .. versionadded:: 2.1
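A short sketch of the list format produced when :attr:`ordered_attributes` is
enabled (``captured`` is an illustrative name):

```python
import xml.parsers.expat

captured = []

def start_element(name, attrs):
    # attrs is a flat list: [name1, value1, name2, value2, ...]
    captured.extend(attrs)

parser = xml.parsers.expat.ParserCreate()
parser.ordered_attributes = True
parser.StartElementHandler = start_element
parser.Parse('<e b="2" a="1"/>', True)
print(captured)
```

The names and values appear in document order, so ``b`` is reported before
``a`` here.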
       
   226 
       
   227 
       
   228 .. attribute:: xmlparser.returns_unicode
       
   229 
       
   230    If this attribute is set to a non-zero integer, the handler functions will be
       
   231    passed Unicode strings.  If :attr:`returns_unicode` is :const:`False`, 8-bit
       
   232    strings containing UTF-8 encoded data will be passed to the handlers.  This is
       
   233    :const:`True` by default when Python is built with Unicode support.
       
   234 
       
   235    .. versionchanged:: 1.6
       
   236       Can be changed at any time to affect the result type.
       
   237 
       
   238 
       
   239 .. attribute:: xmlparser.specified_attributes
       
   240 
       
   241    If set to a non-zero integer, the parser will report only those attributes which
       
   242    were specified in the document instance and not those which were derived from
       
   243    attribute declarations.  Applications which set this need to be especially
       
   244    careful to use what additional information is available from the declarations as
       
   245    needed to comply with the standards for the behavior of XML processors.  By
       
   246    default, this attribute is false; it may be changed at any time.
       
   247 
       
   248    .. versionadded:: 2.1
       
   249 
       
   250 The following attributes contain values relating to the most recent error
       
   251 encountered by an :class:`xmlparser` object, and will only have correct values
       
   252 once a call to :meth:`Parse` or :meth:`ParseFile` has raised a
       
   253 :exc:`xml.parsers.expat.ExpatError` exception.
       
   254 
       
   255 
       
   256 .. attribute:: xmlparser.ErrorByteIndex
       
   257 
       
   258    Byte index at which an error occurred.
       
   259 
       
   260 
       
   261 .. attribute:: xmlparser.ErrorCode
       
   262 
       
   263    Numeric code specifying the problem.  This value can be passed to the
       
   264    :func:`ErrorString` function, or compared to one of the constants defined in the
       
   265    ``errors`` object.
       
   266 
       
   267 
       
   268 .. attribute:: xmlparser.ErrorColumnNumber
       
   269 
       
   270    Column number at which an error occurred.
       
   271 
       
   272 
       
   273 .. attribute:: xmlparser.ErrorLineNumber
       
   274 
       
   275    Line number at which an error occurred.
       
   276 
       
   277 The following attributes contain values relating to the current parse location
       
   278 in an :class:`xmlparser` object.  During a callback reporting a parse event they
       
   279 indicate the location of the first of the sequence of characters that generated
       
   280 the event.  When called outside of a callback, the position indicated will be
       
   281 just past the last parse event (regardless of whether there was an associated
       
   282 callback).
       
   283 
       
   284 .. versionadded:: 2.4
       
   285 
       
   286 
       
   287 .. attribute:: xmlparser.CurrentByteIndex
       
   288 
       
   289    Current byte index in the parser input.
       
   290 
       
   291 
       
   292 .. attribute:: xmlparser.CurrentColumnNumber
       
   293 
       
   294    Current column number in the parser input.
       
   295 
       
   296 
       
   297 .. attribute:: xmlparser.CurrentLineNumber
       
   298 
       
   299    Current line number in the parser input.
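The position attributes are most useful when read from inside a callback.
The following sketch records where each start tag begins; ``positions`` and
``start_element`` are illustrative names.

```python
import xml.parsers.expat

positions = []
parser = xml.parsers.expat.ParserCreate()

def start_element(name, attrs):
    # During a callback the attributes point at the start of the event.
    positions.append((name,
                      parser.CurrentLineNumber,
                      parser.CurrentColumnNumber))

parser.StartElementHandler = start_element
parser.Parse('<a>\n  <b/>\n</a>', True)
print(positions)
```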
       
   300 
       
   301 Here is the list of handlers that can be set.  To set a handler on an
       
   302 :class:`xmlparser` object *o*, use ``o.handlername = func``.  *handlername* must
       
   303 be taken from the following list, and *func* must be a callable object accepting
       
   304 the correct number of arguments.  The arguments are all strings, unless
       
   305 otherwise stated.
       
   306 
       
   307 
       
   308 .. method:: xmlparser.XmlDeclHandler(version, encoding, standalone)
       
   309 
       
   310    Called when the XML declaration is parsed.  The XML declaration is the
       
   311    (optional) declaration of the applicable version of the XML recommendation, the
       
   312    encoding of the document text, and an optional "standalone" declaration.
       
   313    *version* and *encoding* will be strings of the type dictated by the
       
   314    :attr:`returns_unicode` attribute, and *standalone* will be ``1`` if the
       
   315    document is declared standalone, ``0`` if it is declared not to be standalone,
       
   316    or ``-1`` if the standalone clause was omitted. This is only available with
       
   317    Expat version 1.95.0 or newer.
       
   318 
       
   319    .. versionadded:: 2.1
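A minimal sketch of an :attr:`XmlDeclHandler` callback, assuming a document
that declares all three parts of the XML declaration (``declarations`` is an
illustrative name):

```python
import xml.parsers.expat

declarations = []

def xml_decl(version, encoding, standalone):
    declarations.append((version, encoding, standalone))

parser = xml.parsers.expat.ParserCreate()
parser.XmlDeclHandler = xml_decl
parser.Parse(
    b'<?xml version="1.0" encoding="utf-8" standalone="yes"?><doc/>',
    True)
print(declarations)
```

Here *standalone* is ``1`` because the declaration says ``standalone="yes"``.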
       
   320 
       
   321 
       
   322 .. method:: xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset)
       
   323 
       
   324    Called when Expat begins parsing the document type declaration (``<!DOCTYPE
       
   325    ...``).  The *doctypeName* is provided exactly as presented.  The *systemId* and
       
   326    *publicId* parameters give the system and public identifiers if specified, or
       
   327    ``None`` if omitted.  *has_internal_subset* will be true if the document
       
   328    contains an internal document declaration subset. This requires Expat version
       
   329    1.2 or newer.
       
   330 
       
   331 
       
   332 .. method:: xmlparser.EndDoctypeDeclHandler()
       
   333 
       
   334    Called when Expat is done parsing the document type declaration. This requires
       
   335    Expat version 1.2 or newer.
       
   336 
       
   337 
       
   338 .. method:: xmlparser.ElementDeclHandler(name, model)
       
   339 
       
   340    Called once for each element type declaration.  *name* is the name of the
       
   341    element type, and *model* is a representation of the content model.
       
   342 
       
   343 
       
   344 .. method:: xmlparser.AttlistDeclHandler(elname, attname, type, default, required)
       
   345 
       
   346    Called for each declared attribute for an element type.  If an attribute list
       
   347    declaration declares three attributes, this handler is called three times, once
       
   348    for each attribute.  *elname* is the name of the element to which the
       
   349    declaration applies and *attname* is the name of the attribute declared.  The
       
   350    attribute type is a string passed as *type*; the possible values are
       
   351    ``'CDATA'``, ``'ID'``, ``'IDREF'``, ... *default* gives the default value for
       
   352    the attribute used when the attribute is not specified by the document instance,
       
   353    or ``None`` if there is no default value (``#IMPLIED`` values).  If the
       
   354    attribute is required to be given in the document instance, *required* will be
       
   355    true. This requires Expat version 1.95.0 or newer.
       
   356 
       
   357 
       
   358 .. method:: xmlparser.StartElementHandler(name, attributes)
       
   359 
       
   360    Called for the start of every element.  *name* is a string containing the
       
   361    element name, and *attributes* is a dictionary mapping attribute names to their
       
   362    values.
       
   363 
       
   364 
       
   365 .. method:: xmlparser.EndElementHandler(name)
       
   366 
       
   367    Called for the end of every element.
       
   368 
       
   369 
       
   370 .. method:: xmlparser.ProcessingInstructionHandler(target, data)
       
   371 
       
   372    Called for every processing instruction.
       
   373 
       
   374 
       
   375 .. method:: xmlparser.CharacterDataHandler(data)
       
   376 
       
   377    Called for character data.  This will be called for normal character data, CDATA
       
   378    marked content, and ignorable whitespace.  Applications which must distinguish
       
   379    these cases can use the :attr:`StartCdataSectionHandler`,
       
   380    :attr:`EndCdataSectionHandler`, and :attr:`ElementDeclHandler` callbacks to
       
   381    collect the required information.
       
   382 
       
   383 
       
   384 .. method:: xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName)
       
   385 
       
   386    Called for unparsed (NDATA) entity declarations.  This is only present for
       
   387    version 1.2 of the Expat library; for more recent versions, use
       
   388    :attr:`EntityDeclHandler` instead.  (The underlying function in the Expat
       
   389    library has been declared obsolete.)
       
   390 
       
   391 
       
   392 .. method:: xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName)
       
   393 
       
   394    Called for all entity declarations.  For parameter and internal entities,
       
   395    *value* will be a string giving the declared contents of the entity; this will
       
   396    be ``None`` for external entities.  The *notationName* parameter will be
       
   397    ``None`` for parsed entities, and the name of the notation for unparsed
       
   398    entities. *is_parameter_entity* will be true if the entity is a parameter entity
       
   399    or false for general entities (most applications only need to be concerned with
       
   400    general entities). This is only available starting with version 1.95.0 of the
       
   401    Expat library.
       
   402 
       
   403    .. versionadded:: 2.1
       
   404 
       
   405 
       
   406 .. method:: xmlparser.NotationDeclHandler(notationName, base, systemId, publicId)
       
   407 
       
   408    Called for notation declarations.  *notationName*, *base*, *systemId*, and
       
   409    *publicId* are strings if given.  If the public identifier is omitted,
       
   410    *publicId* will be ``None``.
       
   411 
       
   412 
       
   413 .. method:: xmlparser.StartNamespaceDeclHandler(prefix, uri)
       
   414 
       
   415    Called when an element contains a namespace declaration.  Namespace declarations
       
   416    are processed before the :attr:`StartElementHandler` is called for the element
       
   417    on which declarations are placed.
       
   418 
       
   419 
       
   420 .. method:: xmlparser.EndNamespaceDeclHandler(prefix)
       
   421 
       
   422    Called when the closing tag is reached for an element that contained a
       
   423    namespace declaration.  This is called once for each namespace declaration on
       
   424    the element in the reverse of the order for which the
       
   425    :attr:`StartNamespaceDeclHandler` was called to indicate the start of each
       
   426    namespace declaration's scope.  Calls to this handler are made after the
       
   427    corresponding :attr:`EndElementHandler` for the end of the element.
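The ordering of namespace and element callbacks described above can be seen
in a small trace.  This sketch assumes namespace processing is enabled with a
space as the separator; ``events`` and the ``urn:x`` URI are illustrative.

```python
import xml.parsers.expat

events = []
# Namespace handlers are only called when namespace processing is enabled.
parser = xml.parsers.expat.ParserCreate(None, ' ')
parser.StartNamespaceDeclHandler = (
    lambda prefix, uri: events.append(('ns-start', prefix, uri)))
parser.EndNamespaceDeclHandler = (
    lambda prefix: events.append(('ns-end', prefix)))
parser.StartElementHandler = (
    lambda name, attrs: events.append(('start', name)))
parser.EndElementHandler = (
    lambda name: events.append(('end', name)))

parser.Parse('<r xmlns:p="urn:x"><p:c/></r>', True)
for event in events:
    print(event)
```

The declaration is reported before the start of ``r`` and its end is reported
after the end of ``r``.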
       
   428 
       
   429 
       
   430 .. method:: xmlparser.CommentHandler(data)
       
   431 
       
   432    Called for comments.  *data* is the text of the comment, excluding the leading
       
   433    '``<!-``\ ``-``' and trailing '``-``\ ``->``'.
       
   434 
       
   435 
       
   436 .. method:: xmlparser.StartCdataSectionHandler()
       
   437 
       
   438    Called at the start of a CDATA section.  This and :attr:`EndCdataSectionHandler`
       
   439    are needed to be able to identify the syntactical start and end for CDATA
       
   440    sections.
       
   441 
       
   442 
       
   443 .. method:: xmlparser.EndCdataSectionHandler()
       
   444 
       
   445    Called at the end of a CDATA section.
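To show how the CDATA handlers bracket the character data they enclose, the
sketch below records all three kinds of event.  It enables
:attr:`buffer_text` so that each run of text arrives in one call; ``events``
is an illustrative name.

```python
import xml.parsers.expat

events = []
parser = xml.parsers.expat.ParserCreate()
parser.buffer_text = True
parser.CharacterDataHandler = lambda data: events.append(('text', data))
parser.StartCdataSectionHandler = lambda: events.append(('cdata-start',))
parser.EndCdataSectionHandler = lambda: events.append(('cdata-end',))

parser.Parse('<a>x<![CDATA[<y>]]>z</a>', True)
for event in events:
    print(event)
```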
       
   446 
       
   447 
       
   448 .. method:: xmlparser.DefaultHandler(data)
       
   449 
       
   450    Called for any characters in the XML document for which no applicable handler
       
   451    has been specified.  This means characters that are part of a construct which
       
   452    could be reported, but for which no handler has been supplied.
       
   453 
       
   454 
       
   455 .. method:: xmlparser.DefaultHandlerExpand(data)
       
   456 
       
   457    This is the same as the :func:`DefaultHandler`, but doesn't inhibit expansion
       
   458    of internal entities. The entity reference will not be passed to the default
       
   459    handler.
       
   460 
       
   461 
       
   462 .. method:: xmlparser.NotStandaloneHandler()
       
   463 
       
   464    Called if the XML document hasn't been declared as being a standalone document.
       
   465    This happens when there is an external subset or a reference to a parameter
       
   466    entity, but the XML declaration does not set standalone to ``yes``.  If
       
   467    this handler returns ``0``, then the parser will raise an
       
   468    :const:`XML_ERROR_NOT_STANDALONE` error.  If this handler is not set, no
       
   469    exception is raised by the parser for this condition.
       
   470 
       
   471 
       
   472 .. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)
       
   473 
       
   474    Called for references to external entities.  *base* is the current base, as set
       
   475    by a previous call to :meth:`SetBase`.  The public and system identifiers,
       
   476    *systemId* and *publicId*, are strings if given; if the public identifier is not
       
   477    given, *publicId* will be ``None``.  The *context* value is opaque and should
       
   478    only be used as described below.
       
   479 
       
   480    For external entities to be parsed, this handler must be implemented. It is
       
   481    responsible for creating the sub-parser using
       
   482    ``ExternalEntityParserCreate(context)``, initializing it with the appropriate
       
   483    callbacks, and parsing the entity.  This handler should return an integer; if it
       
   484    returns ``0``, the parser will throw an
       
   485    :const:`XML_ERROR_EXTERNAL_ENTITY_HANDLING` error, otherwise parsing will
       
   486    continue.
       
   487 
       
   488    If this handler is not provided, external entities are reported by the
       
   489    :attr:`DefaultHandler` callback, if provided.
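The steps above (create a sub-parser from *context*, give it callbacks, parse
the entity, return a non-zero value) might be sketched as follows.  The
``ENTITIES`` dictionary is a hypothetical stand-in for whatever mechanism an
application uses to fetch the entity; a real handler would resolve *systemId*
against *base* itself.

```python
import xml.parsers.expat

DOCUMENT = ('<!DOCTYPE doc [<!ENTITY ext SYSTEM "ext.xml">]>'
            '<doc>&ext;</doc>')

# Hypothetical entity store used only for this example.
ENTITIES = {'ext.xml': 'text from the external entity'}

text = []
parser = xml.parsers.expat.ParserCreate()
parser.CharacterDataHandler = text.append

def external_entity_ref(context, base, system_id, public_id):
    # The opaque context is only passed on to ExternalEntityParserCreate.
    child = parser.ExternalEntityParserCreate(context)
    child.CharacterDataHandler = text.append
    child.Parse(ENTITIES[system_id], True)
    return 1  # non-zero: the parent parser continues

parser.ExternalEntityRefHandler = external_entity_ref
parser.Parse(DOCUMENT, True)
print(''.join(text))
```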
       
   490 
       
   491 
       
   492 .. _expaterror-objects:
       
   493 
       
   494 ExpatError Exceptions
       
   495 ---------------------
       
   496 
       
   497 .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
       
   498 
       
   499 
       
   500 :exc:`ExpatError` exceptions have a number of interesting attributes:
       
   501 
       
   502 
       
   503 .. attribute:: ExpatError.code
       
   504 
       
   505    Expat's internal error number for the specific error.  This will match one of
       
   506    the constants defined in the ``errors`` object from this module.
       
   507 
       
   508    .. versionadded:: 2.1
       
   509 
       
   510 
       
   511 .. attribute:: ExpatError.lineno
       
   512 
       
   513    Line number on which the error was detected.  The first line is numbered ``1``.
       
   514 
       
   515    .. versionadded:: 2.1
       
   516 
       
   517 
       
   518 .. attribute:: ExpatError.offset
       
   519 
       
   520    Character offset into the line where the error occurred.  The first column is
       
   521    numbered ``0``.
       
   522 
       
   523    .. versionadded:: 2.1
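A short sketch of how these attributes might be used when reporting a parse
failure.  The malformed input is an assumption chosen to trigger a mismatched
tag; ``error`` and ``message`` are illustrative names.

```python
import xml.parsers.expat

error = None
message = None
parser = xml.parsers.expat.ParserCreate()
try:
    # The second line closes <a> while <b> is still open.
    parser.Parse('<a>\n<b></a>', True)
except xml.parsers.expat.ExpatError as exc:
    error = exc
    # ErrorString() turns the numeric code into Expat's description.
    message = xml.parsers.expat.ErrorString(exc.code)
    print('%s at line %d, column %d' % (message, exc.lineno, exc.offset))
```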
       
   524 
       
   525 
       
   526 .. _expat-example:
       
   527 
       
   528 Example
       
   529 -------
       
   530 
       
   531 The following program defines three handlers that just print out their
       
   532 arguments. ::
       
   533 
       
   534    import xml.parsers.expat
       
   535 
       
   536    # 3 handler functions
       
   537    def start_element(name, attrs):
       
   538        print 'Start element:', name, attrs
       
   539    def end_element(name):
       
   540        print 'End element:', name
       
   541    def char_data(data):
       
   542        print 'Character data:', repr(data)
       
   543 
       
   544    p = xml.parsers.expat.ParserCreate()
       
   545 
       
   546    p.StartElementHandler = start_element
       
   547    p.EndElementHandler = end_element
       
   548    p.CharacterDataHandler = char_data
       
   549 
       
   550    p.Parse("""<?xml version="1.0"?>
       
   551    <parent id="top"><child1 name="paul">Text goes here</child1>
       
   552    <child2 name="fred">More text</child2>
       
   553    </parent>""", 1)
       
   554 
       
   555 The output from this program is::
       
   556 
       
   557    Start element: parent {'id': 'top'}
       
   558    Start element: child1 {'name': 'paul'}
       
   559    Character data: 'Text goes here'
       
   560    End element: child1
       
   561    Character data: '\n'
       
   562    Start element: child2 {'name': 'fred'}
       
   563    Character data: 'More text'
       
   564    End element: child2
       
   565    Character data: '\n'
       
   566    End element: parent
       
   567 
       
   568 
       
   569 .. _expat-content-models:
       
   570 
       
   571 Content Model Descriptions
       
   572 --------------------------
       
   573 
       
   574 .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
       
   575 
       
   576 
       
   577 Content models are described using nested tuples.  Each tuple contains four
       
   578 values: the type, the quantifier, the name, and a tuple of children.  Children
       
   579 are simply additional content model descriptions.
       
   580 
       
   581 The values of the first two fields are constants defined in the ``model`` object
       
   582 of the :mod:`xml.parsers.expat` module.  These constants can be collected in two
       
   583 groups: the model type group and the quantifier group.
       
   584 
       
   585 The constants in the model type group are:
       
   586 
       
   587 
       
   588 .. data:: XML_CTYPE_ANY
       
   589    :noindex:
       
   590 
       
   591    The element named by the model name was declared to have a content model of
       
   592    ``ANY``.
       
   593 
       
   594 
       
   595 .. data:: XML_CTYPE_CHOICE
       
   596    :noindex:
       
   597 
       
   598    The named element allows a choice from a number of options; this is used for
       
   599    content models such as ``(A | B | C)``.
       
   600 
       
   601 
       
   602 .. data:: XML_CTYPE_EMPTY
       
   603    :noindex:
       
   604 
       
   605    Elements which are declared to be ``EMPTY`` have this model type.
       
   606 
       
   607 
       
   608 .. data:: XML_CTYPE_MIXED
       
   609    :noindex:
       
   610 
       
   611 
       
   612 .. data:: XML_CTYPE_NAME
       
   613    :noindex:
       
   614 
       
   615 
       
   616 .. data:: XML_CTYPE_SEQ
       
   617    :noindex:
       
   618 
       
   619    Models which represent a series of models which follow one after the other are
       
   620    indicated with this model type.  This is used for models such as ``(A, B, C)``.
       
   621 
       
   622 The constants in the quantifier group are:
       
   623 
       
   624 
       
   625 .. data:: XML_CQUANT_NONE
       
   626    :noindex:
       
   627 
       
   628    No modifier is given, so it can appear exactly once, as for ``A``.
       
   629 
       
   630 
       
   631 .. data:: XML_CQUANT_OPT
       
   632    :noindex:
       
   633 
       
   634    The model is optional: it can appear once or not at all, as for ``A?``.
       
   635 
       
   636 
       
   637 .. data:: XML_CQUANT_PLUS
       
   638    :noindex:
       
   639 
       
   640    The model must occur one or more times, as for ``A+``.
       
   641 
       
   642 
       
   643 .. data:: XML_CQUANT_REP
       
   644    :noindex:
       
   645 
       
   646    The model may occur zero or more times, as for ``A*``.
       
   647 
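As an illustration of how the two groups of constants fit together, the
following sketch turns the nested model tuples passed to
:attr:`ElementDeclHandler` back into DTD-like notation.  It assumes the
four-item tuple layout ``(type, quantifier, name, children)``; the helper name
``describe`` and the sample DTD are made up for this example.

```python
import xml.parsers.expat as expat


def describe(model):
    """Render a content-model tuple in DTD-like notation (illustrative helper)."""
    mtype, quant, name, children = model
    m = expat.model
    # Map each quantifier constant to the suffix used in a DTD.
    suffix = {
        m.XML_CQUANT_NONE: "",
        m.XML_CQUANT_OPT: "?",
        m.XML_CQUANT_PLUS: "+",
        m.XML_CQUANT_REP: "*",
    }[quant]
    if mtype == m.XML_CTYPE_EMPTY:
        return "EMPTY"
    if mtype == m.XML_CTYPE_ANY:
        return "ANY"
    if mtype == m.XML_CTYPE_NAME:
        return name + suffix
    parts = [describe(child) for child in children]
    if mtype == m.XML_CTYPE_MIXED:
        # Mixed content always lists #PCDATA first.
        return "(" + " | ".join(["#PCDATA"] + parts) + ")" + suffix
    # Remaining types are XML_CTYPE_CHOICE and XML_CTYPE_SEQ.
    separator = " | " if mtype == m.XML_CTYPE_CHOICE else ", "
    return "(" + separator.join(parts) + ")" + suffix


declared = {}


def on_element_decl(name, model):
    declared[name] = describe(model)


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse("""<!DOCTYPE doc [
<!ELEMENT doc (head, (para | note)+, tail?)>
<!ELEMENT head EMPTY>
<!ELEMENT para (#PCDATA | em)*>
]>
<doc><head/><para/></doc>""", True)
```

After parsing, ``declared`` maps each declared element name to a string built
from the model type and quantifier constants.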
       
   648 
       
   649 .. _expat-errors:
       
   650 
       
   651 Expat error constants
       
   652 ---------------------
       
   653 
       
   654 The following constants are provided in the ``errors`` object of the
       
   655 :mod:`xml.parsers.expat` module.  These constants are useful in interpreting
       
   656 some of the attributes of the :exc:`ExpatError` exception objects raised when an
       
   657 error has occurred.
       
   658 
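A minimal sketch of that use follows.  It assumes that the attributes of
``errors`` hold the same message strings that :func:`ErrorString` returns for
the numeric :attr:`code` of an :exc:`ExpatError`; the helper name ``classify``
is hypothetical.

```python
import xml.parsers.expat as expat


def classify(document):
    """Return (message, line, column) for a malformed document, else None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        # err.code is numeric; ErrorString() gives the matching message.
        return expat.ErrorString(err.code), err.lineno, err.offset
    return None


result = classify("<a><b></a>")
# Assumption: the constant is the message string for this error.
is_mismatch = (result is not None and
               result[0] == expat.errors.XML_ERROR_TAG_MISMATCH)
```

Comparing against the named constant avoids hard-coding either the numeric
code or the message text in application code.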
       
   659 The ``errors`` object has the following attributes:
       
   660 
       
   661 
       
   662 .. data:: XML_ERROR_ASYNC_ENTITY
       
   663    :noindex:
       
   664 
       
   665 
       
   666 .. data:: XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
       
   667    :noindex:
       
   668 
       
   669    An entity reference in an attribute value referred to an external entity instead
       
   670    of an internal entity.
       
   671 
       
   672 
       
   673 .. data:: XML_ERROR_BAD_CHAR_REF
       
   674    :noindex:
       
   675 
       
   676    A character reference referred to a character which is illegal in XML (for
       
   677    example, character ``0``, or '``&#0;``').
       
   678 
       
   679 
       
   680 .. data:: XML_ERROR_BINARY_ENTITY_REF
       
   681    :noindex:
       
   682 
       
   683    An entity reference referred to an entity which was declared with a notation, so
       
   684    cannot be parsed.
       
   685 
       
   686 
       
   687 .. data:: XML_ERROR_DUPLICATE_ATTRIBUTE
       
   688    :noindex:
       
   689 
       
   690    An attribute was used more than once in a start tag.
       
   691 
       
   692 
       
   693 .. data:: XML_ERROR_INCORRECT_ENCODING
       
   694    :noindex:
       
   695 
       
   696 
       
   697 .. data:: XML_ERROR_INVALID_TOKEN
       
   698    :noindex:
       
   699 
       
   700    Raised when an input byte could not properly be assigned to a character; for
       
   701    example, a NUL byte (value ``0``) in a UTF-8 input stream.
       
   702 
       
   703 
       
   704 .. data:: XML_ERROR_JUNK_AFTER_DOC_ELEMENT
       
   705    :noindex:
       
   706 
       
   707    Something other than whitespace occurred after the document element.
       
   708 
       
   709 
       
   710 .. data:: XML_ERROR_MISPLACED_XML_PI
       
   711    :noindex:
       
   712 
       
   713    An XML declaration was found somewhere other than the start of the input data.
       
   714 
       
   715 
       
   716 .. data:: XML_ERROR_NO_ELEMENTS
       
   717    :noindex:
       
   718 
       
   719    The document contains no elements (XML requires all documents to contain exactly
       
   720    one top-level element).
       
   721 
       
   722 
       
   723 .. data:: XML_ERROR_NO_MEMORY
       
   724    :noindex:
       
   725 
       
   726    Expat was not able to allocate memory internally.
       
   727 
       
   728 
       
   729 .. data:: XML_ERROR_PARAM_ENTITY_REF
       
   730    :noindex:
       
   731 
       
   732    A parameter entity reference was found where it was not allowed.
       
   733 
       
   734 
       
   735 .. data:: XML_ERROR_PARTIAL_CHAR
       
   736    :noindex:
       
   737 
       
   738    An incomplete character was found in the input.
       
   739 
       
   740 
       
   741 .. data:: XML_ERROR_RECURSIVE_ENTITY_REF
       
   742    :noindex:
       
   743 
       
   744    An entity reference contained another reference to the same entity, possibly via
       
   745    a different name, and possibly indirectly.
       
   746 
       
   747 
       
   748 .. data:: XML_ERROR_SYNTAX
       
   749    :noindex:
       
   750 
       
   751    Some unspecified syntax error was encountered.
       
   752 
       
   753 
       
   754 .. data:: XML_ERROR_TAG_MISMATCH
       
   755    :noindex:
       
   756 
       
   757    An end tag did not match the innermost open start tag.
       
   758 
       
   759 
       
   760 .. data:: XML_ERROR_UNCLOSED_TOKEN
       
   761    :noindex:
       
   762 
       
   763    Some token (such as a start tag) was not closed before the end of the stream or
       
   764    the next token was encountered.
       
   765 
       
   766 
       
   767 .. data:: XML_ERROR_UNDEFINED_ENTITY
       
   768    :noindex:
       
   769 
       
   770    A reference was made to an entity which was not defined.
       
   771 
       
   772 
       
   773 .. data:: XML_ERROR_UNKNOWN_ENCODING
       
   774    :noindex:
       
   775 
       
   776    The document encoding is not supported by Expat.
       
   777 
       
   778 
       
   779 .. data:: XML_ERROR_UNCLOSED_CDATA_SECTION
       
   780    :noindex:
       
   781 
       
   782    A CDATA marked section was not closed.
       
   783 
       
   784 
       
   785 .. data:: XML_ERROR_EXTERNAL_ENTITY_HANDLING
       
   786    :noindex:
       
   787 
       
   788 
       
   789 .. data:: XML_ERROR_NOT_STANDALONE
       
   790    :noindex:
       
   791 
       
   792    The parser determined that the document was not "standalone" even though its
       
   793    XML declaration said it was, and the :attr:`NotStandaloneHandler` was
       
   794    set and returned ``0``.
       
   795 
       
   796 
       
   797 .. data:: XML_ERROR_UNEXPECTED_STATE
       
   798    :noindex:
       
   799 
       
   800 
       
   801 .. data:: XML_ERROR_ENTITY_DECLARED_IN_PE
       
   802    :noindex:
       
   803 
       
   804 
       
   805 .. data:: XML_ERROR_FEATURE_REQUIRES_XML_DTD
       
   806    :noindex:
       
   807 
       
   808    An operation was requested that requires DTD support to be compiled in, but
       
   809    Expat was configured without DTD support.  This should never be reported by a
       
   810    standard build of the :mod:`xml.parsers.expat` module.
       
   811 
       
   812 
       
   813 .. data:: XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING
       
   814    :noindex:
       
   815 
       
   816    A behavioral change was requested after parsing had started, but the change can
       
   817    only be made before parsing starts.  This is (currently) only raised by
       
   818    :meth:`UseForeignDTD`.
       
   819 
       
   820 
       
   821 .. data:: XML_ERROR_UNBOUND_PREFIX
       
   822    :noindex:
       
   823 
       
   824    An undeclared prefix was found when namespace processing was enabled.
       
   825 
       
   826 
       
   827 .. data:: XML_ERROR_UNDECLARING_PREFIX
       
   828    :noindex:
       
   829 
       
   830    The document attempted to remove the namespace declaration associated with a
       
   831    prefix.
       
   832 
       
   833 
       
   834 .. data:: XML_ERROR_INCOMPLETE_PE
       
   835    :noindex:
       
   836 
       
   837    A parameter entity contained incomplete markup.
       
   838 
       
   839 
       
   840 .. data:: XML_ERROR_XML_DECL
       
   841    :noindex:
       
   842 
       
   843    The XML declaration was not well-formed.
       
   844 
       
   845 
       
   846 .. data:: XML_ERROR_TEXT_DECL
       
   847    :noindex:
       
   848 
       
   849    There was an error parsing a text declaration in an external entity.
       
   850 
       
   851 
       
   852 .. data:: XML_ERROR_PUBLICID
       
   853    :noindex:
       
   854 
       
   855    Characters were found in the public id that are not allowed.
       
   856 
       
   857 
       
   858 .. data:: XML_ERROR_SUSPENDED
       
   859    :noindex:
       
   860 
       
   861    An operation that is not allowed on a suspended parser was requested.  This
       
   862    includes attempts to provide additional input or to stop the parser.
       
   863 
       
   864 
       
   865 .. data:: XML_ERROR_NOT_SUSPENDED
       
   866    :noindex:
       
   867 
       
   868    An attempt to resume the parser was made when the parser had not been suspended.
       
   869 
       
   870 
       
   871 .. data:: XML_ERROR_ABORTED
       
   872    :noindex:
       
   873 
       
   874    This should not be reported to Python applications.
       
   875 
       
   876 
       
   877 .. data:: XML_ERROR_FINISHED
       
   878    :noindex:
       
   879 
       
   880    An operation that is not allowed on a parser which has finished parsing input
       
   881    was requested.  This includes attempts to provide additional input or to
       
   882    stop the parser.
       
   883 
       
   884 
       
   885 .. data:: XML_ERROR_SUSPEND_PE
       
   886    :noindex:
       
   887 
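To see which of the constants above corresponds to a given failure, a reverse
lookup from message string to constant name can be built.  This is only a
sketch: it assumes that each ``XML_ERROR_*`` attribute of ``errors`` is a
distinct message string matching what :func:`ErrorString` returns, and the
names ``NAME_BY_MESSAGE`` and ``error_name`` are invented for the example.

```python
import xml.parsers.expat as expat

# Reverse lookup: message string -> constant name (assumes unique messages).
NAME_BY_MESSAGE = dict(
    (getattr(expat.errors, name), name)
    for name in dir(expat.errors)
    if name.startswith("XML_ERROR_")
)


def error_name(document):
    """Return the XML_ERROR_* name for a malformed document, or None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        return NAME_BY_MESSAGE.get(expat.ErrorString(err.code))
    return None


samples = [
    "",                     # no document element at all
    "<a/><b/>",             # second element after the document element
    "<a x='1' x='2'/>",     # attribute repeated in one start tag
    "<a>&undefined;</a>",   # reference to an entity that was never declared
    "<a/>",                 # well-formed, so no error is expected
]
names = [error_name(doc) for doc in samples]
```

Each entry of ``names`` is the constant name for the corresponding sample, or
``None`` where the sample parsed without error.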
       
   888 
       
   889 .. rubric:: Footnotes
       
   890 
       
   891 .. [#] The encoding string included in XML output should conform to the
       
   892    appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
       
   893    not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
       
   894    and http://www.iana.org/assignments/character-sets .
       
   895