symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/struct.rst
changeset 1 2fb8b9db1c86
0:ffa851df0825 1:2fb8b9db1c86
       
     1 
       
     2 :mod:`struct` --- Interpret strings as packed binary data
       
     3 =========================================================
       
     4 
       
     5 .. module:: struct
       
     6    :synopsis: Interpret strings as packed binary data.
       
     7 
       
     8 .. index::
       
     9    pair: C; structures
       
    10    triple: packing; binary; data
       
    11 
       
    12 This module performs conversions between Python values and C structs represented
       
    13 as Python strings.  It uses :dfn:`format strings` (explained below) as compact
       
     14 descriptions of the layout of the C structs and the intended conversion to/from
       
    15 Python values.  This can be used in handling binary data stored in files or from
       
    16 network connections, among other sources.
       
    17 
       
    18 The module defines the following exception and functions:
       
    19 
       
    20 
       
    21 .. exception:: error
       
    22 
       
    23    Exception raised on various occasions; argument is a string describing what is
       
    24    wrong.
       
    25 
       
    26 
       
    27 .. function:: pack(fmt, v1, v2, ...)
       
    28 
       
    29    Return a string containing the values ``v1, v2, ...`` packed according to the
       
    30    given format.  The arguments must match the values required by the format
       
    31    exactly.
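As a minimal sketch of :func:`pack`, assuming a Python 3 interpreter (where the packed result is a ``bytes`` object rather than a ``str``), the following uses the ``'<'`` prefix described later in this section so that the output does not depend on the host machine. The variable names are illustrative only.

```python
import struct

# '<' selects little-endian byte order with standard sizes, so the
# result is the same on every machine: two 2-byte shorts and a 4-byte long.
packed = struct.pack('<hhl', 1, 2, 3)
print(packed)

# The arguments must match the format exactly; supplying too few
# values raises struct.error.
try:
    struct.pack('<hhl', 1, 2)
    mismatch_rejected = False
except struct.error:
    mismatch_rejected = True
print(mismatch_rejected)
```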
       
    32 
       
    33 
       
    34 .. function:: pack_into(fmt, buffer, offset, v1, v2, ...)
       
    35 
       
     36    Pack the values ``v1, v2, ...`` according to the given format and write the packed
       
    37    bytes into the writable *buffer* starting at *offset*. Note that the offset is
       
    38    a required argument.
       
    39 
       
    40    .. versionadded:: 2.5
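A short sketch of :func:`pack_into`, assuming Python 3 and using a ``bytearray`` as the writable buffer (the buffer size and offset are arbitrary choices for illustration):

```python
import struct

# An 8-byte writable buffer, initially all zero bytes.
buf = bytearray(8)

# Write two little-endian unsigned shorts starting at offset 2.
# The four packed bytes occupy positions 2 through 5; the rest is untouched.
struct.pack_into('<HH', buf, 2, 1, 2)
print(bytes(buf))
```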
       
    41 
       
    42 
       
    43 .. function:: unpack(fmt, string)
       
    44 
       
    45    Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the
       
    46    given format.  The result is a tuple even if it contains exactly one item.  The
       
    47    string must contain exactly the amount of data required by the format
       
    48    (``len(string)`` must equal ``calcsize(fmt)``).
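The length requirement and the tuple result can be illustrated with a small sketch, again assuming Python 3 and an explicit ``'<'`` prefix for reproducible sizes:

```python
import struct

fmt = '<hhl'
data = struct.pack(fmt, 1, 2, 3)

# The result is always a tuple, even for a single item.
values = struct.unpack(fmt, data)
single = struct.unpack('<h', data[:2])
print(values, single)

# One extra byte makes len(data) differ from calcsize(fmt),
# so unpack() raises struct.error.
try:
    struct.unpack(fmt, data + b'\x00')
    extra_rejected = False
except struct.error:
    extra_rejected = True
print(extra_rejected)
```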
       
    49 
       
    50 
       
     51 .. function:: unpack_from(fmt, buffer[, offset=0])
       
    52 
       
     53    Unpack the *buffer* according to the given format. The result is a tuple even
       
    54    if it contains exactly one item. The *buffer* must contain at least the amount
       
    55    of data required by the format (``len(buffer[offset:])`` must be at least
       
    56    ``calcsize(fmt)``).
       
    57 
       
    58    .. versionadded:: 2.5
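Unlike :func:`unpack`, :func:`unpack_from` tolerates surrounding data. The sketch below assumes Python 3; the header and trailer bytes are made up for the example.

```python
import struct

header = b'\xff\xff'                  # two bytes to skip
payload = struct.pack('<HH', 7, 9)    # the four bytes of interest
trailer = b'tail'                     # extra data after the payload is allowed
buf = header + payload + trailer

# Start reading at offset 2; only calcsize('<HH') bytes are consumed.
fields = struct.unpack_from('<HH', buf, 2)
print(fields)
```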
       
    59 
       
    60 
       
    61 .. function:: calcsize(fmt)
       
    62 
       
    63    Return the size of the struct (and hence of the string) corresponding to the
       
    64    given format.
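For instance, with standard sizes (selected here by the ``'<'`` prefix, an assumption made so the numbers are the same everywhere), the sizes add up as follows under Python 3:

```python
import struct

size_hhl = struct.calcsize('<hhl')         # 2 + 2 + 4
size_record = struct.calcsize('<10sHHb')   # 10 + 2 + 2 + 1
print(size_hhl, size_record)

# The packed string always has exactly the calculated length.
packed_len = len(struct.pack('<hhl', 1, 2, 3))
print(packed_len == size_hhl)
```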
       
    65 
       
    66 Format characters have the following meaning; the conversion between C and
       
    67 Python values should be obvious given their types:
       
    68 
       
    69 +--------+-------------------------+--------------------+-------+
       
    70 | Format | C Type                  | Python             | Notes |
       
    71 +========+=========================+====================+=======+
       
    72 | ``x``  | pad byte                | no value           |       |
       
    73 +--------+-------------------------+--------------------+-------+
       
    74 | ``c``  | :ctype:`char`           | string of length 1 |       |
       
    75 +--------+-------------------------+--------------------+-------+
       
    76 | ``b``  | :ctype:`signed char`    | integer            |       |
       
    77 +--------+-------------------------+--------------------+-------+
       
    78 | ``B``  | :ctype:`unsigned char`  | integer            |       |
       
    79 +--------+-------------------------+--------------------+-------+
       
    80 | ``?``  | :ctype:`_Bool`          | bool               | \(1)  |
       
    81 +--------+-------------------------+--------------------+-------+
       
    82 | ``h``  | :ctype:`short`          | integer            |       |
       
    83 +--------+-------------------------+--------------------+-------+
       
    84 | ``H``  | :ctype:`unsigned short` | integer            |       |
       
    85 +--------+-------------------------+--------------------+-------+
       
    86 | ``i``  | :ctype:`int`            | integer            |       |
       
    87 +--------+-------------------------+--------------------+-------+
       
    88 | ``I``  | :ctype:`unsigned int`   | integer or long    |       |
       
    89 +--------+-------------------------+--------------------+-------+
       
    90 | ``l``  | :ctype:`long`           | integer            |       |
       
    91 +--------+-------------------------+--------------------+-------+
       
    92 | ``L``  | :ctype:`unsigned long`  | long               |       |
       
    93 +--------+-------------------------+--------------------+-------+
       
    94 | ``q``  | :ctype:`long long`      | long               | \(2)  |
       
    95 +--------+-------------------------+--------------------+-------+
       
    96 | ``Q``  | :ctype:`unsigned long   | long               | \(2)  |
       
    97 |        | long`                   |                    |       |
       
    98 +--------+-------------------------+--------------------+-------+
       
    99 | ``f``  | :ctype:`float`          | float              |       |
       
   100 +--------+-------------------------+--------------------+-------+
       
   101 | ``d``  | :ctype:`double`         | float              |       |
       
   102 +--------+-------------------------+--------------------+-------+
       
   103 | ``s``  | :ctype:`char[]`         | string             |       |
       
   104 +--------+-------------------------+--------------------+-------+
       
   105 | ``p``  | :ctype:`char[]`         | string             |       |
       
   106 +--------+-------------------------+--------------------+-------+
       
   107 | ``P``  | :ctype:`void \*`        | long               |       |
       
   108 +--------+-------------------------+--------------------+-------+
       
   109 
       
   110 Notes:
       
   111 
       
   112 (1)
       
   113    The ``'?'`` conversion code corresponds to the :ctype:`_Bool` type defined by
       
   114    C99. If this type is not available, it is simulated using a :ctype:`char`. In
       
   115    standard mode, it is always represented by one byte.
       
   116 
       
   117    .. versionadded:: 2.6
       
   118 
       
   119 (2)
       
   120    The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if
       
   121    the platform C compiler supports C :ctype:`long long`, or, on Windows,
       
   122    :ctype:`__int64`.  They are always available in standard modes.
       
   123 
       
   124    .. versionadded:: 2.2
       
   125 
       
   126 A format character may be preceded by an integral repeat count.  For example,
       
   127 the format string ``'4h'`` means exactly the same as ``'hhhh'``.
       
   128 
       
    129 Whitespace characters between formats are ignored; however, a count and its
       
    130 format must not be separated by whitespace.
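The repeat count and whitespace rules can be checked with a brief sketch (Python 3 assumed):

```python
import struct

# '4h' is shorthand for 'hhhh'.
with_count = struct.pack('<4h', 1, 2, 3, 4)
spelled_out = struct.pack('<hhhh', 1, 2, 3, 4)
print(with_count == spelled_out)

# Whitespace between formats is ignored...
spaced = struct.pack('<2h 2h', 1, 2, 3, 4)
print(spaced == with_count)

# ...but whitespace between a count and its format is an error.
try:
    struct.calcsize('<4 h')
    split_count_rejected = False
except struct.error:
    split_count_rejected = True
print(split_count_rejected)
```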
       
   131 
       
   132 For the ``'s'`` format character, the count is interpreted as the size of the
       
    133 string, not a repeat count as for the other format characters; for example,
       
   134 ``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters.
       
   135 For packing, the string is truncated or padded with null bytes as appropriate to
       
   136 make it fit. For unpacking, the resulting string always has exactly the
       
   137 specified number of bytes.  As a special case, ``'0s'`` means a single, empty
       
   138 string (while ``'0c'`` means 0 characters).
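A sketch of the ``'s'`` rules, assuming Python 3, where string arguments and results are ``bytes``:

```python
import struct

# Packing pads with null bytes or truncates to the given size.
padded = struct.pack('<5s', b'ab')
truncated = struct.pack('<3s', b'abcdef')
print(padded, truncated)

# '10s' yields one 10-byte string; '10c' yields ten 1-byte items.
one_string = struct.unpack('<10s', b'raymond   ')
ten_chars = struct.unpack('<10c', b'raymond   ')
print(len(one_string), len(ten_chars))

# '0s' is a single empty string; '0c' is no items at all.
empty_string = struct.unpack('<0s', b'')
no_chars = struct.unpack('<0c', b'')
print(empty_string, no_chars)
```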
       
   139 
       
   140 The ``'p'`` format character encodes a "Pascal string", meaning a short
       
   141 variable-length string stored in a fixed number of bytes. The count is the total
       
   142 number of bytes stored.  The first byte stored is the length of the string, or
       
   143 255, whichever is smaller.  The bytes of the string follow.  If the string
       
   144 passed in to :func:`pack` is too long (longer than the count minus 1), only the
       
   145 leading count-1 bytes of the string are stored.  If the string is shorter than
       
   146 count-1, it is padded with null bytes so that exactly count bytes in all are
       
   147 used.  Note that for :func:`unpack`, the ``'p'`` format character consumes count
       
   148 bytes, but that the string returned can never contain more than 255 characters.
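The Pascal string layout can be seen in a small sketch (Python 3 assumed; the counts 5 and 3 are arbitrary):

```python
import struct

# Count 5: one length byte, the two data bytes, then two null pad bytes.
short_value = struct.pack('<5p', b'ab')
print(short_value)

# Count 3: only count-1 = 2 data bytes fit, so the string is cut to b'ab'
# and the length byte records 2.
long_value = struct.pack('<3p', b'abcdef')
print(long_value)

# Unpacking consumes all 5 bytes but returns only the stored string.
unpacked = struct.unpack('<5p', short_value)
print(unpacked)
```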
       
   149 
       
   150 For the ``'I'``, ``'L'``, ``'q'`` and ``'Q'`` format characters, the return
       
   151 value is a Python long integer.
       
   152 
       
   153 For the ``'P'`` format character, the return value is a Python integer or long
       
   154 integer, depending on the size needed to hold a pointer when it has been cast to
       
   155 an integer type.  A *NULL* pointer will always be returned as the Python integer
       
   156 ``0``. When packing pointer-sized values, Python integer or long integer objects
       
   157 may be used.  For example, the Alpha and Merced processors use 64-bit pointer
       
   158 values, meaning a Python long integer will be used to hold the pointer; other
       
   159 platforms use 32-bit pointers and will use a Python integer.
       
   160 
       
   161 For the ``'?'`` format character, the return value is either :const:`True` or
       
   162 :const:`False`. When packing, the truth value of the argument object is used.
       
   163 Either 0 or 1 in the native or standard bool representation will be packed, and
       
   164 any non-zero value will be True when unpacking.
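A sketch of the ``'?'`` behaviour in standard mode (selected with ``'<'``), assuming Python 3:

```python
import struct

# Packing uses the truth value of the argument, whatever its type.
from_nonempty = struct.pack('<?', 'nonempty string')
from_empty = struct.pack('<?', [])
print(from_nonempty, from_empty)

# Any non-zero byte unpacks as True.
from_five = struct.unpack('<?', b'\x05')
from_zero = struct.unpack('<?', b'\x00')
print(from_five, from_zero)
```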
       
   165 
       
   166 By default, C numbers are represented in the machine's native format and byte
       
   167 order, and properly aligned by skipping pad bytes if necessary (according to the
       
   168 rules used by the C compiler).
       
   169 
       
   170 Alternatively, the first character of the format string can be used to indicate
       
   171 the byte order, size and alignment of the packed data, according to the
       
   172 following table:
       
   173 
       
   174 +-----------+------------------------+--------------------+
       
   175 | Character | Byte order             | Size and alignment |
       
   176 +===========+========================+====================+
       
   177 | ``@``     | native                 | native             |
       
   178 +-----------+------------------------+--------------------+
       
   179 | ``=``     | native                 | standard           |
       
   180 +-----------+------------------------+--------------------+
       
   181 | ``<``     | little-endian          | standard           |
       
   182 +-----------+------------------------+--------------------+
       
   183 | ``>``     | big-endian             | standard           |
       
   184 +-----------+------------------------+--------------------+
       
   185 | ``!``     | network (= big-endian) | standard           |
       
   186 +-----------+------------------------+--------------------+
       
   187 
       
   188 If the first character is not one of these, ``'@'`` is assumed.
       
   189 
       
   190 Native byte order is big-endian or little-endian, depending on the host system.
       
   191 For example, Motorola and Sun processors are big-endian; Intel and DEC
       
   192 processors are little-endian.
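The byte order characters can be compared directly. This sketch assumes Python 3 and uses ``sys.byteorder`` to find out which order the host uses:

```python
import struct
import sys

little = struct.pack('<H', 1)    # least significant byte first
big = struct.pack('>H', 1)       # most significant byte first
network = struct.pack('!H', 1)   # same as big-endian
native = struct.pack('=H', 1)    # whatever the host uses
print(little, big, network, native)

# '=' follows the host byte order reported by sys.byteorder.
expected_native = little if sys.byteorder == 'little' else big
print(native == expected_native)
```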
       
   193 
       
   194 Native size and alignment are determined using the C compiler's
       
   195 ``sizeof`` expression.  This is always combined with native byte order.
       
   196 
       
   197 Standard size and alignment are as follows: no alignment is required for any
       
   198 type (so you have to use pad bytes); :ctype:`short` is 2 bytes; :ctype:`int` and
       
   199 :ctype:`long` are 4 bytes; :ctype:`long long` (:ctype:`__int64` on Windows) is 8
       
   200 bytes; :ctype:`float` and :ctype:`double` are 32-bit and 64-bit IEEE floating
       
   201 point numbers, respectively. :ctype:`_Bool` is 1 byte.
       
   202 
       
   203 Note the difference between ``'@'`` and ``'='``: both use native byte order, but
       
    204 the size and alignment of the latter are standardized.
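The difference shows up in :func:`calcsize`. In this sketch (Python 3 assumed) the standard sizes are fixed, while the native values depend on the platform's C compiler and are therefore only printed, not assumed:

```python
import struct

standard_long = struct.calcsize('=l')    # always 4
native_long = struct.calcsize('@l')      # sizeof(long) on this platform
print(standard_long, native_long)

standard_pair = struct.calcsize('=bl')   # 1 + 4, no padding
native_pair = struct.calcsize('@bl')     # 1 + pad bytes + sizeof(long)
print(standard_pair, native_pair)

# With no prefix, '@' is assumed.
default_pair = struct.calcsize('bl')
print(default_pair == native_pair)
```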
       
   205 
       
   206 The form ``'!'`` is available for those poor souls who claim they can't remember
       
   207 whether network byte order is big-endian or little-endian.
       
   208 
       
   209 There is no way to indicate non-native byte order (force byte-swapping); use the
       
   210 appropriate choice of ``'<'`` or ``'>'``.
       
   211 
       
   212 The ``'P'`` format character is only available for the native byte ordering
       
   213 (selected as the default or with the ``'@'`` byte order character). The byte
       
   214 order character ``'='`` chooses to use little- or big-endian ordering based on
       
   215 the host system. The struct module does not interpret this as native ordering,
       
   216 so the ``'P'`` format is not available.
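A brief sketch of this restriction, assuming Python 3 (the pointer size printed depends on the platform):

```python
import struct

# 'P' works with native ordering, the default.
pointer_size = struct.calcsize('P')
null_pointer = struct.unpack('P', struct.pack('P', 0))
print(pointer_size, null_pointer)

# With '=' the format is rejected, even though the byte order is the host's.
try:
    struct.calcsize('=P')
    standard_p_rejected = False
except struct.error:
    standard_p_rejected = True
print(standard_p_rejected)
```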
       
   217 
       
   218 Examples (all using native byte order, size and alignment, on a big-endian
       
    219 machine with a 4-byte :ctype:`long`)::
       
   220 
       
   221    >>> from struct import *
       
   222    >>> pack('hhl', 1, 2, 3)
       
   223    '\x00\x01\x00\x02\x00\x00\x00\x03'
       
   224    >>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
       
   225    (1, 2, 3)
       
   226    >>> calcsize('hhl')
       
   227    8
       
   228 
       
   229 Hint: to align the end of a structure to the alignment requirement of a
       
   230 particular type, end the format with the code for that type with a repeat count
       
   231 of zero.  For example, the format ``'llh0l'`` specifies two pad bytes at the
       
   232 end, assuming longs are aligned on 4-byte boundaries.  This only works when
       
    233 native size and alignment are in effect; standard size and alignment do not
       
   234 enforce any alignment.
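The effect of the trailing ``0l`` can be observed with :func:`calcsize`. This is a sketch assuming Python 3; the native sizes depend on the platform, so only the standard-mode results are stated as fixed.

```python
import struct

# Native mode: '0l' adds no item but pads the end to a long boundary.
unpadded = struct.calcsize('@llh')
padded = struct.calcsize('@llh0l')
print(unpadded, padded)

# Standard mode: no alignment is enforced, so '0l' changes nothing.
standard_unpadded = struct.calcsize('=llh')   # 4 + 4 + 2
standard_padded = struct.calcsize('=llh0l')
print(standard_unpadded, standard_padded)
```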
       
   235 
       
   236 Unpacked fields can be named by assigning them to variables or by wrapping
       
   237 the result in a named tuple::
       
   238 
       
   239     >>> record = 'raymond   \x32\x12\x08\x01\x08'
       
   240     >>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)
       
   241 
       
   242     >>> from collections import namedtuple
       
   243     >>> Student = namedtuple('Student', 'name serialnum school gradelevel')
       
    244     >>> Student._make(unpack('<10sHHb', record))
       
   245     Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
       
   246 
       
   247 .. seealso::
       
   248 
       
   249    Module :mod:`array`
       
   250       Packed binary storage of homogeneous data.
       
   251 
       
   252    Module :mod:`xdrlib`
       
   253       Packing and unpacking of XDR data.
       
   254 
       
   255 
       
   256 .. _struct-objects:
       
   257 
       
   258 Struct Objects
       
   259 --------------
       
   260 
       
   261 The :mod:`struct` module also defines the following type:
       
   262 
       
   263 
       
   264 .. class:: Struct(format)
       
   265 
       
   266    Return a new Struct object which writes and reads binary data according to the
       
   267    format string *format*.  Creating a Struct object once and calling its methods
       
   268    is more efficient than calling the :mod:`struct` functions with the same format
       
   269    since the format string only needs to be compiled once.
       
   270 
       
   271    .. versionadded:: 2.5
       
   272 
       
   273    Compiled Struct objects support the following methods and attributes:
       
   274 
       
   275 
       
   276    .. method:: pack(v1, v2, ...)
       
   277 
       
   278       Identical to the :func:`pack` function, using the compiled format.
       
   279       (``len(result)`` will equal :attr:`self.size`.)
       
   280 
       
   281 
       
   282    .. method:: pack_into(buffer, offset, v1, v2, ...)
       
   283 
       
   284       Identical to the :func:`pack_into` function, using the compiled format.
       
   285 
       
   286 
       
   287    .. method:: unpack(string)
       
   288 
       
   289       Identical to the :func:`unpack` function, using the compiled format.
       
   290       (``len(string)`` must equal :attr:`self.size`).
       
   291 
       
   292 
       
   293    .. method:: unpack_from(buffer[, offset=0])
       
   294 
       
   295       Identical to the :func:`unpack_from` function, using the compiled format.
       
   296       (``len(buffer[offset:])`` must be at least :attr:`self.size`).
       
   297 
       
   298 
       
   299    .. attribute:: format
       
   300 
       
   301       The format string used to construct this Struct object.
       
   302 
       
   303    .. attribute:: size
       
   304 
       
   305       The calculated size of the struct (and hence of the string) corresponding
       
   306       to :attr:`format`.
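As a minimal sketch of reusing a compiled format, the student record from the earlier example can be handled with a single Struct object. Python 3 is assumed, so the record is written as a ``bytes`` literal; the name ``student`` is illustrative.

```python
import struct

# Compile the format once and reuse it for every record.
student = struct.Struct('<10sHHb')
print(student.size)

record = b'raymond   \x32\x12\x08\x01\x08'
fields = student.unpack(record)
print(fields)

# Packing the unpacked fields reproduces the original record.
repacked = student.pack(*fields)
print(repacked == record)
```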
       
   307