symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/urllib2.rst
changeset 1 2fb8b9db1c86
       
     1 :mod:`urllib2` --- extensible library for opening URLs
       
     2 ======================================================
       
     3 
       
     4 .. module:: urllib2
       
     5    :synopsis: Next generation URL opening library.
       
     6 .. moduleauthor:: Jeremy Hylton <jhylton@users.sourceforge.net>
       
     7 .. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net>
       
     8 
       
     9 
       
    10 .. note::
       
    11    The :mod:`urllib2` module has been split across several modules in
       
    12    Python 3.0 named :mod:`urllib.request` and :mod:`urllib.error`.
       
    13    The :term:`2to3` tool will automatically adapt imports when converting
       
    14    your sources to 3.0.
       
    15 
       
    16 
       
    17 The :mod:`urllib2` module defines functions and classes which help in opening
       
    18 URLs (mostly HTTP) in a complex world --- basic and digest authentication,
       
    19 redirections, cookies and more.
       
    20 
       
    21 The :mod:`urllib2` module defines the following functions:
       
    22 
       
    23 
       
    24 .. function:: urlopen(url[, data][, timeout])
       
    25 
       
    26    Open the URL *url*, which can be either a string or a :class:`Request` object.
       
    27 
       
    28    *data* may be a string specifying additional data to send to the server, or
       
    29    ``None`` if no such data is needed.  Currently HTTP requests are the only ones
       
    30    that use *data*; the HTTP request will be a POST instead of a GET when the
       
    31    *data* parameter is provided.  *data* should be a buffer in the standard
       
    32    :mimetype:`application/x-www-form-urlencoded` format.  The
       
    33    :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and
       
    34    returns a string in this format.
       
    35 
       
    36    The optional *timeout* parameter specifies a timeout in seconds for blocking
       
    37    operations like the connection attempt (if not specified, the global default
       
    38    timeout setting will be used).  This actually only works for HTTP, HTTPS,
       
    39    FTP and FTPS connections.
       
    40 
       
    41    This function returns a file-like object with two additional methods:
       
    42 
       
    43    * :meth:`geturl` --- return the URL of the resource retrieved, commonly used to
       
    44      determine if a redirect was followed
       
    45 
       
    46    * :meth:`info` --- return the meta-information of the page, such as headers, in
       
    47      the form of an ``httplib.HTTPMessage`` instance
       
    48      (see `Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_)
       
    49 
       
    50    Raises :exc:`URLError` on errors.
       
    51 
       
    52    Note that ``None`` may be returned if no handler handles the request (though the
       
    53    default installed global :class:`OpenerDirector` uses :class:`UnknownHandler` to
       
    54    ensure this never happens).
       
    55 
       
    56    .. versionchanged:: 2.6
       
    57       *timeout* was added.
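The error path of :func:`urlopen` can be exercised without any network access. The following is a minimal sketch, not a definitive recipe: ``nosuchscheme`` is a made-up URL scheme, ``try_open`` is a hypothetical helper, and the import fallback is an assumption added so that the same code also runs on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


def try_open(url):
    """Return (response, error); exactly one of the two is None."""
    try:
        return urllib2.urlopen(url, timeout=5), None
    except urllib2.URLError as exc:
        return None, exc


# No handler knows the made-up scheme, so the default opener's
# UnknownHandler raises URLError before any connection is attempted.
response, error = try_open("nosuchscheme://example.invalid/")
print(error.reason)
```

Catching :exc:`URLError` also catches :exc:`HTTPError`, since the latter is a subclass.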
       
    58 
       
    59 
       
    60 .. function:: install_opener(opener)
       
    61 
       
    62    Install an :class:`OpenerDirector` instance as the default global opener.
       
    63    Installing an opener is only necessary if you want :func:`urlopen` to use that opener;
       
    64    otherwise, simply call :meth:`OpenerDirector.open` instead of :func:`urlopen`.
       
    65    The code does not check for a real :class:`OpenerDirector`, and any class with
       
    66    the appropriate interface will work.
       
    67 
       
    68 
       
    69 .. function:: build_opener([handler, ...])
       
    70 
       
    71    Return an :class:`OpenerDirector` instance, which chains the handlers in the
       
    72    order given. *handler*\s can be either instances of :class:`BaseHandler`, or
       
    73    subclasses of :class:`BaseHandler` (in which case it must be possible to call
       
    74    the constructor without any parameters).  Instances of the following classes
       
    75    will be in front of the *handler*\s, unless the *handler*\s contain them,
       
    76    instances of them or subclasses of them: :class:`ProxyHandler`,
       
    77    :class:`UnknownHandler`, :class:`HTTPHandler`, :class:`HTTPDefaultErrorHandler`,
       
    78    :class:`HTTPRedirectHandler`, :class:`FTPHandler`, :class:`FileHandler`,
       
    79    :class:`HTTPErrorProcessor`.
       
    80 
       
    81    If the Python installation has SSL support (i.e., if the :mod:`ssl` module can be imported),
       
    82    :class:`HTTPSHandler` will also be added.
       
    83 
       
    84    Beginning in Python 2.3, a :class:`BaseHandler` subclass may also change its
       
    85    :attr:`handler_order` member variable to modify its position in the handlers
       
    86    list.
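The effect of :attr:`handler_order` can be sketched as follows. ``EarlyHandler``, ``LateHandler`` and the ``demo`` scheme are hypothetical names; the ``handlers`` list of the opener is an implementation attribute that this section does not document, and it is read here only to inspect the resulting order. The import fallback is an assumption for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class EarlyHandler(urllib2.BaseHandler):
    handler_order = 50          # lower values sort nearer the front

    def demo_open(self, req):   # a recognised method name is needed to register
        return None


class LateHandler(urllib2.BaseHandler):
    handler_order = 900         # after the default handlers (most use 500)

    def demo_open(self, req):
        return None


# A class and an instance are both accepted; the defaults are added as well.
opener = urllib2.build_opener(LateHandler, EarlyHandler())
names = [type(h).__name__ for h in opener.handlers]
print(names)
```

The argument order passed to :func:`build_opener` does not matter here; the position of each handler follows from its :attr:`handler_order`.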
       
    87 
       
    88 The following exceptions are raised as appropriate:
       
    89 
       
    90 
       
    91 .. exception:: URLError
       
    92 
       
    93    The handlers raise this exception (or derived exceptions) when they run into a
       
    94    problem.  It is a subclass of :exc:`IOError`.
       
    95 
       
    96    .. attribute:: reason
       
    97 
       
    98       The reason for this error.  It can be a message string or another exception
       
    99       instance (:exc:`socket.error` for remote URLs, :exc:`OSError` for local
       
   100       URLs).
       
   101 
       
   102 
       
   103 .. exception:: HTTPError
       
   104 
       
   105    Although it is an exception (a subclass of :exc:`URLError`), an :exc:`HTTPError`
       
   106    can also function as a non-exceptional file-like return value (the same thing
       
   107    that :func:`urlopen` returns).  This is useful when handling exotic HTTP
       
   108    errors, such as requests for authentication.
       
   109 
       
   110    .. attribute:: code
       
   111 
       
   112       An HTTP status code as defined in `RFC 2616 <http://www.faqs.org/rfcs/rfc2616.html>`_. 
       
   113       This numeric value corresponds to a value found in the dictionary of
       
   114       codes as found in :attr:`BaseHTTPServer.BaseHTTPRequestHandler.responses`.
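Because :exc:`HTTPError` is a subclass of :exc:`URLError`, code that distinguishes the two has to test for :exc:`HTTPError` first. The sketch below builds both exceptions by hand so that it needs no network; ``describe`` is a hypothetical helper, the URL is a placeholder, and the import fallback is an assumption for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
    from BaseHTTPServer import BaseHTTPRequestHandler
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalents
    from http.server import BaseHTTPRequestHandler


def describe(exc):
    """Summarise a URLError or HTTPError in one line."""
    if isinstance(exc, urllib2.HTTPError):
        # responses maps a status code to (short message, long message).
        short = BaseHTTPRequestHandler.responses[exc.code][0]
        return "HTTP %d: %s" % (exc.code, short)
    return "URL error: %s" % (exc.reason,)


not_found = urllib2.HTTPError(
    "http://www.example.com/missing", 404, "Not Found", {}, None)
unreachable = urllib2.URLError("connection refused")

print(describe(not_found))
print(describe(unreachable))
```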
       
   115 
       
   116 
       
   117 
       
   118 The following classes are provided:
       
   119 
       
   120 
       
   121 .. class:: Request(url[, data][, headers][, origin_req_host][, unverifiable])
       
   122 
       
   123    This class is an abstraction of a URL request.
       
   124 
       
   125    *url* should be a string containing a valid URL.
       
   126 
       
   127    *data* may be a string specifying additional data to send to the server, or
       
   128    ``None`` if no such data is needed.  Currently HTTP requests are the only ones
       
   129    that use *data*; the HTTP request will be a POST instead of a GET when the
       
   130    *data* parameter is provided.  *data* should be a buffer in the standard
       
   131    :mimetype:`application/x-www-form-urlencoded` format.  The
       
   132    :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and
       
   133    returns a string in this format.
       
   134 
       
   135    *headers* should be a dictionary, and will be treated as if :meth:`add_header`
       
   136    was called with each key and value as arguments.  This is often used to "spoof"
       
   137    the ``User-Agent`` header, which is used by a browser to identify itself --
       
   138    some HTTP servers only allow requests coming from common browsers as opposed
       
   139    to scripts.  For example, Mozilla Firefox may identify itself as ``"Mozilla/5.0
       
   140    (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"``, while :mod:`urllib2`'s
       
   141    default user agent string is ``"Python-urllib/2.6"`` (on Python 2.6).
       
   142 
       
   143    The final two arguments are only of interest for correct handling of third-party
       
   144    HTTP cookies:
       
   145 
       
   146    *origin_req_host* should be the request-host of the origin transaction, as
       
   147    defined by :rfc:`2965`.  It defaults to ``cookielib.request_host(self)``.  This
       
   148    is the host name or IP address of the original request that was initiated by the
       
   149    user.  For example, if the request is for an image in an HTML document, this
       
   150    should be the request-host of the request for the page containing the image.
       
   151 
       
   152    *unverifiable* should indicate whether the request is unverifiable, as defined
       
   153    by :rfc:`2965`.  It defaults to ``False``.  An unverifiable request is one whose URL
       
   154    the user did not have the option to approve.  For example, if the request is for
       
   155    an image in an HTML document, and the user had no option to approve the
       
   156    automatic fetching of the image, this should be true.
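A :class:`Request` can be built and inspected without sending anything. This sketch shows how supplying *data* switches the method to ``POST``; the URL and the agent string are placeholders, and the import fallback is an assumption for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
    from urllib import urlencode
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalents
    from urllib.parse import urlencode

# A sequence of 2-tuples keeps the field order predictable.
form = urlencode([("q", "python"), ("page", "2")])

req = urllib2.Request(
    "http://www.example.com/search",              # placeholder URL
    data=form.encode("ascii"),                    # supplying data makes this a POST
    headers={"User-Agent": "ExampleClient/1.0"},  # hypothetical agent string
)
plain = urllib2.Request("http://www.example.com/search")

print(req.get_method())
print(plain.get_method())
```

The request is only sent once it is passed to :func:`urlopen` or to :meth:`OpenerDirector.open`.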
       
   157 
       
   158 
       
   159 .. class:: OpenerDirector()
       
   160 
       
   161    The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
       
   162    together. It manages the chaining of handlers, and recovery from errors.
       
   163 
       
   164 
       
   165 .. class:: BaseHandler()
       
   166 
       
   167    This is the base class for all registered handlers --- and handles only the
       
   168    simple mechanics of registration.
       
   169 
       
   170 
       
   171 .. class:: HTTPDefaultErrorHandler()
       
   172 
       
   173    A class which defines a default handler for HTTP error responses; all responses
       
   174    are turned into :exc:`HTTPError` exceptions.
       
   175 
       
   176 
       
   177 .. class:: HTTPRedirectHandler()
       
   178 
       
   179    A class to handle redirections.
       
   180 
       
   181 
       
   182 .. class:: HTTPCookieProcessor([cookiejar])
       
   183 
       
   184    A class to handle HTTP Cookies.
       
   185 
       
   186 
       
   187 .. class:: ProxyHandler([proxies])
       
   188 
       
   189    Cause requests to go through a proxy. If *proxies* is given, it must be a
       
   190    dictionary mapping protocol names to URLs of proxies. The default is to read the
       
   191    list of proxies from the environment variables :envvar:`<protocol>_proxy`.
       
   192    To disable an autodetected proxy, pass an empty dictionary.
       
   193 
       
   194 
       
   195 .. class:: HTTPPasswordMgr()
       
   196 
       
   197    Keep a database of ``(realm, uri) -> (user, password)`` mappings.
       
   198 
       
   199 
       
   200 .. class:: HTTPPasswordMgrWithDefaultRealm()
       
   201 
       
   202    Keep a database of ``(realm, uri) -> (user, password)`` mappings. A realm of
       
   203    ``None`` is considered a catch-all realm, which is searched if no other realm
       
   204    fits.
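The difference between the two password managers can be sketched with the ``add_password`` and ``find_user_password`` methods of the password manager interface (see :ref:`http-password-mgr`). The credentials, realm name and URLs are placeholders, and the import fallback is an assumption for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

plain = urllib2.HTTPPasswordMgr()
fallback = urllib2.HTTPPasswordMgrWithDefaultRealm()
for mgr in (plain, fallback):
    # Both store the entry under realm None; only the second class
    # treats that realm as a catch-all during lookup.
    mgr.add_password(None, "http://www.example.com/", "alice", "s3cret")

uri = "http://www.example.com/private/report"
plain_result = plain.find_user_password("Staff Area", uri)
fallback_result = fallback.find_user_password("Staff Area", uri)
print(plain_result)
print(fallback_result)
```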
       
   205 
       
   206 
       
   207 .. class:: AbstractBasicAuthHandler([password_mgr])
       
   208 
       
   209    This is a mixin class that helps with HTTP authentication, both to the remote
       
   210    host and to a proxy. *password_mgr*, if given, should be something that is
       
   211    compatible with :class:`HTTPPasswordMgr`; refer to section
       
   212    :ref:`http-password-mgr` for information on the interface that must be
       
   213    supported.
       
   214 
       
   215 
       
   216 .. class:: HTTPBasicAuthHandler([password_mgr])
       
   217 
       
   218    Handle authentication with the remote host. *password_mgr*, if given, should be
       
   219    something that is compatible with :class:`HTTPPasswordMgr`; refer to section
       
   220    :ref:`http-password-mgr` for information on the interface that must be
       
   221    supported.
       
   222 
       
   223 
       
   224 .. class:: ProxyBasicAuthHandler([password_mgr])
       
   225 
       
   226    Handle authentication with the proxy. *password_mgr*, if given, should be
       
   227    something that is compatible with :class:`HTTPPasswordMgr`; refer to section
       
   228    :ref:`http-password-mgr` for information on the interface that must be
       
   229    supported.
       
   230 
       
   231 
       
   232 .. class:: AbstractDigestAuthHandler([password_mgr])
       
   233 
       
   234    This is a mixin class that helps with HTTP authentication, both to the remote
       
   235    host and to a proxy. *password_mgr*, if given, should be something that is
       
   236    compatible with :class:`HTTPPasswordMgr`; refer to section
       
   237    :ref:`http-password-mgr` for information on the interface that must be
       
   238    supported.
       
   239 
       
   240 
       
   241 .. class:: HTTPDigestAuthHandler([password_mgr])
       
   242 
       
   243    Handle authentication with the remote host. *password_mgr*, if given, should be
       
   244    something that is compatible with :class:`HTTPPasswordMgr`; refer to section
       
   245    :ref:`http-password-mgr` for information on the interface that must be
       
   246    supported.
       
   247 
       
   248 
       
   249 .. class:: ProxyDigestAuthHandler([password_mgr])
       
   250 
       
   251    Handle authentication with the proxy. *password_mgr*, if given, should be
       
   252    something that is compatible with :class:`HTTPPasswordMgr`; refer to section
       
   253    :ref:`http-password-mgr` for information on the interface that must be
       
   254    supported.
       
   255 
       
   256 
       
   257 .. class:: HTTPHandler()
       
   258 
       
   259    A class to handle opening of HTTP URLs.
       
   260 
       
   261 
       
   262 .. class:: HTTPSHandler()
       
   263 
       
   264    A class to handle opening of HTTPS URLs.
       
   265 
       
   266 
       
   267 .. class:: FileHandler()
       
   268 
       
   269    Open local files.
       
   270 
       
   271 
       
   272 .. class:: FTPHandler()
       
   273 
       
   274    Open FTP URLs.
       
   275 
       
   276 
       
   277 .. class:: CacheFTPHandler()
       
   278 
       
   279    Open FTP URLs, keeping a cache of open FTP connections to minimize delays.
       
   280 
       
   281 
       
   282 .. class:: UnknownHandler()
       
   283 
       
   284    A catch-all class to handle unknown URLs.
       
   285 
       
   286 
       
   287 .. _request-objects:
       
   288 
       
   289 Request Objects
       
   290 ---------------
       
   291 
       
   292 The following methods describe all of :class:`Request`'s public interface, and
       
   293 so all must be overridden in subclasses.
       
   294 
       
   295 
       
   296 .. method:: Request.add_data(data)
       
   297 
       
   298    Set the :class:`Request` data to *data*.  This is ignored by all handlers except
       
   299    HTTP handlers --- and there it should be a byte string, and will change the
       
   300    request to be ``POST`` rather than ``GET``.
       
   301 
       
   302 
       
   303 .. method:: Request.get_method()
       
   304 
       
   305    Return a string indicating the HTTP request method.  This is only meaningful for
       
   306    HTTP requests, and currently always returns ``'GET'`` or ``'POST'``.
       
   307 
       
   308 
       
   309 .. method:: Request.has_data()
       
   310 
       
   311    Return whether the instance has data other than ``None``.
       
   312 
       
   313 
       
   314 .. method:: Request.get_data()
       
   315 
       
   316    Return the instance's data.
       
   317 
       
   318 
       
   319 .. method:: Request.add_header(key, val)
       
   320 
       
   321    Add another header to the request.  Headers are currently ignored by all
       
   322    handlers except HTTP handlers, where they are added to the list of headers sent
       
   323    to the server.  Note that there cannot be more than one header with the same
       
   324    name, and later calls will overwrite previous calls in case the *key* collides.
       
   325    Currently, this is no loss of HTTP functionality, since all headers which have
       
   326    meaning when used more than once have a (header-specific) way of gaining the
       
   327    same functionality using only one header.
       
   328 
       
   329 
       
   330 .. method:: Request.add_unredirected_header(key, header)
       
   331 
       
   332    Add a header that will not be added to a redirected request.
       
   333 
       
   334    .. versionadded:: 2.4
       
   335 
       
   336 
       
   337 .. method:: Request.has_header(header)
       
   338 
       
   339    Return whether the instance has the named header (checks both regular and
       
   340    unredirected).
       
   341 
       
   342    .. versionadded:: 2.4
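The header methods above can be tried on a request that is never sent. This is a sketch under a few assumptions: the URL and header values are placeholders, ``get_header`` and the ``headers`` dictionary exist on :class:`Request` but are not described in this section, and the import fallback is there for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

req = urllib2.Request("http://www.example.com/")
req.add_header("Accept", "text/html")
req.add_header("Accept", "application/json")   # replaces the earlier value
req.add_unredirected_header("Authorization", "Basic placeholder")

# has_header() consults both the regular and the unredirected headers,
# while the regular headers dictionary holds only the former.
print(req.get_header("Accept"))
print(req.has_header("Authorization"))
print("Authorization" in req.headers)
```

Header names are stored in the form produced by ``str.capitalize()``, so a lookup for ``User-Agent`` has to use the spelling ``User-agent``.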
       
   343 
       
   344 
       
   345 .. method:: Request.get_full_url()
       
   346 
       
   347    Return the URL given in the constructor.
       
   348 
       
   349 
       
   350 .. method:: Request.get_type()
       
   351 
       
   352    Return the type of the URL --- also known as the scheme.
       
   353 
       
   354 
       
   355 .. method:: Request.get_host()
       
   356 
       
   357    Return the host to which a connection will be made.
       
   358 
       
   359 
       
   360 .. method:: Request.get_selector()
       
   361 
       
   362    Return the selector --- the part of the URL that is sent to the server.
       
   363 
       
   364 
       
   365 .. method:: Request.set_proxy(host, type)
       
   366 
       
   367    Prepare the request by connecting to a proxy server. The *host* and *type* will
       
   368    replace those of the instance, and the instance's selector will be the original
       
   369    URL given in the constructor.
       
   370 
       
   371 
       
   372 .. method:: Request.get_origin_req_host()
       
   373 
       
   374    Return the request-host of the origin transaction, as defined by :rfc:`2965`.
       
   375    See the documentation for the :class:`Request` constructor.
       
   376 
       
   377 
       
   378 .. method:: Request.is_unverifiable()
       
   379 
       
   380    Return whether the request is unverifiable, as defined by :rfc:`2965`. See the
       
   381    documentation for the :class:`Request` constructor.
       
   382 
       
   383 
       
   384 .. _opener-director-objects:
       
   385 
       
   386 OpenerDirector Objects
       
   387 ----------------------
       
   388 
       
   389 :class:`OpenerDirector` instances have the following methods:
       
   390 
       
   391 
       
   392 .. method:: OpenerDirector.add_handler(handler)
       
   393 
       
   394    *handler* should be an instance of :class:`BaseHandler`.  The following methods
       
   395    are searched, and added to the possible chains (note that HTTP errors are a
       
   396    special case).
       
   397 
       
   398    * :meth:`protocol_open` --- signal that the handler knows how to open *protocol*
       
   399      URLs.
       
   400 
       
   401    * :meth:`http_error_type` --- signal that the handler knows how to handle HTTP
       
   402      errors with HTTP error code *type*.
       
   403 
       
   404    * :meth:`protocol_error` --- signal that the handler knows how to handle errors
       
   405      from (non-\ ``http``) *protocol*.
       
   406 
       
   407    * :meth:`protocol_request` --- signal that the handler knows how to pre-process
       
   408      *protocol* requests.
       
   409 
       
   410    * :meth:`protocol_response` --- signal that the handler knows how to
       
   411      post-process *protocol* responses.
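The naming convention above can be sketched with a handler for a made-up ``demo`` scheme. ``DemoHandler`` and ``DemoResponse`` are hypothetical, the response object is deliberately minimal, and the import fallback is an assumption for running on Python 3.

```python
import io

try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class DemoResponse(io.BytesIO):
    """Minimal file-like response offering geturl() and info()."""

    def __init__(self, body, url):
        io.BytesIO.__init__(self, body)
        self._url = url

    def geturl(self):
        return self._url

    def info(self):
        return {}   # a real handler would return header metadata here


class DemoHandler(urllib2.BaseHandler):
    # The method name "demo_open" is what registers this handler for demo: URLs.
    def demo_open(self, req):
        return DemoResponse(b"hello from demo", req.get_full_url())


opener = urllib2.OpenerDirector()
opener.add_handler(DemoHandler())

body = opener.open("demo://anything").read()
unhandled = opener.open("other://anything")   # no handler for this scheme
print(body)
print(unhandled)
```

A bare :class:`OpenerDirector` has no :class:`UnknownHandler`, so the second call returns ``None`` instead of raising :exc:`URLError`.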
       
   412 
       
   413 
       
   414 .. method:: OpenerDirector.open(url[, data][, timeout])
       
   415 
       
   416    Open the given *url* (which can be a request object or a string), optionally
       
   417    passing the given *data*. Arguments, return values and exceptions raised are
       
   418    the same as those of :func:`urlopen` (which simply calls the :meth:`open`
       
   419    method on the currently installed global :class:`OpenerDirector`).  The
       
   420    optional *timeout* parameter specifies a timeout in seconds for blocking
       
   421    operations like the connection attempt (if not specified, the global default
       
   422    timeout setting will be used).  The timeout feature actually works only for
       
   423    HTTP, HTTPS, FTP and FTPS connections.
       
   424 
       
   425    .. versionchanged:: 2.6
       
   426       *timeout* was added.
       
   427 
       
   428 
       
   429 .. method:: OpenerDirector.error(proto[, arg[, ...]])
       
   430 
       
   431    Handle an error of the given protocol.  This will call the registered error
       
   432    handlers for the given protocol with the given arguments (which are protocol
       
   433    specific).  The HTTP protocol is a special case which uses the HTTP response
       
   434    code to determine the specific error handler; refer to the :meth:`http_error_\*`
       
   435    methods of the handler classes.
       
   436 
       
   437    Return values and exceptions raised are the same as those of :func:`urlopen`.
       
   438 
       
   439 :class:`OpenerDirector` objects open URLs in three stages.  Within each stage,
       
   440 the order in which handler methods are called is determined by sorting the
       
   441 handler instances:
       
   442 
       
   443 
       
   444 #. Every handler with a method named like :meth:`protocol_request` has that
       
   445    method called to pre-process the request.
       
   446 
       
   447 #. Handlers with a method named like :meth:`protocol_open` are called to handle
       
   448    the request. This stage ends when a handler either returns a non-\ :const:`None`
       
   449    value (i.e., a response), or raises an exception (usually :exc:`URLError`).
       
   450    Exceptions are allowed to propagate.
       
   451 
       
   452    In fact, the above algorithm is first tried for methods named
       
   453    :meth:`default_open`.  If all such methods return :const:`None`, the algorithm
       
   454    is repeated for methods named like :meth:`protocol_open`.  If all such methods
       
   455    return :const:`None`, the algorithm is repeated for methods named
       
   456    :meth:`unknown_open`.
       
   457 
       
   458    Note that the implementation of these methods may involve calls of the parent
       
   459    :class:`OpenerDirector` instance's :meth:`.open` and :meth:`.error` methods.
       
   460 
       
   461 #. Every handler with a method named like :meth:`protocol_response` has that
       
   462    method called to post-process the response.
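The three stages can be observed by recording which methods run. In this sketch ``TracingHandler``, ``FallbackHandler`` and the ``trace`` scheme are hypothetical, and the import fallback is an assumption for running on Python 3.

```python
import io

try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

calls = []


class TracingHandler(urllib2.BaseHandler):
    def trace_request(self, req):              # stage 1: pre-process
        calls.append("request")
        return req

    def trace_open(self, req):                 # stage 2: produce a response
        calls.append("open")
        return io.BytesIO(b"traced")

    def trace_response(self, req, response):   # stage 3: post-process
        calls.append("response")
        return response


class FallbackHandler(urllib2.BaseHandler):
    handler_order = 900                        # sorts after the default of 500

    def trace_open(self, req):                 # not reached: stage 2 has ended
        calls.append("fallback open")
        return None


opener = urllib2.OpenerDirector()
opener.add_handler(FallbackHandler())          # added first, still sorted last
opener.add_handler(TracingHandler())
opener.open("trace://anything")
print(calls)
```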
       
   463 
       
   464 
       
   465 .. _base-handler-objects:
       
   466 
       
   467 BaseHandler Objects
       
   468 -------------------
       
   469 
       
   470 :class:`BaseHandler` objects provide a couple of methods that are directly
       
   471 useful, and others that are meant to be used by derived classes.  These are
       
   472 intended for direct use:
       
   473 
       
   474 
       
   475 .. method:: BaseHandler.add_parent(director)
       
   476 
       
   477    Add a director as parent.
       
   478 
       
   479 
       
   480 .. method:: BaseHandler.close()
       
   481 
       
   482    Remove any parents.
       
   483 
       
   484 The following members and methods should only be used by classes derived from
       
   485 :class:`BaseHandler`.
       
   486 
       
   487 .. note::
       
   488 
       
   489    The convention has been adopted that subclasses defining
       
   490    :meth:`protocol_request` or :meth:`protocol_response` methods are named
       
   491    :class:`\*Processor`; all others are named :class:`\*Handler`.
       
   492 
       
   493 
       
   494 .. attribute:: BaseHandler.parent
       
   495 
       
   496    A valid :class:`OpenerDirector`, which can be used to open using a different
       
   497    protocol, or handle errors.
       
   498 
       
   499 
       
   500 .. method:: BaseHandler.default_open(req)
       
   501 
       
   502    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   503    define it if they want to catch all URLs.
       
   504 
       
   505    This method, if implemented, will be called by the parent
       
   506    :class:`OpenerDirector`.  It should return a file-like object as described in
       
   507    the return value of the :meth:`open` of :class:`OpenerDirector`, or ``None``.
       
   508    It should raise :exc:`URLError`, unless a truly exceptional thing happens (for
       
   509    example, :exc:`MemoryError` should not be mapped to :exc:`URLError`).
       
   510 
       
   511    This method will be called before any protocol-specific open method.
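One possible use of :meth:`default_open` is a stand-in that answers selected URLs from memory, for example in tests. This is a sketch under stated assumptions: ``CannedHandler`` and ``OfflineHandler`` are hypothetical, the plain ``io.BytesIO`` response lacks :meth:`geturl` and :meth:`info`, and the import fallback is there for running on Python 3.

```python
import io

try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class CannedHandler(urllib2.BaseHandler):
    """Answer known URLs from a fixed table, whatever their scheme."""

    def __init__(self, pages):
        self.pages = pages

    def default_open(self, req):
        body = self.pages.get(req.get_full_url())
        if body is None:
            return None          # let protocol-specific handlers try
        return io.BytesIO(body)


class OfflineHandler(urllib2.BaseHandler):
    def http_open(self, req):    # only reached if default_open gave None
        raise urllib2.URLError("network access disabled")


opener = urllib2.OpenerDirector()
opener.add_handler(CannedHandler({"http://www.example.com/": b"<html>stub</html>"}))
opener.add_handler(OfflineHandler())

page = opener.open("http://www.example.com/").read()
try:
    opener.open("http://www.example.com/other")
    blocked = False
except urllib2.URLError:
    blocked = True
print(page)
print(blocked)
```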
       
   512 
       
   513 
       
   514 .. method:: BaseHandler.protocol_open(req)
       
   515    :noindex:
       
   516 
       
   517    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   518    define it if they want to handle URLs with the given protocol.
       
   519 
       
   520    This method, if defined, will be called by the parent :class:`OpenerDirector`.
       
   521    Return values should be the same as for  :meth:`default_open`.
       
   522 
       
   523 
       
   524 .. method:: BaseHandler.unknown_open(req)
       
   525 
       
   526    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   527    define it if they want to catch all URLs with no specific registered handler to
       
   528    open it.
       
   529 
       
   530    This method, if implemented, will be called by the :attr:`parent`
       
   531    :class:`OpenerDirector`.  Return values should be the same as for
       
   532    :meth:`default_open`.
       
   533 
       
   534 
       
   535 .. method:: BaseHandler.http_error_default(req, fp, code, msg, hdrs)
       
   536 
       
   537    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   538    override it if they intend to provide a catch-all for otherwise unhandled HTTP
       
   539    errors.  It will be called automatically by the  :class:`OpenerDirector` getting
       
   540    the error, and should not normally be called in other circumstances.
       
   541 
       
   542    *req* will be a :class:`Request` object, *fp* will be a file-like object with
       
   543    the HTTP error body, *code* will be the three-digit code of the error, *msg*
       
   544    will be the user-visible explanation of the code and *hdrs* will be a mapping
       
   545    object with the headers of the error.
       
   546 
       
   547    Return values and exceptions raised should be the same as those of
       
   548    :func:`urlopen`.
       
   549 
       
   550 
       
   551 .. method:: BaseHandler.http_error_nnn(req, fp, code, msg, hdrs)
       
   552 
       
   553    *nnn* should be a three-digit HTTP error code.  This method is also not defined
       
   554    in :class:`BaseHandler`, but will be called, if it exists, on an instance of a
       
   555    subclass, when an HTTP error with code *nnn* occurs.
       
   556 
       
   557    Subclasses should override this method to handle specific HTTP errors.
       
   558 
       
   559    Arguments, return values and exceptions raised should be the same as for
       
   560    :meth:`http_error_default`.
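The dispatch between :meth:`http_error_nnn` and :meth:`http_error_default` can be sketched by calling :meth:`OpenerDirector.error` directly, which avoids any network access. ``ErrorPolicy`` is a hypothetical handler, the URL is a placeholder, and the import fallback is an assumption for running on Python 3.

```python
import io

try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class ErrorPolicy(urllib2.BaseHandler):
    def http_error_404(self, req, fp, code, msg, hdrs):
        # Recover from a missing page by serving a replacement body.
        return io.BytesIO(b"fallback page")

    def http_error_default(self, req, fp, code, msg, hdrs):
        # Everything without a specific method becomes an exception.
        raise urllib2.HTTPError(req.get_full_url(), code, msg, hdrs, fp)


opener = urllib2.OpenerDirector()
opener.add_handler(ErrorPolicy())
req = urllib2.Request("http://www.example.com/page")

recovered = opener.error("http", req, None, 404, "Not Found", {}).read()
try:
    opener.error("http", req, None, 500, "Internal Server Error", {})
    raised_code = None
except urllib2.HTTPError as exc:
    raised_code = exc.code
print(recovered)
print(raised_code)
```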
       
   561 
       
   562 
       
   563 .. method:: BaseHandler.protocol_request(req)
       
   564    :noindex:
       
   565 
       
   566    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   567    define it if they want to pre-process requests of the given protocol.
       
   568 
       
   569    This method, if defined, will be called by the parent :class:`OpenerDirector`.
       
   570    *req* will be a :class:`Request` object. The return value should be a
       
   571    :class:`Request` object.
       
   572 
       
   573 
       
   574 .. method:: BaseHandler.protocol_response(req, response)
       
   575    :noindex:
       
   576 
       
   577    This method is *not* defined in :class:`BaseHandler`, but subclasses should
       
   578    define it if they want to post-process responses of the given protocol.
       
   579 
       
   580    This method, if defined, will be called by the parent :class:`OpenerDirector`.
       
   581    *req* will be a :class:`Request` object. *response* will be an object
       
   582    implementing the same interface as the return value of :func:`urlopen`.  The
       
   583    return value should implement the same interface as the return value of
       
   584    :func:`urlopen`.
       
   585 
       
   586 
       
   587 .. _http-redirect-handler:
       
   588 
       
   589 HTTPRedirectHandler Objects
       
   590 ---------------------------
       
   591 
       
   592 .. note::
       
   593 
       
   594    Some HTTP redirections require action from this module's client code.  If this
       
   595    is the case, :exc:`HTTPError` is raised.  See :rfc:`2616` for details of the
       
   596    precise meanings of the various redirection codes.
       
   597 
       
   598 
       
   599 .. method:: HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)
       
   600 
       
   601    Return a :class:`Request` or ``None`` in response to a redirect. This is called
       
   602    by the default implementations of the :meth:`http_error_30\*` methods when a
       
   603    redirection is received from the server.  If a redirection should take place,
       
   604    return a new :class:`Request` to allow :meth:`http_error_30\*` to perform the
       
   605    redirect.  Otherwise, raise :exc:`HTTPError` if no other handler should try to
       
   606    handle this URL, or return ``None`` if you can't but another handler might.
       
   607 
       
   608    .. note::
       
   609 
       
   610       The default implementation of this method does not strictly follow :rfc:`2616`,
       
   611       which says that 301 and 302 responses to ``POST`` requests must not be
       
   612       automatically redirected without confirmation by the user.  In reality, browsers
       
   613       do allow automatic redirection of these responses, changing the POST to a
       
   614       ``GET``, and the default implementation reproduces this behavior.
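The default behaviour described in the note, and one way to override it, can be sketched without a server by calling :meth:`redirect_request` directly. The method is passed the redirect target as a sixth argument, named ``newurl`` here as in the module source. ``StrictRedirectHandler`` is a hypothetical subclass, the URLs are placeholders, and the import fallback is an assumption for running on Python 3.

```python
try:
    import urllib2  # Python 2, the module documented here
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class StrictRedirectHandler(urllib2.HTTPRedirectHandler):
    """Refuse to redirect POST requests automatically."""

    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        if req.get_method() == "POST":
            raise urllib2.HTTPError(req.get_full_url(), code, msg, hdrs, fp)
        return urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, hdrs, newurl)


post = urllib2.Request("http://www.example.com/form", data=b"a=1")
target = "http://www.example.com/done"

# Default handler: the 302 is followed and the POST becomes a GET.
followed = urllib2.HTTPRedirectHandler().redirect_request(
    post, None, 302, "Found", {}, target)

# Strict handler: the same redirect is reported as an HTTPError.
try:
    StrictRedirectHandler().redirect_request(post, None, 302, "Found", {}, target)
    refused = False
except urllib2.HTTPError:
    refused = True
print(followed.get_method())
print(refused)
```

To use such a subclass, pass it to :func:`build_opener`, where it takes the place of the default :class:`HTTPRedirectHandler`.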
       
   615 
       
   616 
       
   617 .. method:: HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)
       
   618 
       
   619    Redirect to the ``Location:`` URL.  This method is called by the parent
       
   620    :class:`OpenerDirector` when getting an HTTP 'moved permanently' response.
       
   621 
       
   622 
       
   623 .. method:: HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)
       
   624 
       
   625    The same as :meth:`http_error_301`, but called for the 'found' response.
       
   626 
       
   627 
       
   628 .. method:: HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)
       
   629 
       
   630    The same as :meth:`http_error_301`, but called for the 'see other' response.
       
   631 
       
   632 
       
   633 .. method:: HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)
       
   634 
       
   635    The same as :meth:`http_error_301`, but called for the 'temporary redirect'
       
   636    response.
       
   637 
       
   638 
       
   639 .. _http-cookie-processor:
       
   640 
       
   641 HTTPCookieProcessor Objects
       
   642 ---------------------------
       
   643 
       
   644 .. versionadded:: 2.4
       
   645 
       
   646 :class:`HTTPCookieProcessor` instances have one attribute:
       
   647 
       
   648 
       
   649 .. attribute:: HTTPCookieProcessor.cookiejar
       
   650 
       
   651    The :class:`cookielib.CookieJar` in which cookies are stored.
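
A short sketch of how the attribute might be used: the processor keeps a
reference to the jar it was given, so cookies collected by an opener can be
inspected afterwards.  The variable names are illustrative, and the fallback
import is an assumption for interpreters where the modules have been renamed.

```python
try:
    import urllib2                       # Python 2
    import cookielib
except ImportError:                      # Python 3 renamed the modules
    import urllib.request as urllib2
    import http.cookiejar as cookielib

jar = cookielib.CookieJar()
processor = urllib2.HTTPCookieProcessor(jar)
opener = urllib2.build_opener(processor)

# The processor exposes the very jar it was constructed with.
print(processor.cookiejar is jar)
# No request has been made yet, so the jar is still empty.
print(len(processor.cookiejar))
```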
       
   652 
       
   653 
       
   654 .. _proxy-handler:
       
   655 
       
   656 ProxyHandler Objects
       
   657 --------------------
       
   658 
       
   659 
       
   660 .. method:: ProxyHandler.protocol_open(request)
       
   661    :noindex:
       
   662 
       
   663    The :class:`ProxyHandler` will have a method :meth:`protocol_open` for every
       
   664    *protocol* which has a proxy in the *proxies* dictionary given in the
       
   665    constructor.  The method will modify requests to go through the proxy, by
       
   666    calling ``request.set_proxy()``, and call the next handler in the chain to
       
   667    actually execute the protocol.
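
The per-protocol methods can be observed without any network access.  This
sketch uses a hypothetical proxy URL and simply checks which ``*_open`` methods
the handler gained from its *proxies* dictionary; the fallback import is an
assumption for interpreters where :mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

# Only the 'http' scheme is given a proxy (hypothetical proxy address).
handler = urllib2.ProxyHandler({'http': 'http://proxy.example.com:3128/'})

print(hasattr(handler, 'http_open'))   # a method exists for 'http'
print(hasattr(handler, 'ftp_open'))    # none was created for 'ftp'
```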
       
   668 
       
   669 
       
   670 .. _http-password-mgr:
       
   671 
       
   672 HTTPPasswordMgr Objects
       
   673 -----------------------
       
   674 
       
   675 These methods are available on :class:`HTTPPasswordMgr` and
       
   676 :class:`HTTPPasswordMgrWithDefaultRealm` objects.
       
   677 
       
   678 
       
   679 .. method:: HTTPPasswordMgr.add_password(realm, uri, user, passwd)
       
   680 
       
   681    *uri* can be either a single URI, or a sequence of URIs. *realm*, *user* and
       
   682    *passwd* must be strings. This causes ``(user, passwd)`` to be used as
       
   683    authentication tokens when authentication is requested for *realm* and for a
       
   684    URI at or below any of the given URIs in the URI tree.
       
   685 
       
   686 
       
   687 .. method:: HTTPPasswordMgr.find_user_password(realm, authuri)
       
   688 
       
   689    Get user/password for given realm and URI, if any.  This method will return
       
   690    ``(None, None)`` if there is no matching user/password.
       
   691 
       
   692    For :class:`HTTPPasswordMgrWithDefaultRealm` objects, the realm ``None`` will be
       
   693    searched if the given *realm* has no matching user/password.
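
The lookup rules can be tried directly on the password managers.  The
following sketch uses made-up realms, URIs and credentials, and assumes the
fallback import for interpreters where :mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

mgr = urllib2.HTTPPasswordMgr()
mgr.add_password('PDQ Application', 'http://www.example.com/private/',
                 'klem', 'kadidd!ehopper')

# A URI below the registered one, in the same realm, matches.
found = mgr.find_user_password('PDQ Application',
                               'http://www.example.com/private/report.html')
# A different realm does not match.
missing = mgr.find_user_password('Other Realm',
                                 'http://www.example.com/private/report.html')
print(found)
print(missing)

# With a default realm, credentials stored under None act as a fallback.
default_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
default_mgr.add_password(None, 'http://www.example.com/', 'guest', 'guest-pw')
fallback = default_mgr.find_user_password('Any Realm',
                                          'http://www.example.com/index.html')
print(fallback)
```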
       
   694 
       
   695 
       
   696 .. _abstract-basic-auth-handler:
       
   697 
       
   698 AbstractBasicAuthHandler Objects
       
   699 --------------------------------
       
   700 
       
   701 
       
   702 .. method:: AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
       
   703 
       
   704    Handle an authentication request by getting a user/password pair, and re-trying
       
   705    the request.  *authreq* should be the name of the header where the information
       
   706    about the realm is included in the request, *host* specifies the URL and path to
       
   707    authenticate for, *req* should be the (failed) :class:`Request` object, and
       
   708    *headers* should be the error headers.
       
   709 
       
   710    *host* is either an authority (e.g. ``"python.org"``) or a URL containing an
       
   711    authority component (e.g. ``"http://python.org/"``). In either case, the
       
   712    authority must not contain a userinfo component (so, ``"python.org"`` and
       
   713    ``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not).
       
   714 
       
   715 
       
   716 .. _http-basic-auth-handler:
       
   717 
       
   718 HTTPBasicAuthHandler Objects
       
   719 ----------------------------
       
   720 
       
   721 
       
   722 .. method:: HTTPBasicAuthHandler.http_error_401(req, fp, code,  msg, hdrs)
       
   723 
       
   724    Retry the request with authentication information, if available.
       
   725 
       
   726 
       
   727 .. _proxy-basic-auth-handler:
       
   728 
       
   729 ProxyBasicAuthHandler Objects
       
   730 -----------------------------
       
   731 
       
   732 
       
   733 .. method:: ProxyBasicAuthHandler.http_error_407(req, fp, code,  msg, hdrs)
       
   734 
       
   735    Retry the request with authentication information, if available.
       
   736 
       
   737 
       
   738 .. _abstract-digest-auth-handler:
       
   739 
       
   740 AbstractDigestAuthHandler Objects
       
   741 ---------------------------------
       
   742 
       
   743 
       
   744 .. method:: AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
       
   745 
       
   746    *authreq* should be the name of the header where the information about the realm
       
   747    is included in the request, *host* should be the host to authenticate to, *req*
       
   748    should be the (failed) :class:`Request` object, and *headers* should be the
       
   749    error headers.
       
   750 
       
   751 
       
   752 .. _http-digest-auth-handler:
       
   753 
       
   754 HTTPDigestAuthHandler Objects
       
   755 -----------------------------
       
   756 
       
   757 
       
   758 .. method:: HTTPDigestAuthHandler.http_error_401(req, fp, code,  msg, hdrs)
       
   759 
       
   760    Retry the request with authentication information, if available.
       
   761 
       
   762 
       
   763 .. _proxy-digest-auth-handler:
       
   764 
       
   765 ProxyDigestAuthHandler Objects
       
   766 ------------------------------
       
   767 
       
   768 
       
   769 .. method:: ProxyDigestAuthHandler.http_error_407(req, fp, code,  msg, hdrs)
       
   770 
       
   771    Retry the request with authentication information, if available.
       
   772 
       
   773 
       
   774 .. _http-handler-objects:
       
   775 
       
   776 HTTPHandler Objects
       
   777 -------------------
       
   778 
       
   779 
       
   780 .. method:: HTTPHandler.http_open(req)
       
   781 
       
   782    Send an HTTP request, which can be either GET or POST, depending on
       
   783    ``req.has_data()``.
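
Which of the two verbs a request will use can be checked before anything is
sent.  This sketch relies on :meth:`Request.get_method`, with a hypothetical
URL and payload; the fallback import is an assumption for interpreters where
:mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

get_req = urllib2.Request('http://www.example.com/')
post_req = urllib2.Request('http://www.example.com/', data=b'spam=1')

print(get_req.get_method())    # no data attached
print(post_req.get_method())   # data attached
```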
       
   784 
       
   785 
       
   786 .. _https-handler-objects:
       
   787 
       
   788 HTTPSHandler Objects
       
   789 --------------------
       
   790 
       
   791 
       
   792 .. method:: HTTPSHandler.https_open(req)
       
   793 
       
   794    Send an HTTPS request, which can be either GET or POST, depending on
       
   795    ``req.has_data()``.
       
   796 
       
   797 
       
   798 .. _file-handler-objects:
       
   799 
       
   800 FileHandler Objects
       
   801 -------------------
       
   802 
       
   803 
       
   804 .. method:: FileHandler.file_open(req)
       
   805 
       
   806    Open the file locally, if there is no host name, or the host name is
       
   807    ``'localhost'``. Change the protocol to ``ftp`` otherwise, and retry opening it
       
   808    using :attr:`parent`.
       
   809 
       
   810 
       
   811 .. _ftp-handler-objects:
       
   812 
       
   813 FTPHandler Objects
       
   814 ------------------
       
   815 
       
   816 
       
   817 .. method:: FTPHandler.ftp_open(req)
       
   818 
       
   819    Open the FTP file indicated by *req*. The login is always done with empty
       
   820    username and password.
       
   821 
       
   822 
       
   823 .. _cacheftp-handler-objects:
       
   824 
       
   825 CacheFTPHandler Objects
       
   826 -----------------------
       
   827 
       
   828 :class:`CacheFTPHandler` objects are :class:`FTPHandler` objects with the
       
   829 following additional methods:
       
   830 
       
   831 
       
   832 .. method:: CacheFTPHandler.setTimeout(t)
       
   833 
       
   834    Set timeout of connections to *t* seconds.
       
   835 
       
   836 
       
   837 .. method:: CacheFTPHandler.setMaxConns(m)
       
   838 
       
   839    Set maximum number of cached connections to *m*.
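
A minimal sketch of configuring the cache and installing the handler in an
opener.  The numbers are arbitrary, and the fallback import is an assumption
for interpreters where :mod:`urllib2` has been renamed.  No connection is
opened here.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

ftp_handler = urllib2.CacheFTPHandler()
ftp_handler.setTimeout(30)     # timeout of 30 seconds for connections
ftp_handler.setMaxConns(4)     # cache at most 4 connections

# Because CacheFTPHandler subclasses FTPHandler, build_opener() uses it
# instead of adding a default FTPHandler.
opener = urllib2.build_opener(ftp_handler)
print(ftp_handler in opener.handlers)
```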
       
   840 
       
   841 
       
   842 .. _unknown-handler-objects:
       
   843 
       
   844 UnknownHandler Objects
       
   845 ----------------------
       
   846 
       
   847 
       
   848 .. method:: UnknownHandler.unknown_open()
       
   849 
       
   850    Raise a :exc:`URLError` exception.
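
This behaviour is easy to observe offline, because a URL whose scheme has no
handler never reaches the network.  The scheme ``nosuchscheme`` is made up for
this sketch, and the fallback import is an assumption for interpreters where
:mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

opener = urllib2.build_opener()
caught = None
try:
    opener.open('nosuchscheme://www.example.com/')
except urllib2.URLError as err:
    # No handler claims this scheme, so UnknownHandler raises URLError.
    caught = err
print(caught is not None)
```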
       
   851 
       
   852 
       
   853 .. _http-error-processor-objects:
       
   854 
       
   855 HTTPErrorProcessor Objects
       
   856 --------------------------
       
   857 
       
   858 .. versionadded:: 2.4
       
   859 
       
   860 
       
   861 .. method:: HTTPErrorProcessor.http_response(request, response)
       
   862 
       
   863    Process HTTP error responses.
       
   864 
       
   865    For responses with a 200 status code, the response object is returned immediately.
       
   866 
       
   867    For non-200 status codes, this simply passes the job on to the
       
   868    :meth:`protocol_error_code` handler methods, via :meth:`OpenerDirector.error`.
       
   869    Eventually, :class:`urllib2.HTTPDefaultErrorHandler` will raise an
       
   870    :exc:`HTTPError` if no other handler handles the error.
       
   871 
       
   872 
       
   873 .. _urllib2-examples:
       
   874 
       
   875 Examples
       
   876 --------
       
   877 
       
   878 This example gets the python.org main page and displays the first 100 bytes of
       
   879 it::
       
   880 
       
   881    >>> import urllib2
       
   882    >>> f = urllib2.urlopen('http://www.python.org/')
       
   883    >>> print f.read(100)
       
   884    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
       
   885    <?xml-stylesheet href="./css/ht2html
       
   886 
       
   887 Here we are sending a data-stream to the stdin of a CGI and reading the data it
       
   888 returns to us. Note that this example will only work when the Python
       
   889 installation supports SSL. ::
       
   890 
       
   891    >>> import urllib2
       
   892    >>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi',
       
   893    ...                       data='This data is passed to stdin of the CGI')
       
   894    >>> f = urllib2.urlopen(req)
       
   895    >>> print f.read()
       
   896    Got Data: "This data is passed to stdin of the CGI"
       
   897 
       
   898 The code for the sample CGI used in the above example is::
       
   899 
       
   900    #!/usr/bin/env python
       
   901    import sys
       
   902    data = sys.stdin.read()
       
   903    print 'Content-type: text/plain\n\nGot Data: "%s"' % data
       
   904 
       
   905 Use of Basic HTTP Authentication::
       
   906 
       
   907    import urllib2
       
   908    # Create an OpenerDirector with support for Basic HTTP Authentication...
       
   909    auth_handler = urllib2.HTTPBasicAuthHandler()
       
   910    auth_handler.add_password(realm='PDQ Application',
       
   911                              uri='https://mahler:8092/site-updates.py',
       
   912                              user='klem',
       
   913                              passwd='kadidd!ehopper')
       
   914    opener = urllib2.build_opener(auth_handler)
       
   915    # ...and install it globally so it can be used with urlopen.
       
   916    urllib2.install_opener(opener)
       
   917    urllib2.urlopen('http://www.example.com/login.html')
       
   918 
       
   919 :func:`build_opener` provides many handlers by default, including a
       
   920 :class:`ProxyHandler`.  By default, :class:`ProxyHandler` uses the environment
       
   921 variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme
       
   922 involved.  For example, the :envvar:`http_proxy` environment variable is read to
       
   923 obtain the HTTP proxy's URL.
       
   924 
       
   925 This example replaces the default :class:`ProxyHandler` with one that uses
       
   926 programmatically-supplied proxy URLs, and adds proxy authorization support with
       
   927 :class:`ProxyBasicAuthHandler`. ::
       
   928 
       
   929    proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})
       
   930    proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
       
   931    proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
       
   932 
       
   933    opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
       
   934    # This time, rather than install the OpenerDirector, we use it directly:
       
   935    opener.open('http://www.example.com/login.html')
       
   936 
       
   937 Adding HTTP headers:
       
   938 
       
   939 Use the *headers* argument to the :class:`Request` constructor, or::
       
   940 
       
   941    import urllib2
       
   942    req = urllib2.Request('http://www.example.com/')
       
   943    req.add_header('Referer', 'http://www.python.org/')
       
   944    r = urllib2.urlopen(req)
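
Both ways of setting headers can be checked without sending the request.  The
sketch below also shows a detail that is easy to trip over: header names are
stored in capitalized form, so lookups must use that form.  The header values
are illustrative, and the fallback import is an assumption for interpreters
where :mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

# Headers given to the constructor and via add_header() end up together.
req = urllib2.Request('http://www.example.com/',
                      headers={'Referer': 'http://www.python.org/'})
req.add_header('Accept-Language', 'en')

print(req.has_header('Referer'))
# The name was stored as 'Accept-language' (only the first letter upper case).
print(req.get_header('Accept-language'))
print(req.has_header('Accept-Language'))
```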
       
   945 
       
   946 :class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
       
   947 every :class:`Request`.  To change this::
       
   948 
       
   949    import urllib2
       
   950    opener = urllib2.build_opener()
       
   951    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
       
   952    opener.open('http://www.example.com/')
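
The default value can be inspected before it is replaced.  This sketch only
reads and assigns :attr:`addheaders` and makes no request; the agent string
``ExampleClient/1.0`` is hypothetical, and the fallback import is an assumption
for interpreters where :mod:`urllib2` has been renamed.

```python
try:
    import urllib2                       # Python 2
except ImportError:                      # Python 3 renamed the module
    import urllib.request as urllib2

opener = urllib2.build_opener()

# addheaders is a list of (name, value) pairs.
default_agent = dict(opener.addheaders)['User-agent']
print(default_agent.startswith('Python-urllib/'))

# Replace the default User-Agent for every request made through this opener.
opener.addheaders = [('User-agent', 'ExampleClient/1.0')]
```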
       
   953 
       
   954 Also, remember that a few standard headers (:mailheader:`Content-Length`,
       
   955 :mailheader:`Content-Type` and :mailheader:`Host`) are added when the
       
   956 :class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`).
       
   957