symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/cgi.rst
changeset 1 2fb8b9db1c86
0:ffa851df0825 1:2fb8b9db1c86
       
     1 
       
     2 :mod:`cgi` --- Common Gateway Interface support.
       
     3 ================================================
       
     4 
       
     5 .. module:: cgi
       
     6    :synopsis: Helpers for running Python scripts via the Common Gateway Interface.
       
     7 
       
     8 
       
     9 .. index::
       
    10    pair: WWW; server
       
    11    pair: CGI; protocol
       
    12    pair: HTTP; protocol
       
    13    pair: MIME; headers
       
    14    single: URL
       
    15    single: Common Gateway Interface
       
    16 
       
    17 Support module for Common Gateway Interface (CGI) scripts.
       
    18 
       
    19 This module defines a number of utilities for use by CGI scripts written in
       
    20 Python.
       
    21 
       
    22 
       
    23 Introduction
       
    24 ------------
       
    25 
       
    26 .. _cgi-intro:
       
    27 
       
    28 A CGI script is invoked by an HTTP server, usually to process user input
       
    29 submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.
       
    30 
       
    31 Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
       
    32 The HTTP server places all sorts of information about the request (such as the
       
    33 client's hostname, the requested URL, the query string, and lots of other
       
    34 goodies) in the script's shell environment, executes the script, and sends the
       
    35 script's output back to the client.
       
    36 
       
    37 The script's input is connected to the client too, and sometimes the form data
       
    38 is read this way; at other times the form data is passed via the "query string"
       
    39 part of the URL.  This module is intended to take care of the different cases
       
    40 and provide a simpler interface to the Python script.  It also provides a number
       
    41 of utilities that help in debugging scripts, and the latest addition is support
       
    42 for file uploads from a form (if your browser supports it).
       
    43 
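The two cases described above can be sketched roughly as follows. This is an illustration only, not the module's implementation: the helper name ``read_raw_form_data`` is hypothetical, and it assumes the server sets the ``REQUEST_METHOD``, ``QUERY_STRING`` and ``CONTENT_LENGTH`` variables in the way the CGI standard describes.

```python
import io


def read_raw_form_data(environ, stdin):
    """Return the raw, still URL-encoded form data for one request.

    Hypothetical helper: *environ* is a dict of CGI variables and *stdin*
    is a file-like object connected to the client.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # The form data arrives on standard input; read only the number
        # of bytes announced by the server.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    # Otherwise the data is carried in the query string part of the URL.
    return environ.get("QUERY_STRING", "")


# Simulated requests; a real script would pass os.environ and sys.stdin.
get_data = read_raw_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe"}, io.StringIO(u""))
post_data = read_raw_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8"},
    io.StringIO(u"name=Joe&ignored"))
```

In practice the :class:`FieldStorage` class handles this distinction, so a script rarely needs such a helper.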
       
    44 The output of a CGI script should consist of two sections, separated by a blank
       
    45 line.  The first section contains a number of headers, telling the client what
       
    46 kind of data is following.  Python code to generate a minimal header section
       
    47 looks like this::
       
    48 
       
    49    print "Content-Type: text/html"     # HTML is following
       
    50    print                               # blank line, end of headers
       
    51 
       
    52 The second section is usually HTML, which allows the client software to display
       
    53 nicely formatted text with headers, in-line images, etc. Here's Python code that
       
    54 prints a simple piece of HTML::
       
    55 
       
    56    print "<TITLE>CGI script output</TITLE>"
       
    57    print "<H1>This is my first CGI script</H1>"
       
    58    print "Hello, world!"
       
    59 
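Putting the two sections together, a complete response might be assembled as in the sketch below. The function name ``build_response`` is hypothetical and exists only for illustration; it builds the output as a string so that the header section, the blank line and the body can be inspected before they are written to standard output.

```python
def build_response(title, body_lines):
    """Return a complete CGI response as one string (hypothetical helper)."""
    lines = [
        "Content-Type: text/html",   # header section
        "",                          # blank line, end of headers
        "<TITLE>%s</TITLE>" % title,
    ]
    lines.extend(body_lines)
    return "\n".join(lines) + "\n"


response = build_response("CGI script output",
                          ["<H1>This is my first CGI script</H1>",
                           "Hello, world!"])
```

A script would then send the result to the client with ``sys.stdout.write(response)``.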
       
    60 
       
    61 .. _using-the-cgi-module:
       
    62 
       
    63 Using the cgi module
       
    64 --------------------
       
    65 
       
    66 Begin by writing ``import cgi``.  Do not use ``from cgi import *`` --- the
       
    67 module defines all sorts of names for its own use or for backward compatibility
       
    68 that you don't want in your namespace.
       
    69 
       
    70 When you write a new script, consider adding the line::
       
    71 
       
    72    import cgitb; cgitb.enable()
       
    73 
       
    74 This activates a special exception handler that will display detailed reports in
       
    75 the Web browser if any errors occur.  If you'd rather not show the guts of your
       
    76 program to users of your script, you can have the reports saved to files
       
    77 instead, with a line like this::
       
    78 
       
    79    import cgitb; cgitb.enable(display=0, logdir="/tmp")
       
    80 
       
    81 It's very helpful to use this feature during script development. The reports
       
    82 produced by :mod:`cgitb` provide information that can save you a lot of time in
       
    83 tracking down bugs.  You can always remove the ``cgitb`` line later when you
       
    84 have tested your script and are confident that it works correctly.
       
    85 
       
    86 To get at submitted form data, it's best to use the :class:`FieldStorage` class.
       
    87 The other classes defined in this module are provided mostly for backward
       
    88 compatibility. Instantiate :class:`FieldStorage` exactly once, without arguments.  This reads the
       
    89 form contents from standard input or the environment (depending on the value of
       
    90 various environment variables set according to the CGI standard).  Since it may
       
    91 consume standard input, it should be instantiated only once.
       
    92 
       
    93 The :class:`FieldStorage` instance can be indexed like a Python dictionary, and
       
    94 also supports the standard dictionary methods :meth:`has_key` and :meth:`keys`.
       
    95 The built-in :func:`len` is also supported.  Form fields containing empty
       
    96 strings are ignored and do not appear in the dictionary; to keep such values,
       
    97 provide a true value for the optional *keep_blank_values* keyword parameter when
       
    98 creating the :class:`FieldStorage` instance.
       
    99 
       
   100 For instance, the following code (which assumes that the
       
   101 :mailheader:`Content-Type` header and blank line have already been printed)
       
   102 checks that the fields ``name`` and ``addr`` are both set to a non-empty
       
   103 string::
       
   104 
       
   105    form = cgi.FieldStorage()
       
   106    if not (form.has_key("name") and form.has_key("addr")):
       
   107        print "<H1>Error</H1>"
       
   108        print "Please fill in the name and addr fields."
       
   109    else:
       
   110        print "<p>name:", form["name"].value
       
   111        print "<p>addr:", form["addr"].value
       
   112        # ...further form processing here...
       
   113 
       
   114 Here the fields, accessed through ``form[key]``, are themselves instances of
       
   115 :class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
       
   116 encoding). The :attr:`value` attribute of the instance yields the string value
       
   117 of the field.  The :meth:`getvalue` method returns this string value directly;
       
   118 it also accepts an optional second argument as a default to return if the
       
   119 requested key is not present.
       
   120 
       
   121 If the submitted form data contains more than one field with the same name, the
       
   122 object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
       
   123 :class:`MiniFieldStorage` instance but a list of such instances.  Similarly, in
       
   124 this situation, ``form.getvalue(key)`` would return a list of strings. If you
       
   125 expect this possibility (when your HTML form contains multiple fields with the
       
   126 same name), use the :meth:`getlist` method, which always returns a list of
       
   127 values (so that you do not need to special-case the single item case).  For
       
   128 example, this code concatenates any number of username fields, separated by
       
   129 commas::
       
   130 
       
   131    value = form.getlist("username")
       
   132    usernames = ",".join(value)
       
   133 
       
   134 If a field represents an uploaded file, accessing the value via the
       
   135 :attr:`value` attribute or the :meth:`getvalue` method reads the entire file in
       
   136 memory as a string.  This may not be what you want. You can test for an uploaded
       
   137 file by testing either the :attr:`filename` attribute or the :attr:`file`
       
   138 attribute.  You can then read the data at leisure from the :attr:`file`
       
   139 attribute::
       
   140 
       
   141    fileitem = form["userfile"]
       
   142    if fileitem.file:
       
   143        # It's an uploaded file; count lines
       
   144        linecount = 0
       
   145        while 1:
       
   146            line = fileitem.file.readline()
       
   147            if not line: break
       
   148            linecount = linecount + 1
       
   149 
       
   150 If an error is encountered when obtaining the contents of an uploaded file
       
   151 (for example, when the user interrupts the form submission by clicking on
       
   152 a Back or Cancel button) the :attr:`done` attribute of the object for the
       
   153 field will be set to the value -1.
       
   154 
       
   155 The file upload draft standard entertains the possibility of uploading multiple
       
   156 files from one field (using a recursive :mimetype:`multipart/\*` encoding).
       
   157 When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
       
   158 This can be determined by testing its :attr:`type` attribute, which should be
       
   159 :mimetype:`multipart/form-data` (or perhaps another MIME type matching
       
   160 :mimetype:`multipart/\*`).  In this case, it can be iterated over recursively
       
   161 just like the top-level form object.
       
   162 
       
   163 When a form is submitted in the "old" format (as the query string or as a single
       
   164 data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
       
   165 actually be instances of the class :class:`MiniFieldStorage`.  In this case, the
       
   166 :attr:`list`, :attr:`file`, and :attr:`filename` attributes are always ``None``.
       
   167 
       
   168 A form submitted via POST that also has a query string will contain both
       
   169 :class:`FieldStorage` and :class:`MiniFieldStorage` items.
       
   170 
       
   171 Higher Level Interface
       
   172 ----------------------
       
   173 
       
   174 .. versionadded:: 2.2
       
   175 
       
   176 The previous section explains how to read CGI form data using the
       
   177 :class:`FieldStorage` class.  This section describes a higher level interface
       
   178 which was added to this class to allow one to do it in a more readable and
       
   179 intuitive way.  The interface doesn't make the techniques described in previous
       
   180 sections obsolete --- they are still useful to process file uploads efficiently,
       
   181 for example.
       
   182 
       
   183 .. XXX: Is this true ?
       
   184 
       
   185 The interface consists of two simple methods. Using the methods you can process
       
   186 form data in a generic way, without the need to worry whether only one or more
       
   187 values were posted under one name.
       
   188 
       
   189 In the previous section, you learned to write the following code anytime you
       
   190 expected a user to post more than one value under one name::
       
   191 
       
   192    item = form.getvalue("item")
       
   193    if isinstance(item, list):
       
   194        pass  # The user is requesting more than one item.
       
   195    else:
       
   196        pass  # The user is requesting only one item.
       
   197 
       
   198 This situation is common for example when a form contains a group of multiple
       
   199 checkboxes with the same name::
       
   200 
       
   201    <input type="checkbox" name="item" value="1" />
       
   202    <input type="checkbox" name="item" value="2" />
       
   203 
       
   204 In most situations, however, there's only one form control with a particular
       
   205 name in a form and then you expect and need only one value associated with this
       
   206 name.  So you write a script containing for example this code::
       
   207 
       
   208    user = form.getvalue("user").upper()
       
   209 
       
   210 The problem with the code is that you should never expect that a client will
       
   211 provide valid input to your scripts.  For example, if a curious user appends
       
   212 another ``user=foo`` pair to the query string, then the script would crash,
       
   213 because in this situation the ``getvalue("user")`` method call returns a list
       
   214 instead of a string.  Calling the :meth:`upper` method on a list is not valid
       
   215 (since lists do not have a method of this name) and results in an
       
   216 :exc:`AttributeError` exception.
       
   217 
       
   218 Therefore, the appropriate way to read form data values was to always use the
       
   219 code which checks whether the obtained value is a single value or a list of
       
   220 values.  That's annoying and leads to less readable scripts.
       
   221 
       
   222 A more convenient approach is to use the methods :meth:`getfirst` and
       
   223 :meth:`getlist` provided by this higher level interface.
       
   224 
       
   225 
       
   226 .. method:: FieldStorage.getfirst(name[, default])
       
   227 
       
   228    This method always returns only one value associated with form field *name*.
       
   229    If more than one value was posted under that name, the method returns only the
       
   230    first.  Please note that the order in which the values are received
       
   231    may vary from browser to browser and should not be counted on. [#]_  If no such
       
   232    form field or value exists then the method returns the value specified by the
       
   233    optional parameter *default*.  This parameter defaults to ``None`` if not
       
   234    specified.
       
   235 
       
   236 
       
   237 .. method:: FieldStorage.getlist(name)
       
   238 
       
   239    This method always returns a list of values associated with form field *name*.
       
   240    The method returns an empty list if no such form field or value exists for
       
   241    *name*.  It returns a list consisting of one item if only one such value exists.
       
   242 
       
   243 Using these methods you can write nice compact code::
       
   244 
       
   245    import cgi
       
   246    form = cgi.FieldStorage()
       
   247    user = form.getfirst("user", "").upper()    # This way it's safe.
       
   248    for item in form.getlist("item"):
       
   249        do_something(item)
       
   250 
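The behaviour of the two methods can be imitated without a running web server, as in the sketch below. This is not the module's implementation: a plain dictionary produced by ``parse_qs`` stands in for the :class:`FieldStorage` instance, and the module-level functions ``getfirst`` and ``getlist`` are hypothetical stand-ins for the methods of the same name.

```python
try:
    from urlparse import parse_qs          # Python 2.6 and later
except ImportError:
    from urllib.parse import parse_qs      # Python 3


def getlist(fields, name):
    """Always return a list of values, possibly empty."""
    return list(fields.get(name, []))


def getfirst(fields, name, default=None):
    """Return the first value for *name*, or *default* if there is none."""
    values = fields.get(name)
    if not values:
        return default
    return values[0]


# A query string in which a curious user has repeated the "user" field.
fields = parse_qs("user=joe&user=foo&item=1&item=2")
user = getfirst(fields, "user", "").upper()   # a string, never a list
items = getlist(fields, "item")
missing = getlist(fields, "nothing")
```

Because ``getfirst`` always yields a single value, the call to ``upper`` cannot fail on a list, whatever the client sends.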
       
   251 
       
   252 Old classes
       
   253 -----------
       
   254 
       
   255 .. deprecated:: 2.6
       
   256 
       
   257    These classes, present in earlier versions of the :mod:`cgi` module, are
       
   258    still supported for backward compatibility.  New applications should use the
       
   259    :class:`FieldStorage` class.
       
   260 
       
   261 :class:`SvFormContentDict` stores single value form content as dictionary; it
       
   262 assumes each field name occurs in the form only once.
       
   263 
       
   264 :class:`FormContentDict` stores multiple value form content as a dictionary (the
       
   265 form items are lists of values).  Useful if your form contains multiple fields
       
   266 with the same name.
       
   267 
       
   268 Other classes (:class:`FormContent`, :class:`InterpFormContentDict`) are present
       
   269 for backwards compatibility with really old applications only.
       
   270 
       
   271 
       
   272 .. _functions-in-cgi-module:
       
   273 
       
   274 Functions
       
   275 ---------
       
   276 
       
   277 These are useful if you want more control, or if you want to employ some of the
       
   278 algorithms implemented in this module in other circumstances.
       
   279 
       
   280 
       
   281 .. function:: parse(fp[, keep_blank_values[, strict_parsing]])
       
   282 
       
   283    Parse a query in the environment or from a file (the file defaults to
       
   284    ``sys.stdin``).  The *keep_blank_values* and *strict_parsing* parameters are
       
   285    passed to :func:`urlparse.parse_qs` unchanged.
       
   286 
       
   287 
       
   288 .. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
       
   289 
       
   290    This function is deprecated in this module. Use :func:`urlparse.parse_qs`
       
   291    instead. It is maintained here only for backward compatibility.
       
   292 
       
   293 .. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
       
   294 
       
   295    This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
       
   296    instead. It is maintained here only for backward compatibility.
       
   297 
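As a brief illustration of the replacement functions, the sketch below parses the same query string with and without *keep_blank_values*. The import falls back to ``urllib.parse`` so that the example also runs on Python 3, where the :mod:`urlparse` module was renamed; the query string itself is made up for the example.

```python
try:
    from urlparse import parse_qs, parse_qsl        # Python 2.6 and later
except ImportError:
    from urllib.parse import parse_qs, parse_qsl    # Python 3

query = "name=Joe+Blow&addr=&item=1&item=2"

# Dictionary form: every value is a list; blank fields are dropped.
as_dict = parse_qs(query)

# With keep_blank_values the empty "addr" field is retained.
as_dict_with_blanks = parse_qs(query, keep_blank_values=True)

# List-of-pairs form, preserving the order of the fields.
as_pairs = parse_qsl(query)
```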
       
   298 .. function:: parse_multipart(fp, pdict)
       
   299 
       
   300    Parse input of type :mimetype:`multipart/form-data` (for file uploads).
       
   301    Arguments are *fp* for the input file and *pdict* for a dictionary containing
       
   302    other parameters in the :mailheader:`Content-Type` header.
       
   303 
       
   304    Returns a dictionary just like :func:`urlparse.parse_qs`: keys are the field names, each
       
   305    value is a list of values for that field.  This is easy to use but not much good
       
   306    if you are expecting megabytes to be uploaded --- in that case, use the
       
   307    :class:`FieldStorage` class instead, which is much more flexible.
       
   308 
       
   309    Note that this does not parse nested multipart parts --- use
       
   310    :class:`FieldStorage` for that.
       
   311 
       
   312 
       
   313 .. function:: parse_header(string)
       
   314 
       
   315    Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
       
   316    dictionary of parameters.
       
   317 
       
   318 
       
   319 .. function:: test()
       
   320 
       
   321    Robust test CGI script, usable as main program. Writes minimal HTTP headers and
       
   322    formats all information provided to the script in HTML form.
       
   323 
       
   324 
       
   325 .. function:: print_environ()
       
   326 
       
   327    Format the shell environment in HTML.
       
   328 
       
   329 
       
   330 .. function:: print_form(form)
       
   331 
       
   332    Format a form in HTML.
       
   333 
       
   334 
       
   335 .. function:: print_directory()
       
   336 
       
   337    Format the current directory in HTML.
       
   338 
       
   339 
       
   340 .. function:: print_environ_usage()
       
   341 
       
   342    Print a list of useful (used by CGI) environment variables in HTML.
       
   343 
       
   344 
       
   345 .. function:: escape(s[, quote])
       
   346 
       
   347    Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe
       
   348    sequences.  Use this if you need to display text that might contain such
       
   349    characters in HTML.  If the optional flag *quote* is true, the quotation mark
       
   350    character (``'"'``) is also translated; this helps for inclusion in an HTML
       
   351    attribute value, as in ``<A HREF="...">``.  If the value to be quoted might
       
   352    include single- or double-quote characters, or both, consider using the
       
   353    :func:`quoteattr` function in the :mod:`xml.sax.saxutils` module instead.
       
   354 
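The translation described for :func:`escape` can be pictured with the following sketch. It is an illustrative re-implementation under the hypothetical name ``escape_html``, written from the description above rather than copied from the module, and is shown only to make the order of the replacements explicit.

```python
def escape_html(s, quote=False):
    """Illustrative re-implementation of the behaviour described above."""
    # '&' must be replaced first; otherwise the ampersands introduced by
    # the later replacements would themselves be escaped a second time.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        # Only the double quotation mark is translated.
        s = s.replace('"', "&quot;")
    return s


text = escape_html('<a href="x">Tom & Jerry</a>')
attr = escape_html('say "hi" & wave', quote=True)
```

Note that single quotes are left untouched in both cases, which is why :func:`quoteattr` is suggested for attribute values that may contain them.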
       
   355 
       
   356 .. _cgi-security:
       
   357 
       
   358 Caring about security
       
   359 ---------------------
       
   360 
       
   361 .. index:: pair: CGI; security
       
   362 
       
   363 There's one important rule: if you invoke an external program (via the
       
   364 :func:`os.system` or :func:`os.popen` functions, or others with similar
       
   365 functionality), make very sure you don't pass arbitrary strings received from
       
   366 the client to the shell.  This is a well-known security hole whereby clever
       
   367 hackers anywhere on the Web can exploit a gullible CGI script to invoke
       
   368 arbitrary shell commands.  Even parts of the URL or field names cannot be
       
   369 trusted, since the request doesn't have to come from your form!
       
   370 
       
   371 To be on the safe side, if you must pass a string gotten from a form to a shell
       
   372 command, you should make sure the string contains only alphanumeric characters,
       
   373 dashes, underscores, and periods.
       
   374 
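One way to apply this rule is a whitelist check such as the sketch below. The name ``is_safe_argument`` is hypothetical, and the pattern accepts exactly the characters listed above; anything else, including the empty string, is rejected.

```python
import re

# Only alphanumeric characters, dashes, underscores and periods.
# \A and \Z anchor the whole string, so a trailing newline is rejected too.
_SAFE_ARGUMENT = re.compile(r"\A[A-Za-z0-9._-]+\Z")


def is_safe_argument(value):
    """Return True if *value* contains only whitelisted characters."""
    return bool(_SAFE_ARGUMENT.match(value))


accepted = is_safe_argument("report-2009_v1.txt")
rejected = is_safe_argument("x; rm -rf /")
```

Even a value that passes this check may begin with a dash and be read as an option by the program being invoked, so the check is a minimum rather than a complete defence.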
       
   375 
       
   376 Installing your CGI script on a Unix system
       
   377 -------------------------------------------
       
   378 
       
   379 Read the documentation for your HTTP server and check with your local system
       
   380 administrator to find the directory where CGI scripts should be installed;
       
   381 usually this is in a directory :file:`cgi-bin` in the server tree.
       
   382 
       
   383 Make sure that your script is readable and executable by "others"; the Unix file
       
   384 mode should be ``0755`` octal (use ``chmod 0755 filename``).  Make sure that the
       
   385 first line of the script contains ``#!`` starting in column 1 followed by the
       
   386 pathname of the Python interpreter, for instance::
       
   387 
       
   388    #!/usr/local/bin/python
       
   389 
       
   390 Make sure the Python interpreter exists and is executable by "others".
       
   391 
       
   392 Make sure that any files your script needs to read or write are readable or
       
   393 writable, respectively, by "others" --- their mode should be ``0644`` for
       
   394 readable and ``0666`` for writable.  This is because, for security reasons, the
       
   395 HTTP server executes your script as user "nobody", without any special
       
   396 privileges.  It can only read (write, execute) files that everybody can read
       
   397 (write, execute).  The current directory at execution time is also different (it
       
   398 is usually the server's cgi-bin directory) and the set of environment variables
       
   399 is also different from what you get when you log in.  In particular, don't count
       
   400 on the shell's search path for executables (:envvar:`PATH`) or the Python module
       
   401 search path (:envvar:`PYTHONPATH`) to be set to anything interesting.
       
   402 
       
   403 If you need to load modules from a directory which is not on Python's default
       
   404 module search path, you can change the path in your script, before importing
       
   405 other modules.  For example::
       
   406 
       
   407    import sys
       
   408    sys.path.insert(0, "/usr/home/joe/lib/python")
       
   409    sys.path.insert(0, "/usr/local/lib/python")
       
   410 
       
   411 (This way, the directory inserted last will be searched first!)
       
   412 
       
   413 Instructions for non-Unix systems will vary; check your HTTP server's
       
   414 documentation (it will usually have a section on CGI scripts).
       
   415 
       
   416 
       
   417 Testing your CGI script
       
   418 -----------------------
       
   419 
       
   420 Unfortunately, a CGI script will generally not run when you try it from the
       
   421 command line, and a script that works perfectly from the command line may fail
       
   422 mysteriously when run from the server.  There's one reason why you should still
       
   423 test your script from the command line: if it contains a syntax error, the
       
   424 Python interpreter won't execute it at all, and the HTTP server will most likely
       
   425 send a cryptic error to the client.
       
   426 
       
   427 Assuming your script has no syntax errors, yet it does not work, you have no
       
   428 choice but to read the next section.
       
   429 
       
   430 
       
   431 Debugging CGI scripts
       
   432 ---------------------
       
   433 
       
   434 .. index:: pair: CGI; debugging
       
   435 
       
   436 First of all, check for trivial installation errors --- reading the section
       
   437 above on installing your CGI script carefully can save you a lot of time.  If
       
   438 you wonder whether you have understood the installation procedure correctly, try
       
   439 installing a copy of this module file (:file:`cgi.py`) as a CGI script.  When
       
   440 invoked as a script, the file will dump its environment and the contents of the
       
   441 form in HTML form. Give it the right mode etc., and send it a request.  If it's
       
   442 installed in the standard :file:`cgi-bin` directory, it should be possible to
       
   443 send it a request by entering a URL into your browser of the form::
       
   444 
       
   445    http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
       
   446 
       
   447 If this gives an error of type 404, the server cannot find the script --- perhaps
       
   448 you need to install it in a different directory.  If it gives another error,
       
   449 there's an installation problem that you should fix before trying to go any
       
   450 further.  If you get a nicely formatted listing of the environment and form
       
   451 content (in this example, the fields should be listed as "addr" with value "At
       
   452 Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
       
   453 installed correctly.  If you follow the same procedure for your own script, you
       
   454 should now be able to debug it.
       
   455 
       
   456 The next step could be to call the :mod:`cgi` module's :func:`test` function
       
   457 from your script: replace its main code with the single statement ::
       
   458 
       
   459    cgi.test()
       
   460 
       
   461 This should produce the same results as those gotten from installing the
       
   462 :file:`cgi.py` file itself.
       
   463 
       
   464 When an ordinary Python script raises an unhandled exception (for whatever
       
   465 reason: a typo in a module name, a file that can't be opened, etc.), the
       
   466 Python interpreter prints a nice traceback and exits.  While the Python
       
   467 interpreter will still do this when your CGI script raises an exception, most
       
   468 likely the traceback will end up in one of the HTTP server's log files, or be
       
   469 discarded altogether.
       
   470 
       
   471 Fortunately, once you have managed to get your script to execute *some* code,
       
   472 you can easily send tracebacks to the Web browser using the :mod:`cgitb` module.
       
   473 If you haven't done so already, just add the line::
       
   474 
       
   475    import cgitb; cgitb.enable()
       
   476 
       
   477 to the top of your script.  Then try running it again; when a problem occurs,
       
   478 you should see a detailed report that will likely make apparent the cause of the
       
   479 crash.
       
   480 
       
   481 If you suspect that there may be a problem in importing the :mod:`cgitb` module,
       
   482 you can use an even more robust approach (which only uses built-in modules)::
       
   483 
       
   484    import sys
       
   485    sys.stderr = sys.stdout
       
   486    print "Content-Type: text/plain"
       
   487    print
       
   488    # ...your code here...
       
   489 
       
   490 This relies on the Python interpreter to print the traceback.  The content type
       
   491 of the output is set to plain text, which disables all HTML processing.  If your
       
   492 script works, the raw HTML will be displayed by your client.  If it raises an
       
   493 exception, most likely after the first two lines have been printed, a traceback
       
   494 will be displayed. Because no HTML interpretation is going on, the traceback
       
   495 will be readable.
       
   496 
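A variation on the same idea is sketched below, assuming the standard :mod:`traceback` module can be imported. The helper ``run_with_plain_traceback`` and the deliberately failing ``broken_main`` are hypothetical names used only for this illustration; output goes to an in-memory buffer here so that the result can be inspected, whereas a real script would pass ``sys.stdout``.

```python
import io
import traceback


def run_with_plain_traceback(main, out):
    """Call *main*; if it raises, write the traceback to *out* as text."""
    # Plain text disables HTML processing, so the traceback stays readable.
    out.write(u"Content-Type: text/plain\n\n")
    try:
        main(out)
    except Exception:
        out.write(u"%s" % traceback.format_exc())


def broken_main(out):
    """Stand-in for the script's real code; fails part-way through."""
    out.write(u"<H1>Partial output</H1>\n")
    raise ValueError("something went wrong")


buffer = io.StringIO()
run_with_plain_traceback(broken_main, buffer)
report = buffer.getvalue()
```

The report contains the headers, whatever the script managed to print, and then the traceback.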
       
   497 
       
   498 Common problems and solutions
       
   499 -----------------------------
       
   500 
       
   501 * Most HTTP servers buffer the output from CGI scripts until the script is
       
   502   completed.  This means that it is not possible to display a progress report on
       
   503   the client's display while the script is running.
       
   504 
       
   505 * Check the installation instructions above.
       
   506 
       
   507 * Check the HTTP server's log files.  (``tail -f logfile`` in a separate window
       
   508   may be useful!)
       
   509 
       
   510 * Always check a script for syntax errors first, by doing something like
       
   511   ``python script.py``.
       
   512 
       
   513 * If your script does not have any syntax errors, try adding ``import cgitb;
       
   514   cgitb.enable()`` to the top of the script.
       
   515 
       
   516 * When invoking external programs, make sure they can be found. Usually, this
       
   517   means using absolute path names --- :envvar:`PATH` is usually not set to a very
       
   518   useful value in a CGI script.
       
   519 
       
   520 * When reading or writing external files, make sure they can be read or written
       
   521   by the userid under which your CGI script will be running: this is typically the
       
   522   userid under which the web server is running, or some explicitly specified
       
   523   userid for a web server's ``suexec`` feature.
       
   524 
       
   525 * Don't try to give a CGI script a set-uid mode.  This doesn't work on most
       
   526   systems, and is a security liability as well.
       
   527 
       
   528 .. rubric:: Footnotes
       
   529 
       
   530 .. [#] Note that some recent versions of the HTML specification do state what order the
       
   531    field values should be supplied in, but knowing whether a request was
       
   532    received from a conforming browser, or even from a browser at all, is tedious
       
   533    and error-prone.
       
   534