:mod:`email`: Internationalized headers
---------------------------------------

.. module:: email.header
   :synopsis: Representing non-ASCII headers


:rfc:`2822` is the base standard that describes the format of email messages.
It derives from the older :rfc:`822` standard, which came into widespread use at
a time when most email was composed of ASCII characters only. :rfc:`2822` is a
specification written assuming email contains only 7-bit ASCII characters.

Of course, as email has been deployed worldwide, it has become
internationalized, such that language-specific character sets can now be used in
email messages. The base standard still requires email messages to be
transferred using only 7-bit ASCII characters, so a slew of RFCs have been
written describing how to encode email containing non-ASCII characters into
:rfc:`2822`\ -compliant format. These RFCs include :rfc:`2045`, :rfc:`2046`,
:rfc:`2047`, and :rfc:`2231`. The :mod:`email` package supports these standards
in its :mod:`email.header` and :mod:`email.charset` modules.

If you want to include non-ASCII characters in your email headers, say in the
:mailheader:`Subject` or :mailheader:`To` fields, you should use the
:class:`Header` class and assign the field in the :class:`Message` object to an
instance of :class:`Header` instead of using a string for the header value.
Import the :class:`Header` class from the :mod:`email.header` module. For
example::

   >>> from email.message import Message
   >>> from email.header import Header
   >>> msg = Message()
   >>> h = Header('p\xf6stal', 'iso-8859-1')
   >>> msg['Subject'] = h
   >>> print(msg.as_string())
   Subject: =?iso-8859-1?q?p=F6stal?=


Notice how the :mailheader:`Subject` field ended up containing a non-ASCII
character. We did this by creating a :class:`Header` instance and passing in
the character set that the byte string was encoded in. When the subsequent
:class:`Message` instance was flattened, the :mailheader:`Subject` field was
properly :rfc:`2047` encoded. MIME-aware mail readers would show this header
using the embedded ISO-8859-1 character.
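
To make the encoded form less opaque, the following minimal sketch takes such
an encoded word apart by hand. It assumes the single-word
``=?charset?encoding?text?=`` layout with the ``q`` (quoted-printable) encoding,
and all variable names are illustrative. It is only meant to show what the
fields contain; real code should rely on :func:`decode_header` instead.

```python
import quopri

# The encoded word produced for the Subject header in the example.
encoded_word = '=?iso-8859-1?q?p=F6stal?='

# Strip the leading '=?' and trailing '?=', then split into the three fields.
# (Assumption: the payload itself contains no '?' character.)
charset, encoding, payload = encoded_word[2:-2].split('?')

# For the 'q' encoding the payload is quoted-printable; header=True also
# maps '_' back to a space, as used inside encoded words.
raw_bytes = quopri.decodestring(payload.encode('ascii'), header=True)

# Interpret the raw bytes in the declared character set.
text = raw_bytes.decode(charset)
```

Here ``charset`` is ``'iso-8859-1'``, ``encoding`` is ``'q'``, and ``text`` is
the original word with its ISO-8859-1 character restored.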

.. versionadded:: 2.2.2

Here is the :class:`Header` class description:


.. class:: Header([s[, charset[, maxlinelen[, header_name[, continuation_ws[, errors]]]]]])

   Create a MIME-compliant header that can contain strings in different character
   sets.

   Optional *s* is the initial header value. If ``None`` (the default), the
   initial header value is not set. You can later append to the header with
   :meth:`append` method calls. *s* may be a byte string or a Unicode string, but
   see the :meth:`append` documentation for semantics.

   Optional *charset* serves two purposes: it has the same meaning as the *charset*
   argument to the :meth:`append` method, and it sets the default character set
   for all subsequent :meth:`append` calls that omit the *charset* argument. If
   *charset* is not provided in the constructor (the default), the ``us-ascii``
   character set is used both as *s*'s initial charset and as the default for
   subsequent :meth:`append` calls.

   The maximum line length can be specified explicitly via *maxlinelen*. To
   split the first line to a shorter length (to account for the field header,
   which isn't included in *s*, e.g. :mailheader:`Subject`), pass in the name of
   the field in *header_name*. The default *maxlinelen* is 76, and the default
   value for *header_name* is ``None``, meaning it is not taken into account for
   the first line of a long, split header.

   Optional *continuation_ws* must be :rfc:`2822`\ -compliant folding whitespace,
   and is usually either a space or a hard tab character. This character will be
   prepended to continuation lines.

   Optional *errors* is passed straight through to the :meth:`append` method.


   .. method:: append(s[, charset[, errors]])

      Append the string *s* to the MIME header.

      Optional *charset*, if given, should be a :class:`Charset` instance (see
      :mod:`email.charset`) or the name of a character set, which will be
      converted to a :class:`Charset` instance. A value of ``None`` (the
      default) means that the *charset* given in the constructor is used.

      *s* may be a byte string or a Unicode string. If it is a byte string
      (i.e. ``isinstance(s, str)`` is true), then *charset* is the encoding of
      that byte string, and a :exc:`UnicodeError` will be raised if the string
      cannot be decoded with that character set.

      If *s* is a Unicode string, then *charset* is a hint specifying the
      character set of the characters in the string. In this case, when
      producing an :rfc:`2822`\ -compliant header using :rfc:`2047` rules, the
      Unicode string will be encoded using the following charsets in order:
      ``us-ascii``, the *charset* hint, ``utf-8``. The first character set that
      does not provoke a :exc:`UnicodeError` is used.

      Optional *errors* is passed through to any :func:`unicode` or
      :meth:`unicode.encode` call, and defaults to ``"strict"``.


   .. method:: encode([splitchars])

      Encode a message header into an RFC-compliant format, possibly wrapping
      long lines and encapsulating non-ASCII parts in base64 or quoted-printable
      encodings. Optional *splitchars* is a string containing characters to
      split long ASCII lines on, in rough support of :rfc:`2822`'s *highest
      level syntactic breaks*. This doesn't affect :rfc:`2047` encoded lines.

   The :class:`Header` class also provides a number of methods to support
   standard operators and built-in functions.


   .. method:: __str__()

      A synonym for :meth:`Header.encode`. Useful for ``str(aHeader)``.


   .. method:: __unicode__()

      A helper for the built-in :func:`unicode` function. Returns the header as
      a Unicode string.


   .. method:: __eq__(other)

      This method allows you to compare two :class:`Header` instances for
      equality.


   .. method:: __ne__(other)

      This method allows you to compare two :class:`Header` instances for
      inequality.
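
As a rough illustration of how the constructor arguments, :meth:`append`,
:meth:`encode` and the comparison operators fit together, consider the sketch
below. The header text is invented for the example, and Unicode literals are
used so that the same code should behave alike on interpreters where plain
string literals are byte strings and on those where they are not (an assumption
of this sketch, not something the class requires).

```python
from email.header import Header

# Start with an ASCII chunk, then append a chunk in another character set.
# header_name lets the first line leave room for the "Subject: " prefix.
subject = Header(u'Hello', 'us-ascii', header_name='Subject')
subject.append(u'p\xf6stal', 'iso-8859-1')
encoded = subject.encode()  # only the non-ASCII chunk becomes an encoded word

# A long ASCII value is folded into continuation lines at whitespace.
long_text = u'The quick brown fox jumps over the lazy dog and keeps on running'
folded = Header(long_text, maxlinelen=40, header_name='Subject').encode()
lines = folded.split('\n')

# Headers built from the same value compare equal; different values do not.
same = Header(u'Hello') == Header(u'Hello')
different = Header(u'Hello') != Header(u'Goodbye')
```

Each entry in ``lines`` after the first begins with the continuation
whitespace (a single space by default), which is what marks it as a folded
continuation of the same header.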

The :mod:`email.header` module also provides the following convenience functions.


.. function:: decode_header(header)

   Decode a message header value without converting the character set. The header
   value is in *header*.

   This function returns a list of ``(decoded_string, charset)`` pairs containing
   each of the decoded parts of the header. *charset* is ``None`` for non-encoded
   parts of the header, otherwise a lowercase string containing the name of the
   character set specified in the encoded string.

   Here's an example::

      >>> from email.header import decode_header
      >>> decode_header('=?iso-8859-1?q?p=F6stal?=')
      [('p\xf6stal', 'iso-8859-1')]
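
Because :func:`decode_header` does not convert the character set, turning the
result into text is left to the caller. The helper below is a hypothetical
sketch of that step (``decoded_parts`` and ``fallback_charset`` are names made
up for this example); it also shows that the reported charset is lowercase
even when the header spells it in upper case.

```python
from email.header import decode_header

def decoded_parts(value, fallback_charset='ascii'):
    """Return each part of a header value as text (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(value):
        # Encoded parts come back as raw bytes in their original charset;
        # convert them to text explicitly.  Unencoded parts have charset None.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or fallback_charset)
        parts.append(chunk)
    return parts

# The charset name is reported in lowercase regardless of the input's case.
pairs = decode_header('=?ISO-8859-1?Q?p=F6stal?=')
texts = decoded_parts('=?ISO-8859-1?Q?p=F6stal?=')
```

The fallback charset is only a guess for parts that carry no charset of their
own; pick one that suits the mail you actually handle.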


.. function:: make_header(decoded_seq[, maxlinelen[, header_name[, continuation_ws]]])

   Create a :class:`Header` instance from a sequence of pairs as returned by
   :func:`decode_header`.

   :func:`decode_header` takes a header value string and returns a sequence of
   pairs of the format ``(decoded_string, charset)`` where *charset* is the name of
   the character set.

   This function takes one of those sequences of pairs and returns a :class:`Header`
   instance. Optional *maxlinelen*, *header_name*, and *continuation_ws* are as in
   the :class:`Header` constructor.
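
The two functions are designed to be chained. The following sketch runs a round
trip from an encoded header value to a :class:`Header` and back; the
``text_type`` shim is an assumption added so that the example can obtain the
Unicode form whether or not the interpreter has a built-in :func:`unicode`.

```python
from email.header import decode_header, make_header

try:
    text_type = unicode  # unicode(header) calls Header.__unicode__
except NameError:
    text_type = str      # no unicode built-in: str(header) returns text

original = '=?iso-8859-1?q?p=F6stal?='

# decode_header() yields (decoded_string, charset) pairs; make_header()
# turns them back into a Header, here reserving room for "Subject: ".
header = make_header(decode_header(original), header_name='Subject')

text = text_type(header)      # the human-readable value
re_encoded = header.encode()  # the RFC 2047 form again
```

For this input ``re_encoded`` matches ``original``, which makes the pair a
convenient way to normalize or re-wrap headers read from existing messages.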