.. _tarfile-mod:

:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.


.. versionadded:: 2.3

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>


The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip or bz2 compression.
(:file:`.zip` files can be read and written using the :mod:`zipfile` module.)

Some facts and figures:

* reads and writes :mod:`gzip` and :mod:`bz2` compressed archives.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for the *sparse* extension.

* read/write support for the POSIX.1-2001 (pax) format.

  .. versionadded:: 2.6

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, \*\*kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* has to be a string of the form ``'filemode[:compression]'``; it defaults
   to ``'r'``. Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+

   Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If *mode* is not suitable
   to open a certain (compressed) file for reading, :exc:`ReadError` is raised. Use
   *mode* ``'r'`` to avoid this. If a compression method is not supported,
   :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a file object opened
   for *name*. It is supposed to be at position 0.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``. :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks. No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket file object or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access, see :ref:`tar-examples`. The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

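As an illustration of the ``'filemode[:compression]'`` modes, the following sketch builds a small gzip compressed archive in memory and opens it again. The helper name ``build_gzip_archive`` and the member name are made up for the example, and an in-memory buffer is assumed in place of a real file; mode ``'r'`` should detect the compression, while ``'r:'`` is expected to raise :exc:`ReadError`.

```python
import io
import tarfile


def build_gzip_archive(member_name, payload):
    """Return the bytes of a gzip compressed archive holding one member.

    Hypothetical helper, used only for this illustration.
    """
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w:gz")
    try:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    finally:
        tar.close()  # writes the end of the archive and the gzip trailer
    return buf.getvalue()


data = build_gzip_archive("hello.txt", b"hello world\n")

# Mode 'r' detects the gzip compression transparently.
tar = tarfile.open(fileobj=io.BytesIO(data), mode="r")
try:
    names = tar.getnames()
finally:
    tar.close()

# Mode 'r:' accepts uncompressed archives only, so the same bytes are rejected.
try:
    tarfile.open(fileobj=io.BytesIO(data), mode="r:")
    rejected = False
except tarfile.ReadError:
    rejected = True
```

This is why ``'r'`` is the recommended mode when the compression of an incoming archive is not known in advance.
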

.. class:: TarFile

   Class for reading and writing tar archives. Do not use this class directly;
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read.

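A minimal sketch of :func:`is_tarfile`, assuming a scratch directory created with :mod:`tempfile`; the file names are invented for the example. It checks one real archive and one plain text file.

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "notes.txt")
archive_path = os.path.join(workdir, "sample.tar")

# A plain text file that is clearly not a tar archive.
handle = open(text_path, "w")
handle.write("this is plain text, not a tar archive\n")
handle.close()

# A real archive that contains the text file.
tar = tarfile.open(archive_path, "w")
try:
    tar.add(text_path, arcname="notes.txt")
finally:
    tar.close()

# Map each base name to the result of is_tarfile().
results = dict((os.path.basename(path), tarfile.is_tarfile(path))
               for path in (archive_path, text_path))
```
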

.. class:: TarFileCompat(filename, mode='r', compression=TAR_PLAIN)

   Class for limited access to tar archives with a :mod:`zipfile`\ -like interface.
   Please consult the documentation of the :mod:`zipfile` module for more details.
   *compression* must be one of the following constants:


   .. data:: TAR_PLAIN

      Constant for an uncompressed tar archive.


   .. data:: TAR_GZIPPED

      Constant for a :mod:`gzip` compressed tar archive.


   .. deprecated:: 2.6
      The :class:`TarFileCompat` class has been deprecated for removal in Python 3.0.


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.

   .. versionadded:: 2.6


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`GNU_FORMAT`.


The following variables are available on module level:


.. data:: ENCODING

   The default character encoding, i.e. the value from either
   :func:`sys.getfilesystemencoding` or :func:`sys.getdefaultencoding`.


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   `GNU tar manual, Basic Tar Format <http://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.


.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors=None, pax_headers=None, debug=0, errorlevel=0)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file or ``'w'`` to create a new file overwriting an existing
   one.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level.

   .. versionadded:: 2.6

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   .. versionadded:: 2.6

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError` or
   :exc:`IOError` exceptions. If ``2``, all *non-fatal* errors are raised as
   :exc:`TarError` exceptions as well.

   The *encoding* and *errors* arguments control the way strings are converted to
   unicode objects and vice versa. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   .. versionadded:: 2.6

   The *pax_headers* argument is an optional dictionary of unicode strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionadded:: 2.6

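The constructor arguments are normally passed as keyword arguments to :func:`tarfile.open`. The sketch below assumes an in-memory buffer and an invented ``comment`` key; it writes a pax archive with a global header and reads the header back through the :attr:`pax_headers` attribute.

```python
import io
import tarfile

buf = io.BytesIO()

# format and pax_headers are handed on to the TarFile constructor.
tar = tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                   pax_headers={u"comment": u"nightly backup"})
try:
    payload = b"example data\n"
    info = tarfile.TarInfo(name="data.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
finally:
    tar.close()

# Read the archive back; the global header is not listed as a member.
buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
try:
    names = tar.getnames()
    global_headers = dict(tar.pax_headers)
finally:
    tar.close()
```
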

.. method:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.

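The lookup methods can be sketched as follows. The helper ``make_archive`` and the member names are assumptions made for the example; the archive deliberately stores ``a.txt`` twice to show which occurrence :meth:`getmember` returns.

```python
import io
import tarfile


def make_archive(entries):
    """Build an uncompressed in-memory archive from (name, bytes) pairs.

    Hypothetical helper, used only for this illustration.
    """
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w")
    try:
        for name, payload in entries:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    finally:
        tar.close()
    buf.seek(0)
    return buf


buf = make_archive([("a.txt", b"old"), ("b.txt", b"bee"), ("a.txt", b"newer")])
tar = tarfile.open(fileobj=buf, mode="r")
try:
    names = tar.getnames()                         # archive order, duplicates kept
    sizes = [member.size for member in tar.getmembers()]
    latest_size = tar.getmember("a.txt").size      # last occurrence wins
    try:
        tar.getmember("missing.txt")
        missing_raises = False
    except KeyError:
        missing_raises = True
finally:
    tar.close()
```
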

.. method:: TarFile.list(verbose=True)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if there is no more
   available.


.. method:: TarFile.extractall(path=".", members=None)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

   .. versionadded:: 2.5

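One possible way to inspect an archive before calling :meth:`extractall` is to filter its members by the path they would be written to. The generator ``safe_members`` below is a hypothetical helper, not part of the module, and the directory name ``unpacked`` is an assumption. It only checks member names; it does not examine the targets of symbolic or hard links, so it is a sketch rather than a complete safeguard.

```python
import io
import os
import tarfile


def safe_members(tar, dest):
    """Yield the members whose path stays inside *dest* (hypothetical helper)."""
    root = os.path.realpath(dest)
    for member in tar.getmembers():
        # An absolute name or ".." components make the target leave root.
        target = os.path.realpath(os.path.join(root, member.name))
        if target.startswith(root + os.sep):
            yield member


# Build a small archive in memory that contains two hostile names.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
try:
    for name in ("ok.txt", "../evil.txt", "sub/../../escape.txt"):
        tar.addfile(tarfile.TarInfo(name=name))
finally:
    tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
try:
    allowed = [member.name for member in safe_members(tar, "unpacked")]
finally:
    tar.close()
```

The filtered members could then be passed on, for instance as ``tar.extractall(path="unpacked", members=safe_members(tar, "unpacked"))``.
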

.. method:: TarFile.extract(member, path="")

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be a filename
   or a :class:`TarInfo` object. If *member* is a regular file, a file-like object
   is returned. If *member* is a link, a file-like object is constructed from the
   link's target. If *member* is none of the above, :const:`None` is returned.

   .. note::

      The file-like object is read-only and provides the following methods:
      :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`seek`, :meth:`tell`.

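A short sketch of :meth:`extractfile`, assuming an in-memory archive with one regular file and one directory (both names are invented). The regular file is read without touching the file system, while the directory yields :const:`None`.

```python
import io
import tarfile

# Build an archive holding a regular file and a directory entry.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
try:
    payload = b"hello\nworld\n"
    info = tarfile.TarInfo(name="greeting.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

    directory = tarfile.TarInfo(name="docs")
    directory.type = tarfile.DIRTYPE
    tar.addfile(directory)
finally:
    tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
try:
    member_file = tar.extractfile("greeting.txt")
    first_line = member_file.readline()
    rest = member_file.read()
    directory_result = tar.extractfile("docs")  # neither a file nor a link
finally:
    tar.close()
```
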

.. method:: TarFile.add(name, arcname=None, recursive=True, exclude=None)

   Add the file *name* to the archive. *name* may be any type of file (directory,
   fifo, symbolic link, etc.). If given, *arcname* specifies an alternative name
   for the file in the archive. Directories are added recursively by default. This
   can be avoided by setting *recursive* to :const:`False`. If *exclude* is given,
   it must be a function that takes one filename argument and returns a boolean
   value. Depending on this value the respective file is either excluded
   (:const:`True`) or added (:const:`False`).

   .. versionchanged:: 2.6
      Added the *exclude* parameter.


.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   ``tarinfo.size`` bytes are read from it and added to the archive. You can
   create :class:`TarInfo` objects using :meth:`gettarinfo`.

   .. note::

      On Windows platforms, *fileobj* should always be opened with mode ``'rb'`` to
      avoid inconsistencies between ``tarinfo.size`` and the number of bytes read.

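:meth:`addfile` also works for data that never existed on disk. The sketch below assumes an in-memory payload and invented values for the name, owner and timestamp; it fills in a :class:`TarInfo` by hand and reads the stored metadata back.

```python
import io
import tarfile

payload = b"generated at runtime\n"

info = tarfile.TarInfo(name="reports/summary.txt")
info.size = len(payload)   # addfile() reads exactly this many bytes
info.mtime = 1200000000    # fixed timestamp, chosen for the example
info.mode = 0o644
info.uname = "builder"

buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
try:
    tar.addfile(info, io.BytesIO(payload))
finally:
    tar.close()

# Read the member back to confirm what was stored.
buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
try:
    stored = tar.getmember("reports/summary.txt")
    content = tar.extractfile(stored).read()
finally:
    tar.close()
```
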

.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object for either the file *name* or the file object
   *fileobj* (using :func:`os.fstat` on its file descriptor). You can modify some
   of the :class:`TarInfo`'s attributes before you add it using :meth:`addfile`.
   If given, *arcname* specifies an alternative name for the file in the archive.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.posix

   Setting this to :const:`True` is equivalent to setting the :attr:`format`
   attribute to :const:`USTAR_FORMAT`, :const:`False` is equivalent to
   :const:`GNU_FORMAT`.

   .. versionchanged:: 2.4
      *posix* defaults to :const:`False`.

   .. deprecated:: 2.6
      Use the :attr:`format` attribute instead.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.

   .. versionadded:: 2.6


.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.


.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. method:: TarInfo.frombuf(buf)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   .. versionadded:: 2.6
      Raises :exc:`HeaderError` if the buffer is invalid.


.. method:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.

   .. versionadded:: 2.6


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='strict')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 2.6
      The arguments were added.

A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name

   Name of the archive member.


.. attribute:: TarInfo.size

   Size in bytes.


.. attribute:: TarInfo.mtime

   Time of last modification.


.. attribute:: TarInfo.mode

   Permission bits.


.. attribute:: TarInfo.type

   File type. *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is_*()`` methods below.


.. attribute:: TarInfo.linkname

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid

   User ID of the user who originally stored this member.


.. attribute:: TarInfo.gid

   Group ID of the user who originally stored this member.


.. attribute:: TarInfo.uname

   User name.


.. attribute:: TarInfo.gname

   Group name.


.. attribute:: TarInfo.pax_headers

   A dictionary containing key-value pairs of an associated pax extended header.

   .. versionadded:: 2.6

A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.

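The query methods can be combined into a small classifier. The function ``describe`` and the member names below are assumptions made for this sketch; the archive is built in memory so that each branch is exercised.

```python
import io
import tarfile


def describe(member):
    """Return a short description of a TarInfo object (hypothetical helper)."""
    if member.isreg():
        return "regular file"
    if member.isdir():
        return "directory"
    if member.issym():
        return "symbolic link to " + member.linkname
    if member.islnk():
        return "hard link to " + member.linkname
    if member.isdev():
        return "device or FIFO"
    return "something else"


# (name, type, link target) for each example member.
entries = [
    ("README", tarfile.REGTYPE, ""),
    ("src", tarfile.DIRTYPE, ""),
    ("latest", tarfile.SYMTYPE, "README"),
    ("pipe", tarfile.FIFOTYPE, ""),
]

buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
try:
    for name, kind, target in entries:
        info = tarfile.TarInfo(name=name)
        info.type = kind
        info.linkname = target
        tar.addfile(info)
finally:
    tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf, mode="r")
try:
    kinds = dict((member.name, describe(member)) for member in tar.getmembers())
finally:
    tar.close()
```
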

.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

   import os
   import tarfile

   def py_files(members):
       for tarinfo in members:
           if os.path.splitext(tarinfo.name)[1] == ".py":
               yield tarinfo

   tar = tarfile.open("sample.tar.gz")
   tar.extractall(members=py_files(tar))
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       print tarinfo.name, "is", tarinfo.size, "bytes in size and is",
       if tarinfo.isreg():
           print "a regular file."
       elif tarinfo.isdir():
           print "a directory."
       else:
           print "something else."
   tar.close()

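How the stream modes of :func:`tarfile.open` might be used is sketched below. An in-memory buffer stands in for a pipe or socket, which is an assumption of the example, as are the member names. The archive is written with ``'w|gz'`` and read back with ``'r|*'``; each member is read as soon as it is reached, because a stream cannot seek back to earlier members.

```python
import io
import tarfile

stream = io.BytesIO()  # stands in for a pipe, socket or tape device

# Write a gzip compressed stream of tar blocks.
tar = tarfile.open(fileobj=stream, mode="w|gz")
try:
    for name, payload in [("one.txt", b"1\n"), ("two.txt", b"22\n")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
finally:
    tar.close()

# Read the stream sequentially, detecting the compression transparently.
stream.seek(0)
seen = []
tar = tarfile.open(fileobj=stream, mode="r|*")
try:
    for member in tar:
        # Consume each member before moving on to the next one.
        seen.append((member.name, tar.extractfile(member).read()))
finally:
    tar.close()
```
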

.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters. The
  maximum file size is 8 gigabytes. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 gigabytes and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names; sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames, large
  files and stores pathnames in a portable way. However, not all tar
  implementations today are able to handle pax archives properly.

  The *pax* format is an extension to the existing *ustar* format. It uses extra
  headers for information that cannot be stored otherwise. There are two flavours
  of pax headers: extended headers only affect the subsequent file header, while
  global headers are valid for the complete archive and affect all following
  files. All the data in a pax header is encoded in *UTF-8* for portability
  reasons.

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, and there is no user/group name information. Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.

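The difference in filename limits between the three formats can be observed with a small experiment. The helper ``try_long_name`` is hypothetical, and the sketch assumes that a name which does not fit the ustar header is reported as :exc:`ValueError`, which is the behaviour of the implementations this example was written against.

```python
import io
import tarfile


def try_long_name(fmt, name):
    """Return True if a member called *name* can be stored using *fmt*.

    Hypothetical helper, used only for this illustration.
    """
    tar = tarfile.open(fileobj=io.BytesIO(), mode="w", format=fmt)
    try:
        tar.addfile(tarfile.TarInfo(name=name))
        return True
    except ValueError:
        # Assumption: an over-long ustar name is rejected with ValueError.
        return False
    finally:
        tar.close()


# A single 300 character path component cannot be split to fit ustar.
long_name = "x" * 300
outcome = {
    "ustar": try_long_name(tarfile.USTAR_FORMAT, long_name),
    "gnu": try_long_name(tarfile.GNU_FORMAT, long_name),
    "pax": try_long_name(tarfile.PAX_FORMAT, long_name),
}
```
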
.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (that all other formats are merely variants of)
is that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-ASCII characters. Names (i.e.
filenames, linknames, user/group names) containing these characters will appear
damaged. Unfortunately, there is no way to autodetect the encoding of an
archive.

The pax format was designed to solve this problem. It stores non-ASCII names
using the universal character encoding *UTF-8*. When a pax archive is read,
these *UTF-8* names are converted to the encoding of the local file system.

The details of unicode conversion are controlled by the *encoding* and *errors*
keyword arguments of the :class:`TarFile` class.

The default value for *encoding* is the local character encoding. It is deduced
from :func:`sys.getfilesystemencoding` and :func:`sys.getdefaultencoding`. In
read mode, *encoding* is used exclusively to convert unicode names from a pax
archive to strings in the local character encoding. In write mode, the use of
*encoding* depends on the chosen archive format. In case of :const:`PAX_FORMAT`,
input names that contain non-ASCII characters need to be decoded before being
stored as *UTF-8* strings. The other formats do not make use of *encoding*
unless unicode objects are used as input names. These are converted to 8-bit
character strings before they are added to the archive.

The *errors* argument defines how characters are treated that cannot be
converted to or from *encoding*. Possible values are listed in section
:ref:`codec-base-classes`. In read mode, there is an additional scheme
``'utf-8'`` which means that bad characters are replaced by their *UTF-8*
representation. This is the default scheme. In write mode the default value for
*errors* is ``'strict'`` to ensure that name information is not altered
unnoticed.