Patool

Introduction

I could never remember the correct options for all those different compression programs. Tar, unzip, gzip - you name it and I forgot it. Patool remembers all those options for me now so I don’t have to.

Description

Various archive types can be created, extracted, tested, listed, compared, searched and repacked with patool. The advantage of patool is its simplicity in handling archive files without having to remember a myriad of programs and options.

The archive format is determined by the file(1) program and as a fallback by the archive file extension.

patool supports 7z (.7z, .cb7), ACE (.ace, .cba), ADF (.adf), ALZIP (.alz), APE (.ape), AR (.a), ARC (.arc), ARJ (.arj), BZIP2 (.bz2), BZIP3 (.bz3), CAB (.cab), CHM (.chm), COMPRESS (.Z), CPIO (.cpio), DEB (.deb), DMS (.dms), FLAC (.flac), GZIP (.gz), ISO (.iso), LRZIP (.lrz), LZH (.lha, .lzh), LZIP (.lz), LZMA (.lzma), LZOP (.lzo), RPM (.rpm), RAR (.rar, .cbr), RZIP (.rz), SHN (.shn), TAR (.tar, .cbt), UDF (.udf), XZ (.xz), ZIP (.zip, .jar, .cbz), ZOO (.zoo) and ZSTANDARD (.zst) archive formats.

It relies on helper applications to handle those archive formats (for example xz for XZ (.xz) archives).

The archive formats TAR, ZIP, BZIP2 and GZIP are supported natively and do not require helper applications to be installed.

Installation

The easy way with pip:

sudo pip install patool

And on Windows:

py.exe -m pip install patool

You will need Python 3.10 or later.

For more information, especially for installing additional tools on Windows, read the detailed installation instructions.

Running

After installation there should be a /usr/bin/patool binary on Unix systems; on Windows there should be a file c:\python3\scripts\patool.

On Linux or OSX systems, run patool directly; on Windows, use c:\python3\python3.exe c:\python3\scripts\patool.

See the following chapter for usage examples.

Examples

# extract two archives
patool extract archive.zip otherarchive.rar

# test if archive is intact
patool test --verbose dist.tar.gz

# list files inside an archive
patool list package.deb

# create a new archive
patool create --verbose myfiles.zip file1.txt dir/

# list differences between two archive contents
patool diff release1.0.tar.gz release2.0.zip

# search archive contents
patool search "def urlopen" python-3.3.tar.gz

# compress the archive in a different format
patool repack linux-2.6.33.tar.gz linux-2.6.33.tar.bz2

API

If you install patool, there is also a Python module patoolib. You can use functions in patoolib from other Python applications to handle archives.

Log output will be on sys.stdout and sys.stderr. On errors, PatoolError will be raised. Note that extra options or customization for specific archive programs are not supported.
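
As a minimal sketch of the error handling described above, the helper below wraps a single API call in a try/except block. The name is_intact is hypothetical and not part of patoolib; the sketch assumes patool is installed and uses only test_archive and PatoolError as documented in this chapter.

```python
def is_intact(archive):
    """Return True if patool can test `archive` without raising PatoolError."""
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    try:
        # verbosity=-1 silences output; interactive=False avoids waiting
        # for input in unattended scripts.
        patoolib.test_archive(archive, verbosity=-1, interactive=False)
    except patoolib.PatoolError as err:
        print(f"{archive}: {err}")
        return False
    return True
```

A caller could then write is_intact("dist.tar.gz") and branch on the boolean result instead of handling the exception at every call site.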

The following functions are currently supported as an API:

patoolib.list_formats() -> None

Print information about available archive formats to stdout.

Returns:

None

Return type:

None

patoolib.supported_formats(operations: Sequence[str] = ('list', 'extract', 'test', 'create')) -> list[str]

Return a list of supported archive formats for an iterable of operations.

Parameters:

operations (List|Tuple|Set|Dict[str]) – The operations to check for, defaults to ArchiveCommands.

Returns:

A list of supported archive formats.

Return type:

List[str]
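
One possible use is to check up front whether a format can be handled on the current machine. The helper name can_handle is made up for this sketch, and it assumes that formats are reported as lowercase names such as "zip", which is not stated above.

```python
def can_handle(fmt, operations=("extract",)):
    """Return True if `fmt` is reported as supported for all `operations`."""
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    # Assumption: format names are lowercase strings such as "zip" or "tar".
    return fmt in patoolib.supported_formats(operations=operations)
```

For example, can_handle("zip", ("extract", "create")) would ask whether ZIP archives can be both extracted and created.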

patoolib.list_archive(archive: str, verbosity: int = 1, program: str | None = None, interactive: bool = True, password: str | None = None) -> None

List given archive.

Example: patoolib.list_archive("package.deb")

Parameters:
  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. The default is 1 for this function, -1 or lower means no output, and values >= 1 print command output.

  • program (str or None) – If None (the default), a list of suitable archive programs is checked for existence in the system search path (defined by the PATH environment variable). If a program name is given, it is added to the list of programs that is searched for. The program should be a relative or absolute path name to an executable.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – If an archive does not exist or is not a regular file, or on errors while listing.

Returns:

None

Return type:

None

patoolib.extract_archive(archive: str, verbosity: int = 0, outdir: str | None = None, program: str | None = None, interactive: bool = True, password: str | None = None) -> str

Extract an archive file.

Extracting never overwrites existing files or directories. The original archive file is kept after extraction, even if all files were successfully extracted.

Example: patoolib.extract_archive("archive.zip", outdir="/tmp")

Parameters:
  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • outdir (str or None) – The directory where the archive should be extracted. A value of None (the default) uses the current working directory.

  • program (str or None) – If None (the default), a list of suitable archive programs is checked for existence in the system search path (defined by the PATH environment variable). If a program name is given, it is added to the list of programs that is searched for. The program should be a relative or absolute path name to an executable.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – If an archive does not exist or is not a regular file, or on errors while extracting.

Returns:

The directory where the archive has been extracted.

Return type:

str
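
Because extract_archive returns the output directory, a caller can unpack into a scratch location and inspect the result. The following is a sketch under the assumption that patool is installed; extract_to_tempdir is a hypothetical helper name, not a patoolib function.

```python
import os
import tempfile


def extract_to_tempdir(archive):
    """Extract `archive` into a fresh temporary directory.

    Returns the extraction directory and the sorted names found in it.
    """
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    outdir = tempfile.mkdtemp(prefix="patool-")
    extracted = patoolib.extract_archive(
        archive, outdir=outdir, verbosity=-1, interactive=False
    )
    return extracted, sorted(os.listdir(extracted))
```

The temporary directory is not removed automatically; a caller that no longer needs the files can delete it with shutil.rmtree.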

patoolib.test_archive(archive: str, verbosity: int = 0, program: str | None = None, interactive: bool = True, password: str | None = None) -> None

Test given archive.

Example: patoolib.test_archive("dist.tar.gz", verbosity=1)

Parameters:
  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • program (str or None) – If None (the default), a list of suitable archive programs is checked for existence in the system search path (defined by the PATH environment variable). If a program name is given, it is added to the list of programs that is searched for. The program should be a relative or absolute path name to an executable.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – If an archive does not exist or is not a regular file, or on errors while testing.

Returns:

None

Return type:

None

patoolib.create_archive(archive: str, filenames, verbosity: int = 0, program: str | None = None, interactive: bool = True, password: str | None = None) -> None

Create given archive with given files.

Example: patoolib.create_archive("/path/to/myfiles.zip", ("file1.txt", "dir/"))

Parameters:
  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • filenames (tuple of str) – The filenames to add to the archive. Each can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • program (str or None) – If None (the default), a list of suitable archive programs is checked for existence in the system search path (defined by the PATH environment variable). If a program name is given, it is added to the list of programs that is searched for. The program should be a relative or absolute path name to an executable.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – on errors while creating the archive

Returns:

None

Return type:

None
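
Since filenames is documented as a tuple of str, callers holding a list or pathlib.Path objects may want to normalise them first. A small sketch, assuming patool is installed; archive_paths is a hypothetical wrapper and the empty-input check is a choice made for this example, not patool behaviour.

```python
import os


def archive_paths(archive, paths):
    """Create `archive` from an iterable of str or os.PathLike paths."""
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    # Convert every entry to a plain string and collect them in a tuple,
    # matching the documented parameter type.
    filenames = tuple(os.fspath(p) for p in paths)
    if not filenames:
        # Guard added for this sketch only.
        raise ValueError("nothing to archive")
    patoolib.create_archive(archive, filenames, verbosity=-1, interactive=False)
    return archive
```

A call such as archive_paths("myfiles.zip", ["file1.txt", "dir/"]) then forwards ("file1.txt", "dir/") to create_archive.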

patoolib.diff_archives(archive1: str, archive2: str, verbosity: int = 0, interactive: bool = True) -> int

Compare two archives and print their differences.

Both archives will be extracted in temporary directories. Both directory contents will be compared recursively with the diff(1) tool.

Example: patoolib.diff_archives("release1.0.tar.gz", "release2.0.zip")

Parameters:
  • archive1 (str) – The first archive filename. Can be relative to the current working directory or absolute.

  • archive2 (str) – The second archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

Raises:

patoolib.PatoolError – on errors while comparing the archives.

Returns:

The exit code of the diff program.

Return type:

int

patoolib.search_archive(pattern: str, archive: str, verbosity: int = 0, interactive: bool = True, password: str | None = None) -> int

Search pattern in archive members.

The archive will be extracted in a temporary directory. The directory contents will then be searched with the grep(1) tool.

Example: patoolib.search_archive("def urlopen", "python3.3.tar.gz")

Parameters:
  • pattern (str) – The pattern to search for. See the grep(1) manual page for pattern syntax.

  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – on errors while extracting or searching the archive

Returns:

exit code of the grep program

Return type:

int
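
The returned exit code can be turned into a boolean. The sketch below relies on the usual grep(1) convention that 0 means at least one line matched and 1 means none did; archive_contains is a hypothetical name, and patool is assumed to be installed.

```python
def archive_contains(pattern, archive):
    """Return True if grep found `pattern` in any member of `archive`."""
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    exit_code = patoolib.search_archive(pattern, archive, interactive=False)
    # grep(1) exits with 0 when a match was found and 1 when there was none.
    return exit_code == 0
```

With this helper, archive_contains("def urlopen", "python3.3.tar.gz") reads as a simple yes/no question.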

patoolib.repack_archive(archive: str, archive_new: str, verbosity: int = 0, interactive: bool = True, password: str | None = None) -> None

Repack archive to different file and/or format.

The archive will be extracted and recompressed to archive_new.

Example: patoolib.repack_archive("linux-2.6.33.tar.gz", "linux-2.6.33.tar.bz2")

Parameters:
  • archive (str) – The archive filename. Can be relative to the current working directory or absolute.

  • archive_new (str) – The new archive filename. Can be relative to the current working directory or absolute.

  • verbosity (int) – Larger values print more information. 0 is the default, -1 or lower means no output, and values >= 1 print command output.

  • interactive (bool) – If True (the default), wait for user input if the extraction program asks for it. This should be set to True if you intend to type in a password interactively. If set to False, standard input will be set to an empty string to prevent simple hangs from programs requiring input.

  • password (str or None) – If an archive is encrypted, set the given password with command line options. Note that the password might be written to logs that keep track of your command line history. If an archive program does not support passwords this option is ignored by patool.

Raises:

patoolib.PatoolError – on errors while extracting or creating the archive

Returns:

None

Return type:

None

patoolib.is_archive(filename: str) -> bool

Detect if the file is a known archive.

Example: patoolib.is_archive("package.deb")

Parameters:

filename (str) – The filename to check. Can be relative to the current working directory or absolute.

Returns:

True if given filename is an archive file.

Return type:

bool
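
A typical use of is_archive is filtering a directory listing. This is a sketch only, assuming patool is installed; find_archives is a hypothetical helper and it looks at regular files directly inside the given directory, without recursing.

```python
import os


def find_archives(directory):
    """Return sorted paths of regular files in `directory` that are archives."""
    # Imported inside the function so that defining this helper does not
    # itself require patool to be installed.
    import patoolib

    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and other non-regular entries before asking patool.
        if os.path.isfile(path) and patoolib.is_archive(path):
            found.append(path)
    return found
```

The resulting list could be passed on to extract_archive or test_archive one entry at a time.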