4. Creating a Source Distribution

Python 2.7.10

As shown in section A Simple Example, you use the sdist command to create a source distribution. In the simplest case,

python setup.py sdist

(assuming you haven’t specified any sdist options in the setup script or config file), sdist creates the archive in the default format for the current platform. The default format is a gzip’ed tar file (.tar.gz) on Unix, and a ZIP file on Windows.

You can specify as many formats as you like using the --formats option, for example:

python setup.py sdist --formats=gztar,zip

to create a gzipped tarball and a zip file. The available formats are:

Format  Description                    Notes
zip     zip file (.zip)                (1),(3)
gztar   gzip’ed tar file (.tar.gz)     (2)
bztar   bzip2’ed tar file (.tar.bz2)
ztar    compressed tar file (.tar.Z)   (4)
tar     tar file (.tar)

Notes:

  1. default on Windows
  2. default on Unix
  3. requires either an external zip utility or the zipfile module (part of the standard Python library since Python 1.6)
  4. requires the compress program.
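To make the format names concrete, here is a minimal sketch that builds a gzipped tarball and a zip file from one directory using shutil.make_archive from the standard library, which accepts the names zip, gztar, bztar and tar. It only illustrates what the formats produce; it is not the code path sdist itself runs, and the helper name build_archives is hypothetical.

```python
import os
import shutil
import tempfile


def build_archives(source_dir, base_name, formats=("gztar", "zip")):
    """Archive source_dir once per requested format.

    base_name is the output path without an extension; the extension
    for each format is appended by shutil.make_archive.
    Returns the list of archive paths that were written.
    """
    source_dir = os.path.abspath(source_dir)
    parent = os.path.dirname(source_dir)   # archive paths are relative to this
    top = os.path.basename(source_dir)     # single top-level directory in the archive
    return [
        shutil.make_archive(os.path.abspath(base_name), fmt,
                            root_dir=parent, base_dir=top)
        for fmt in formats
    ]
```

Calling build_archives("example-1.0", "out/example-1.0") would be expected to leave a .tar.gz and a .zip side by side, much as --formats=gztar,zip does for a real distribution.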

When using any tar format (gztar, bztar, ztar or tar) under Unix, you can specify the owner and group names that will be set for each member of the archive.

For example, if you want all files of the archive to be owned by root:

python setup.py sdist --owner=root --group=root
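The owner and group names end up in the header of each archive member. The following sketch, which uses the standard tarfile module rather than the Distutils, shows where those names are stored and how to read them back. The helper names add_with_owner and member_owners are hypothetical.

```python
import io
import tarfile


def add_with_owner(archive, name, data, owner="root", group="root"):
    """Add an in-memory file to an open tarfile with the given owner/group names."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    info.uname = owner   # owner name recorded in the member header
    info.gname = group   # group name recorded in the member header
    archive.addfile(info, io.BytesIO(data))


def member_owners(tar_bytes):
    """Return {member name: (owner, group)} for a tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
        return {m.name: (m.uname, m.gname) for m in archive.getmembers()}
```

Running member_owners on the bytes of an archive produced with --owner=root --group=root should report ("root", "root") for every member.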

4.1. Specifying the files to distribute

If you don’t supply an explicit list of files (or instructions on how to generate one), the sdist command puts a minimal default set into the source distribution:

  • all Python source files implied by the py_modules and packages options
  • all C source files mentioned in the ext_modules or libraries options
  • scripts identified by the scripts option. See Installing Scripts.
  • anything that looks like a test script: test/test*.py (currently, the Distutils don’t do anything with test scripts except include them in source distributions, but in the future there will be a standard for testing Python module distributions)
  • README.txt (or README), setup.py (or whatever you called your setup script), and setup.cfg
  • all files that match the package_data metadata. See Installing Package Data.
  • all files that match the data_files metadata. See Installing Additional Files.

Sometimes this is enough, but usually you will want to specify additional files to distribute. The typical way to do this is to write a manifest template, called MANIFEST.in by default. The manifest template is just a list of instructions for how to generate your manifest file, MANIFEST, which is the exact list of files to include in your source distribution. The sdist command processes this template and generates a manifest based on its instructions and what it finds in the filesystem.

If you prefer to roll your own manifest file, the format is simple: one filename per line, regular files (or symlinks to them) only. If you do supply your own MANIFEST, you must specify everything: the default set of files described above does not apply in this case.

Changed in version 2.7: An existing generated MANIFEST will be regenerated without sdist comparing its modification time to the one of MANIFEST.in or setup.py.

Changed in version 2.7.1: MANIFEST files start with a comment indicating they are generated. Files without this comment are not overwritten or removed.

Changed in version 2.7.3: sdist will read a MANIFEST file if no MANIFEST.in exists, like it did before 2.7.
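The manifest format and the "generated" comment can be illustrated with a short sketch. It assumes the marker line reads "# file GENERATED by distutils, do NOT edit"; that wording is not given in this section and should be checked against the installed Distutils. Skipping comment lines when reading entries is likewise an assumption of this sketch, and both function names are hypothetical.

```python
# Assumed wording of the comment that marks a generated MANIFEST.
GENERATED_MARKER = "# file GENERATED by distutils, do NOT edit"


def is_generated(manifest_text):
    """True if the first line of the manifest is the generated-file comment."""
    lines = manifest_text.splitlines()
    return bool(lines) and lines[0].strip() == GENERATED_MARKER


def manifest_entries(manifest_text):
    """Return the filenames listed in a manifest, one per non-comment line."""
    entries = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        entries.append(line)
    return entries
```

A hand-written MANIFEST would make is_generated return False, which corresponds to the case where the file is not overwritten or removed.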

See The MANIFEST.in template section for a syntax reference.

4.3. The MANIFEST.in template

A MANIFEST.in file can be added to a project to define the list of files to include in the distribution built by the sdist command.

When sdist is run, it will look for the MANIFEST.in file and interpret it to generate the MANIFEST file that contains the list of files that will be included in the package.

This mechanism can be used when the default list of files is not enough. (See Specifying the files to distribute.)

4.3.1. Principle

The manifest template has one command per line, where each command specifies a set of files to include or exclude from the source distribution. For an example, let’s look at the Distutils’ own manifest template:

include *.txt
recursive-include examples *.txt *.py
prune examples/sample?/build

The meanings should be fairly clear: include all files in the distribution root matching *.txt, all files anywhere under the examples directory matching *.txt or *.py, and exclude all directories matching examples/sample?/build. All of this is done after the standard include set, so you can exclude files from the standard set with explicit instructions in the manifest template. (Or, you can use the --no-defaults option to disable the standard set entirely.)

The order of commands in the manifest template matters: initially, we have the list of default files as described above, and each command in the template adds to or removes from that list of files. Once we have fully processed the manifest template, we remove files that should not be included in the source distribution:

  • all files in the Distutils “build” tree (default build/)
  • all files in directories named RCS, CVS, .svn, .hg, .git, .bzr or _darcs

Now we have our complete list of files, which is written to the manifest for future reference, and then used to build the source distribution archive(s).

You can disable the default set of included files with the --no-defaults option, and you can disable the standard exclude set with --no-prune.

Following the Distutils’ own manifest template, let’s trace how the sdist command builds the list of files to include in the Distutils source distribution:

  1. include all Python source files in the distutils and distutils/command subdirectories (because packages corresponding to those two directories were mentioned in the packages option in the setup script—see section Writing the Setup Script)
  2. include README.txt, setup.py, and setup.cfg (standard files)
  3. include test/test*.py (standard files)
  4. include *.txt in the distribution root (this will find README.txt a second time, but such redundancies are weeded out later)
  5. include anything matching *.txt or *.py in the sub-tree under examples
  6. exclude all files in the sub-trees starting at directories matching examples/sample?/build—this may exclude files included by the previous two steps, so it’s important that the prune command in the manifest template comes after the recursive-include command
  7. exclude the entire build tree, and any RCS, CVS, .svn, .hg, .git, .bzr and _darcs directories
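The trace above can be sketched in a few lines of Python. This is a simplified model, not the Distutils implementation: it handles only the include, recursive-include and prune commands, matches patterns one path component at a time with fnmatch, and works on an in-memory list of slash-separated paths. All function names are hypothetical.

```python
import fnmatch

VCS_DIRS = {"RCS", "CVS", ".svn", ".hg", ".git", ".bzr", "_darcs"}


def _matches(path, pattern):
    """True if path and pattern have the same depth and match part by part."""
    parts, pats = path.split("/"), pattern.split("/")
    return len(parts) == len(pats) and all(
        fnmatch.fnmatchcase(part, pat) for part, pat in zip(parts, pats))


def _under(path, dir_pattern):
    """True if path lies below a directory matching dir_pattern."""
    parts, pats = path.split("/"), dir_pattern.split("/")
    return len(parts) > len(pats) and all(
        fnmatch.fnmatchcase(part, pat) for part, pat in zip(parts, pats))


def build_file_list(template, all_files, defaults, build_dir="build"):
    """Apply template commands in order, then drop build and VCS files."""
    chosen = list(defaults)                      # steps 1-3: the standard set
    for line in template.splitlines():
        words = line.split()
        if not words:
            continue
        command, args = words[0], words[1:]
        if command == "include":                 # step 4
            chosen += [f for f in all_files
                       if any(_matches(f, pat) for pat in args)]
        elif command == "recursive-include":     # step 5
            chosen += [f for f in all_files
                       if _under(f, args[0])
                       and any(fnmatch.fnmatchcase(f.rsplit("/", 1)[-1], pat)
                               for pat in args[1:])]
        elif command == "prune":                 # step 6
            chosen = [f for f in chosen if not _under(f, args[0])]
        else:
            raise ValueError("command not covered by this sketch: " + command)
    result = []
    for name in chosen:                          # step 7, plus de-duplication
        dirs = name.split("/")[:-1]
        if dirs[:1] == [build_dir] or VCS_DIRS.intersection(dirs):
            continue
        if name not in result:
            result.append(name)
    return result
```

Because prune only removes files that are already in the list, moving it ahead of recursive-include in the template would leave the build files in place, which is the ordering point made in step 6.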

Just like in the setup script, file and directory names in the manifest template should always be slash-separated; the Distutils will take care of converting them to the standard representation on your platform. That way, the manifest template is portable across operating systems.
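As an illustration of that conversion, the sketch below turns a slash-separated name into the native form using os.path.join. It is loosely modelled on the idea described here, not copied from the Distutils; the function name to_native_path is hypothetical, and rejecting names with a leading or trailing slash is a choice made for this sketch.

```python
import os


def to_native_path(pathname):
    """Convert a slash-separated relative path to the platform's form."""
    if not pathname:
        return pathname
    if pathname.startswith("/") or pathname.endswith("/"):
        # Sketch-level choice: only plain relative names are accepted.
        raise ValueError("expected a relative path without a trailing slash: %r"
                         % pathname)
    parts = [part for part in pathname.split("/") if part != "."]
    if not parts:
        return os.curdir
    return os.path.join(*parts)
```

On Windows, to_native_path("examples/sample1/run.py") would use backslashes, while on Unix the name comes back unchanged.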

4.3.2. Commands

The manifest template commands are:

Command                                Description
include pat1 pat2 ...                  include all files matching any of the listed patterns
exclude pat1 pat2 ...                  exclude all files matching any of the listed patterns
recursive-include dir pat1 pat2 ...    include all files under dir matching any of the listed patterns
recursive-exclude dir pat1 pat2 ...    exclude all files under dir matching any of the listed patterns
global-include pat1 pat2 ...           include all files anywhere in the source tree matching any of the listed patterns
global-exclude pat1 pat2 ...           exclude all files anywhere in the source tree matching any of the listed patterns
prune dir                              exclude all files under dir
graft dir                              include all files under dir

The patterns here are Unix-style “glob” patterns: * matches any sequence of regular filename characters, ? matches any single regular filename character, and [range] matches any of the characters in range (e.g., a-z, a-zA-Z, a-f0-9_.). The definition of “regular filename character” is platform-specific: on Unix it is anything except slash; on Windows anything except backslash or colon.
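The practical consequence is that * and ? never cross a directory boundary. The sketch below shows one way to express that rule as a regular expression. It is a hand-written translator for illustration, not the Distutils' own routine, and it takes a single separator character, so the Windows rule (backslash or colon) is not modelled. The function name glob_to_regex is hypothetical.

```python
import re


def glob_to_regex(pattern, sep="/"):
    """Compile a glob into a regex in which * and ? never match sep."""
    not_sep = "[^" + re.escape(sep) + "]"
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            out.append(not_sep + "*")        # any run of non-separator characters
        elif ch == "?":
            out.append(not_sep)              # exactly one non-separator character
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1 or end == i + 1:
                out.append(re.escape(ch))    # unterminated or empty: literal "["
            else:
                body = pattern[i + 1:end].replace("\\", "\\\\")
                out.append("[" + body + "]")  # character range, passed through
                i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out) + r"\Z")
```

With this translation, *.txt matches README.txt but not docs/README.txt, which is why a command such as recursive-include or global-include is needed to reach files in subdirectories.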