SciWork 2020 Packaging Tutorial¶
This tutorial walks you through how to package a simple Python project. It will show you how to add the necessary files and structure to create the package, how to build the package, and how to upload it to a package index (such as PyPI).
Set up a Development Environment¶
Please have the following ready before you start:
- Any Python installation with pip available.
- Conda from either Miniconda or Anaconda.
Please arrive early if you are not sure whether everything is set up. The lecturer will help you get things ready.
Install development tools¶
Run the following commands to install or update the needed development tools.
Anaconda:
conda install conda-build
On Windows:
py -3 -m pip install --upgrade pip setuptools wheel twine
On Mac or Linux:
python3 -m pip install --upgrade pip setuptools wheel twine
Note
For brevity, we will use py -3 from here on. Please substitute python3 for it as needed, if you used python3 for the command above.
Register an account on Test PyPI¶
Register an account at https://test.pypi.org/account/register/.
Follow the steps to complete registration. You will also need to verify your email address to be able to upload packages.
Put code in a directory¶
This tutorial uses a simple project named sampleproject. The complete
source is available at https://github.com/pypa/sampleproject.git. We will be
creating the project from scratch, but you can use the repository as a
reference if anything is not working.
If you already have a project that you want to package up, and feel confident
doing so, simply replace the sample directory with the one you want to use,
and replace the metadata as appropriate.
We start with the following file structure:
sampleproject/
    sample/
        __init__.py
All of the commands in this tutorial need to be run within the top-level
directory just created. Be sure to cd sampleproject into the project
directory so that the following commands run successfully.
Note
Put some code in __init__.py so it is easy to test. For example:
def greet():
    print("Hello!")
And test it like this:
$ py -3 -c "import sample; sample.greet()"
Hello!
Describe the project¶
Create some additional files in the project root directory, alongside the
sample directory. You should end up with the following structure:
sampleproject/
    sample/
        __init__.py
    pyproject.toml
    README.md
    setup.cfg
    setup.py
Creating README.md¶
A README file is essential for people (including yourself in the future) to understand what the project is doing. You can put some text inside this file to describe the project, using the GitHub-flavored Markdown syntax. Put in the following content if you are not sure what to write:
# Example Package
This is a simple example package. You can use
GitHub-flavored Markdown to write your content.
Creating pyproject.toml¶
Add the following content:
This file describes how we want to package the project. We will be using Setuptools, a packaging tool maintained by the Python Packaging Authority (PyPA).
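As a minimal sketch, the helper below writes a pyproject.toml that selects Setuptools as the build backend. The content of the `[build-system]` table (in particular the `requires` list) and the helper name `write_pyproject` are assumptions for illustration, not taken from this tutorial; adjust them to your project.

```python
from pathlib import Path

# Assumed minimal content: a [build-system] table that selects the
# Setuptools build backend. The requirement list is illustrative only.
PYPROJECT_TOML = """\
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
"""


def write_pyproject(project_root):
    """Write pyproject.toml into project_root and return its path."""
    path = Path(project_root) / "pyproject.toml"
    path.write_text(PYPROJECT_TOML, encoding="utf-8")
    return path
```

Calling `write_pyproject(".")` from the project root would create the file next to README.md.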
Creating setup.cfg¶
This file tells Setuptools about the project (such as the name and version), as well as which code files to include.
Change the name value to include your username (for example,
sampleproject-uranusjr) to make the project name unique, so that it does not
conflict with other people's projects when you upload it. Alternatively, use
any name you like, as long as it is not already registered (search on PyPI to
find out).
[metadata]
name = sampleproject-YOUR-USERNAME
version = 0.0.1
author = Example Author
author_email = author@example.com
description = A small example package
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/pypa/sampleproject
license = MIT
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent
[options]
packages = find:
python_requires = >=3.6
Here is a run-down of the configuration:
- name is the distribution name of your package. This can be any name as long as it only contains letters, numbers, _, and -. It also must not already be taken on PyPI. Be sure to update this with your username, as this ensures you won’t try to upload a package with the same name as one that already exists.
- version is the package version. See PEP 440 for more details on versions.
- author and author_email are used to identify the author of the package.
- description is a short, one-sentence summary of the package.
- long_description is a detailed description of the package. This is shown on the package detail page on PyPI. Here, we tell Setuptools to use the content of README.md, which is a common pattern.
- long_description_content_type tells the index what type of markup is used for the long description. In this case, it’s Markdown.
- url is the URL for the homepage of the project. For many projects, this will just be a link to GitHub, GitLab, Bitbucket, or a similar code hosting service.
- license describes what license this project uses. Here we use MIT, but there are many more choices available. Omit this field if you do not intend your code to be open source.
- classifiers gives the index and pip some additional metadata about your package. In this case, the package is described as:
  - Only compatible with Python 3
  - Uses the MIT license
  - Does not care about operating systems
  A complete list of classifiers can be found at https://pypi.org/classifiers/.
- packages is a list of all Python packages that should be included in the distribution package. Instead of listing each package manually, we can use the find: directive to automatically discover all packages and subpackages. This is also a common pattern unless you have an uncommon project layout.
- python_requires describes what versions of Python this project is compatible with. Here, we only allow the project to be installed on Python 3.6 or later.
The keys listed above are a relatively minimal set, and there are a few more you can specify. Visit the documentation on setup.cfg to find a comprehensive list of available configurations.
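Since setup.cfg is an INI-style file, a quick way to sanity-check it before building is to parse it with the standard library. The sketch below embeds a trimmed copy of the configuration as a string so that it is self-contained; the function name `summarize` is hypothetical, and in a real project you would read the actual file with `parser.read("setup.cfg")` instead.

```python
import configparser

# Trimmed, illustrative copy of the configuration shown above.
SETUP_CFG = """\
[metadata]
name = sampleproject-YOUR-USERNAME
version = 0.0.1
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
packages = find:
python_requires = >=3.6
"""


def summarize(cfg_text):
    """Return (name, version, classifiers) parsed from setup.cfg text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    metadata = parser["metadata"]
    # Multi-line values arrive as one string; split it into clean entries.
    classifiers = [
        line.strip()
        for line in metadata.get("classifiers", "").splitlines()
        if line.strip()
    ]
    return metadata["name"], metadata["version"], classifiers


name, version, classifiers = summarize(SETUP_CFG)
print(name, version)
for classifier in classifiers:
    print("  " + classifier)
```

Note that the classifier entries must be indented under their key; otherwise the parser treats each line as a new key.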
Creating setup.py¶
setup.py is a script to call Setuptools. It can be used to include custom
logic to build the project, but we are using all defaults here. Simply put:
import setuptools
setuptools.setup()
Distribute the Project¶
With the metadata declared, we can now ask Setuptools to generate distribution packages, which are archives containing files to be installed on the target machine.
Build the distributions¶
Use the following command to build distributions:
py -3 setup.py sdist bdist_wheel
This command should output a lot of text. Once completed, it should generate two files in the dist directory:
dist/
sampleproject_YOUR_USERNAME-0.0.1-py3-none-any.whl
sampleproject-YOUR-USERNAME-0.0.1.tar.gz
The tar.gz file is a source distribution (sdist), and the .whl file
is a built distribution in the wheel format. Newer pip versions
prefer to install built distributions, but will fall back to sdist if there are
no built distributions compatible with the user’s platform. In this case, our
example package is compatible with Python on any platform, so only one built
distribution is needed.
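The wheel file name itself encodes this compatibility information. As a rough sketch (the helper name `parse_wheel_filename` is hypothetical, and it does only the simple dash-splitting described in the wheel naming convention), the tags can be pulled apart like this:

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components.

    Wheel names follow the pattern
    {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,      # e.g. py3: any Python 3
        "abi": abi_tag,            # e.g. none: no compiled ABI needed
        "platform": platform_tag,  # e.g. any: platform independent
    }


info = parse_wheel_filename("sampleproject_YOUR_USERNAME-0.0.1-py3-none-any.whl")
print(info["python"], info["abi"], info["platform"])
```

For our example the three tags are py3, none, and any, which is why a single wheel is enough for every platform.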
Warning
If you made mistakes in your setup.cfg and want to change it, make sure
to delete previously generated metadata before running setup.py again.
On Windows:
rmdir /q /s build dist
rmdir /q /s sampleproject_YOUR_USERNAME.egg-info
On macOS and Linux:
rm -rf build dist sampleproject_YOUR_USERNAME.egg-info
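If you prefer one command that works the same on every platform, the cleanup can also be sketched in Python. The function name `clean_build_artifacts` is hypothetical; it removes only the build and dist directories and any directory ending in .egg-info directly under the given root.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root="."):
    """Delete build/, dist/ and *.egg-info directories under project_root.

    Returns the sorted names of the directories that were removed.
    """
    root = Path(project_root)
    targets = [root / "build", root / "dist", *root.glob("*.egg-info")]
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target.name)
    return sorted(removed)
```

Run `clean_build_artifacts()` from the project root; directories that do not exist are simply skipped.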
Upload the distributions¶
Now we can upload the files to the package index. For demonstration purposes, we will be uploading to Test PyPI. This is a separate instance of the package index, intended for testing and experimentation. This is great for things like this tutorial, where we don’t necessarily want to upload “for real.”
Make sure you have registered an account on Test PyPI before you continue with the tutorial.
Use the following command to upload the distributions built in the previous section:
py -3 -m twine upload --repository-url https://test.pypi.org/legacy/ dist/*
Enter your username and password when prompted.
Once the command finishes successfully, your package should be viewable on Test PyPI at e.g. https://test.pypi.org/project/sampleproject-YOUR-USERNAME.
Congratulations, you have successfully packaged and distributed a Python project!
Test package installation¶
The package uploaded to Test PyPI can be installed with pip, like you would any other package, by explicitly specifying the “index” to install from:
py -3 -m pip install --index-url https://test.pypi.org/simple sampleproject-YOUR-USERNAME
Now try to use it by running py -3:
>>> import sample
>>> sample.greet()
Hello!
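To confirm which version pip actually installed, you can also query the installed metadata from Python. This is a small sketch assuming Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("sampleproject-YOUR-USERNAME"))
```

Note that the lookup uses the distribution name from setup.cfg, not the import name sample.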
Optional: Release your package to PyPI.org¶
By uploading the package to PyPI.org (instead of Test PyPI), you are telling the world the package is ready for download. The steps are simple: register an account on PyPI.org, build the distributions like before, and change the repository URL in the upload command:
py -3 -m twine upload --repository-url https://upload.pypi.org/legacy/ dist/*
Note that the upload host is upload.pypi.org, not just pypi.org with the test. part removed. Since this is the default repository for Twine, you can also omit the --repository-url option entirely.
That’s it! Now people can install your package directly:
py -3 -m pip install sampleproject-YOUR-USERNAME
There are some caveats though:
- Files uploaded to PyPI are immutable. More specifically, although deletion is possible, you cannot re-upload a file. So be extra careful before you release to PyPI. The only way to fix a mistake on PyPI is to release a new version.
- Following the previous point, it is usually a good idea to delete the build, dist, and .egg-info directories every time you want to release a new version, to make sure the files are all in the newest version.
- Accounts and packages on Test PyPI and (real) PyPI are all distinct. You do not automatically own a package on PyPI by uploading it to Test PyPI, and vice versa.
Distribute a Conda Package¶
As with Setuptools, we also need to provide both a description and build instructions for a Conda package. More specifically, three files are needed:
- meta.yaml
- build.sh
- bld.bat
Put them in the project root (alongside other metadata files such as
README.md and pyproject.toml).
Creating meta.yaml¶
This is used to describe the package.
package:
  name: sampleproject-YOUR-USERNAME
  version: "0.0.1"
about:
  home: https://github.com/pypa/sampleproject
  license: MIT
  summary: A small example package
source:
  fn: sampleproject-YOUR-USERNAME-0.0.1.tar.gz
  url: https://test.pypi.org/packages/source/s/sampleproject-YOUR-USERNAME/sampleproject-YOUR-USERNAME-0.0.1.tar.gz
  md5: "39180d64b5021c5399a2568fe2103cd2"
requirements:
  build:
    - pip >=19
    - python >=3.6
  run:
    - python >=3.6
Note
Remember to replace YOUR-USERNAME with your own name.
Warning
Make sure to quote the version and md5 fields! Otherwise YAML may
interpret them as numbers. Also, the extension is yaml, not yml.
Here, we give Conda basic project information (the package and about
sections), where to download the source (source), and what other packages
are needed in order to build and run it.
Unlike PyPI, Conda does not allow uploading source code (only built files), so
you need to upload your code somewhere else, and use meta.yaml to tell
Conda where it is.
Here, we are re-using our data on Test PyPI (or the real PyPI, if you
eventually release the package to the public). The package URL is predictable,
and based on your package name (the first /s/ part is the package name’s
first letter).
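The URL pattern can be sketched as a small helper. This simply reproduces the layout of the url value in meta.yaml above; the function name `sdist_url` and its `host` parameter are hypothetical, and the sketch makes no claim about other hosts.

```python
def sdist_url(name, version, host="test.pypi.org"):
    """Build the source distribution URL in the form used in meta.yaml."""
    filename = f"{name}-{version}.tar.gz"
    # The path segment after /source/ is the first letter of the name.
    return f"https://{host}/packages/source/{name[0]}/{name}/{filename}"


print(sdist_url("sampleproject-YOUR-USERNAME", "0.0.1"))
```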
The MD5 value is used by Conda to make sure it downloads the correct file. It can be calculated using a local command.
On Windows:
fciv -md5 dist\sampleproject-YOUR-USERNAME-0.0.1.tar.gz
On macOS:
md5 dist/sampleproject-YOUR-USERNAME-0.0.1.tar.gz
On Linux:
md5sum dist/sampleproject-YOUR-USERNAME-0.0.1.tar.gz
Alternatively, you can also view this on your PyPI project page at
https://test.pypi.org/project/sampleproject-YOUR-USERNAME/#files. Click on the
View button beside the .tar.gz file, and copy the hash digest field on
the MD5 row.
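If none of the platform tools is at hand, the same digest can be computed with Python's standard library. This is a minimal sketch; the helper name `md5_of_file` is hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `md5_of_file("dist/sampleproject-YOUR-USERNAME-0.0.1.tar.gz")` returns the string to put (quoted) in the md5 field.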
Creating build scripts¶
Conda packages use two files to build: build.sh and bld.bat. These two
do basically the same thing, but build.sh is used on Linux and macOS, and
bld.bat is used on Windows. It is highly recommended that you provide both
if possible (unless you really don’t want the package to be installed on a
platform).
build.sh:
"$PYTHON" -m pip install --no-deps .
bld.bat:
"%PYTHON%" -m pip install --no-deps .
exit %errorlevel%
Building the package¶
In the project root (where meta.yaml is), use conda build to generate a
Conda package:
conda build .
This will generate a lot of text. When it finishes, you should see something like the following near the end:
TEST START: ~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
Nothing to test for: ~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
# Automatic uploading is disabled
# If you want to upload package(s) to anaconda.org later, type:
anaconda upload ~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
This means that the package has been built successfully. The TEST START and
anaconda upload lines also show the location of the package; in this case
it is:
~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
Save this path somewhere; it will be useful in the following (optional) sections.
Testing the Conda package¶
You can try installing the newly built package into your Conda environment:
conda install --use-local sampleproject-YOUR-USERNAME
The --use-local flag instructs Conda to install from the local conda-build
channel on your computer, rather than the default Anaconda or conda-forge
channels. If installed successfully, the package will be present in the
output of conda list.
Optional: Building for a different Python version¶
The package above was built against Python 3.8 (the file name contains a
py38 part). Conda by default builds packages for the version of Python
installed in the root environment. To build packages for other versions of
Python, use the --python flag followed by a version:
conda build --python 3.6 .
Notice that the file name printed at the end of the output changes to reflect
the requested version of Python. When you run conda install, it will look in
the package directory for the file that matches your current Python version.
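To see how the Python version shows up in the file name, here is a small sketch that splits a package file name into its name, version, and build string, assuming the name-version-build.tar.bz2 layout seen in the build output above. The helper name `parse_conda_filename` is hypothetical.

```python
def parse_conda_filename(filename):
    """Split NAME-VERSION-BUILD.tar.bz2 into (name, version, build)."""
    suffix = ".tar.bz2"
    if not filename.endswith(suffix):
        raise ValueError(f"not a .tar.bz2 package: {filename!r}")
    stem = filename[: -len(suffix)]
    # The name may itself contain dashes, so split from the right.
    name, version, build = stem.rsplit("-", 2)
    return name, version, build


print(parse_conda_filename("sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2"))
```

The last component (py38_0 here) is the build string; a build made with --python 3.6 would carry a py36 prefix instead.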
Optional: Converting a package for use on all platforms¶
Similar to the Python version, Conda by default requires you to build the
package on each platform, to make sure the uploaded files are always correctly
built. We built against linux-64 (Linux, 64-bit) in the above example.
Sometimes, however, you are very sure the package works on all platforms.
Conda can convert the package for you in this case:
conda convert --platform all ~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
Replace the final path with your package location saved above.
Optional: Uploading to Anaconda.org¶
After converting your files for use on other platforms, you may choose to upload your files to Anaconda.org. You will need to register an account (sign up) on the website first.
Make sure you have the Anaconda Client installed:
conda install anaconda-client
Then log in (sign in) to your registered Anaconda account:
anaconda login
After that, use the following command to upload packages:
anaconda upload ~/miniconda/conda-bld/linux-64/sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2
Replace the final path with your package location saved above. Packages built
against another Python version, or converted by conda convert, can also be
uploaded in the same way, to make your package support multiple Python
versions and platforms.
Distribute Private Projects¶
We have been talking about how to release your package to the public, and it is a common misconception that package publishing is only for open source.
While it is true that Python package managers are more focused on working with open source (because they are part of it), they are also used by a lot of people to publish packages in private settings, so various projects in a company can easily share code, deploy software, and manage version upgrades.
Custom Conda channels¶
In Conda, you can choose to install packages from a specific channel, which
means Conda will look at that location for the package.
Anaconda for Enterprise offers a private channel service (i.e. publish
packages that only authorised people can download), but it is also easy to
run a channel server yourself, either with HTTP(S), or even with a shared disk
that’s accessible with file://.
A Conda channel is structured like this:
channel/
    linux-64/
        ... packages
    linux-32/
        ... packages
    osx-64/
        ... packages
    win-64/
        ... packages
    win-32/
        ... packages
The root directory can be any directory on disk, and can take any name, but
here we use channel as an example. Each directory inside the root represents
a platform to support; you can omit any you do not plan to support. Package
files (like our previous sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2)
are put inside each directory to make them downloadable.
Run the following command to generate a repository listing:
conda index channel/
This will generate a repodata.json file in each platform directory, which
Conda uses to get the metadata for the packages in the channel.
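To get a feel for what the index contains, the sketch below lists the packages recorded in a repodata.json document. The embedded JSON is a hand-written, heavily trimmed stand-in (real files carry many more fields per package), and the helper name `list_packages` is hypothetical; treat the exact field layout as an assumption and check it against a file generated on your machine.

```python
import json

# Hand-written, trimmed stand-in for a generated repodata.json file.
SAMPLE_REPODATA = """
{
  "info": {"subdir": "linux-64"},
  "packages": {
    "sampleproject-YOUR-USERNAME-0.0.1-py38_0.tar.bz2": {
      "name": "sampleproject-YOUR-USERNAME",
      "version": "0.0.1",
      "build": "py38_0"
    }
  }
}
"""


def list_packages(repodata_text):
    """Return sorted (name, version, build) tuples from repodata JSON text."""
    repodata = json.loads(repodata_text)
    return sorted(
        (meta["name"], meta["version"], meta["build"])
        for meta in repodata.get("packages", {}).values()
    )


for entry in list_packages(SAMPLE_REPODATA):
    print(*entry)
```

With a real channel, you would pass in the text read from channel/linux-64/repodata.json instead of the sample string.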
Note
Re-run conda index each time you add or modify anything in the channel
(e.g. release a new package version), to keep the metadata up-to-date.
To install from the custom channel, use the following syntax:
conda install --channel=file:///path/to/channel/ sampleproject-YOUR-USERNAME
You can also set up the channel in your .condarc to avoid specifying the
channel every time.
“Find-links” pages¶
TODO…
devpi¶
TODO…
Further Guides¶
TODO…