Write Python Project as a Package

A package is a collection of Python modules that can be imported into Python scripts. Examples include numpy and scipy. A package is used through the two import statements, import ... and from ... import ....
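As a quick illustration of the two import forms, here is a minimal sketch. It uses the standard-library math module so that it runs without installing anything; the same syntax applies to third-party packages such as numpy.

```python
# Form 1: import the module and access names through it.
import math

# Form 2: import selected names directly into the current namespace.
from math import pi, sqrt

area = math.pi * 2.0 ** 2   # qualified access via the module name
radius = sqrt(area / pi)    # direct access to the imported names
print(radius)
```

The first form keeps the namespace explicit, while the second is shorter when only a few names are needed.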

The advantage of a Python package is that it gives the code a well-organized structure for achieving a specific functionality (e.g., implementing a project). The scripts in the package should be self-contained and focused on the project requirements. They are not general scripts for arbitrary purposes.

A project should have a clear goal and requirements. For example, an ML project aims to implement a new algorithm, while a web development project aims to design a tool to facilitate web page design. Since a project is rarely a single-script task, it is recommended to organize the code into a package structure from the start, which facilitates future distribution and development.

It is good practice to write the necessary functions and classes in the package and then import and combine them to achieve different purposes. Splitting functionality into the smallest implementable functions is an art, and the right granularity depends on the project.
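A minimal sketch of this idea follows. The module name my_proj/stats.py and the functions mean, center, and summarize are hypothetical and only serve as an illustration; everything is shown in one file so that it runs as-is.

```python
# Hypothetical contents of my_proj/stats.py: small, single-purpose functions.
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def center(values):
    """Shift the values so that their mean becomes zero."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical script code that combines the building blocks. In a real
# project it would start with `from my_proj.stats import mean, center`.
def summarize(values):
    centered = center(values)
    return {
        "mean": mean(values),
        "max_deviation": max(abs(v) for v in centered),
    }


print(summarize([1.0, 2.0, 3.0, 6.0]))
```

Because mean and center do one thing each, other scripts can reuse them for different purposes without copying code.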

We refer to the post Coding Practice for Python Projects for a reference structure for Python projects.

src layout vs flat layout. Package Discovery and Namespace Packages

Packaging Configuration

There are three files related to Python project packaging: setup.py, setup.cfg, and pyproject.toml. These files are used to package Python code into distributable libraries. Packaging is useful when we want to install our code in editable mode during development, distribute the final product, and so on. If we only write simple scripts for tests, packaging is unnecessary.

When we want to distribute Python code, we first need to package the code into an agreed format and then ship it. The distributed package is also called a library. For Python packages, there are two types of distributions: source and binary; see Overview of Python Packaging. The wheel is Python's package format for shipping libraries, including those with compiled artifacts. Mature Python libraries can be uploaded to PyPI to be found and used by all Python users.
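The two distribution types can usually be told apart by file name. The sketch below is a simplified illustration, not a full parser: it assumes that a wheel name ends with the three tags python-abi-platform before the .whl extension, and that source distributions are .tar.gz or .zip archives. The function describe_distribution and the example file names are hypothetical.

```python
def describe_distribution(filename):
    """Classify a distribution file by its name (simplified sketch)."""
    if filename.endswith(".whl"):
        # Wheel names end with three tags: python tag, ABI tag, platform tag.
        stem = filename[: -len(".whl")]
        python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
        # The platform tag "any" marks a wheel without compiled artifacts.
        pure = platform_tag == "any"
        return {
            "type": "binary",
            "pure_python": pure,
            "python": python_tag,
            "abi": abi_tag,
            "platform": platform_tag,
        }
    if filename.endswith((".tar.gz", ".zip")):
        return {"type": "source"}
    raise ValueError(f"not a recognized distribution file: {filename}")


# Illustrative file names only.
print(describe_distribution("my_proj-0.0.1.tar.gz"))
print(describe_distribution("my_proj-0.0.1-py3-none-any.whl"))
```

A production tool would rely on a dedicated parser rather than string splitting, for example to handle optional build tags.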

A Brief History of Python Build Tools

This post is helpful. Extra: A History of Python Build Tools

As we see, especially for binary libraries, we need tools to compile (build) the code and then package it. People have developed many tools for this purpose, and such a tool is called a Python build tool. At the very beginning, in Python 2.2, distutils was a module of Python’s standard library that allowed users to install and distribute their own packages. It was later superseded by setuptools and deprecated by PEP 632.

To build a Python project with setuptools, we generally call its setup() function. We create a setup.py script in the project that calls functions from setuptools, and we run the script to build the project. At this stage, people had to write a Python script to build a project. To change a build parameter, we had to read and understand the setup.py script and edit it, which is not considered good style because the configuration is mixed with too much logic. Therefore, to make the configuration clearer, people extracted the settings (more specifically, the arguments of the setup() function) from setup.py into a new configuration file, setup.cfg. It is then sufficient to change build options in the configuration file, and there is no need to write complex code in setup.py. See What’s the difference between setup.py and setup.cfg in python projects.

However, setuptools is not in Python’s standard library. This means that to package a Python project, we first need to install the setuptools package and then use it to process the project's required packages. For example, suppose I create a Python package foo that uses numpy. To build the package, I first install setuptools and tell it that my required package is numpy. In fact, I need both numpy and setuptools to build my project. In the era of distutils, this was not a problem for Python developers, as distutils was shipped as part of Python’s standard library. Therefore, can we use a configuration file that lists setuptools itself as a required package? Besides, there are other Python build tools, for example, flit, hatch, pdm, poetry, trampolim, and whey. Can we also use a configuration file that specifies which build tool to use?

This consideration is reflected in PEP 517 and PEP 518 (2015-2016), which introduced a standardized configuration file, pyproject.toml, to specify the build configuration. Since the majority of Python projects were built with setuptools, at first two configuration files, pyproject.toml and setup.cfg, were used together: the first declares that setuptools is the build tool, and the second specifies the setup options.

Moving on to 2020, PEP 621 standardized how to declare project metadata (build options) in pyproject.toml. In this way, there is no need for setup.cfg, since everything can be written into a single pyproject.toml. With PEP 660, the Python community standardized a way to use wheel files to create editable installs, and therefore setup.py is no longer required either. As a result, a current Python project only needs to include one pyproject.toml; setup.py and setup.cfg are no longer needed for build configuration. PyPA also recommends that modern Python projects use pyproject.toml for build configuration.

Difference Between Three Files

From the history, we know that

  • setup.py is a Python script for building a Python project using utilities from the package setuptools.
  • setup.cfg is a straightforward configuration file for the setup() function in the setup.py. It is created to reduce the complex logic needed to set build configurations. People can modify configurations directly in this file.
  • pyproject.toml is a new configuration file that unifies the build backend selection and the build configuration. It is recommended to use pyproject.toml for build configuration.

This post is helpful. Understanding setup.py, setup.cfg and pyproject.toml in Python

Usage

Use setup.py Only

from setuptools import setup, find_packages
setup(
    name='my_proj',
    version='0.0.1',
    description='pip install test with setup.py only',
    packages=find_packages(include=['my_proj', 'my_proj.*']),
    install_requires=[
        'numpy>=1.26.0',
        'scipy>=1.13.0'
    ],
    extras_require={
        'interactive': ['matplotlib>=3.6.0',],
    }
)
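The include argument above takes shell-style wildcard patterns that are matched against dotted package names. The sketch below mimics that selection with the standard-library fnmatch module; it is an illustration under that assumption and not the setuptools implementation. The helper select_packages and the candidate names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_packages(candidates, include):
    """Keep the candidates that match at least one include pattern."""
    return [
        name
        for name in candidates
        if any(fnmatchcase(name, pattern) for pattern in include)
    ]


# Hypothetical package names discovered in a project directory.
candidates = ["my_proj", "my_proj.core", "my_proj.core.io", "tests", "docs"]

# 'my_proj' matches the top-level package, 'my_proj.*' its subpackages.
selected = select_packages(candidates, ["my_proj", "my_proj.*"])
print(selected)
```

Both patterns are needed: 'my_proj.*' alone would not match the top-level package, and 'my_proj' alone would skip its subpackages.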

Use setup.py and setup.cfg

# setup.py
from setuptools import setup

setup()

# setup.cfg
[metadata]
name = my_proj
version = 0.0.1
description = pip install test with setup.cfg
    
[options]
packages = find:
install_requires =
    numpy >= 1.26.0
    scipy >= 1.13.0

[options.extras_require]
interactive = matplotlib>=3.6.0

[options.packages.find]
include = my_proj, my_proj.*

The full specification of setup.cfg can be found in Configuring setuptools using setup.cfg files.

Note: In this approach, we cannot use setup.cfg alone for building. A setup.py file containing a setup() function call is still required even if the configuration resides in setup.cfg, so the two files need to be used together. See Configuring setuptools using setup.cfg files.
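Because setup.cfg is a plain INI-style file, its options can be inspected without running any build logic. The following minimal sketch reads a shortened copy of the file above with the standard-library configparser module. It only illustrates the declarative format; setuptools applies its own parsing rules on top, which are not reproduced here.

```python
import configparser

# A shortened copy of the setup.cfg shown above.
SETUP_CFG = """\
[metadata]
name = my_proj
version = 0.0.1

[options]
packages = find:
install_requires =
    numpy >= 1.26.0
    scipy >= 1.13.0
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

name = parser["metadata"]["name"]
# The multi-line value holds one requirement per indented line.
requirements = [
    line.strip()
    for line in parser["options"]["install_requires"].splitlines()
    if line.strip()
]
print(name, requirements)
```

Changing a build option therefore means editing a key-value pair rather than reading through Python code.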

Use pyproject.toml and setup.cfg

The setup.cfg remains the same as in the previous approach, and the pyproject.toml declares setuptools as the build backend.

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

Use pyproject.toml Only

[build-system]
requires = ["setuptools >= 61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my_proj"
version = "0.0.1"
description = "pip install test with pyproject.toml only."
requires-python = ">=3.9"
dependencies = [
    "numpy>=1.26.0", 
    "scipy>=1.13.0",
]

[project.optional-dependencies]
interactive = ["matplotlib>=3.6.0"]

[tool.setuptools.packages.find]
where = ["."]
include = ["my_proj", "my_proj.*"]
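To see how a tool can read this single file, the sketch below parses a shortened copy of the pyproject.toml above. It assumes Python 3.11 or later, where tomllib is part of the standard library; the fallback to the third-party tomli package for older interpreters is an assumption of this sketch.

```python
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumed third-party backport for older Pythons

# A shortened copy of the pyproject.toml shown above.
PYPROJECT = """
[build-system]
requires = ["setuptools >= 61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "my_proj"
version = "0.0.1"
dependencies = ["numpy>=1.26.0", "scipy>=1.13.0"]

[project.optional-dependencies]
interactive = ["matplotlib>=3.6.0"]
"""

config = tomllib.loads(PYPROJECT)

# [build-system] says which tool builds the project and what it needs.
backend = config["build-system"]["build-backend"]
build_requires = config["build-system"]["requires"]

# [project] carries the metadata that used to live in setup.py or setup.cfg.
dependencies = config["project"]["dependencies"]
extras = config["project"]["optional-dependencies"]
print(backend, build_requires, dependencies, extras)
```

The build-system table answers the question of which build tool to use, and the project table replaces the arguments of setup().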