Structuring Your Python Project: A Best Practice Guide

A well-structured project is the cornerstone of maintainable, scalable, and collaborative software. This document aims to provide a modern best-practice guide on how to structure a typical Python project.

1. Why is Project Structure Important?

Good project structure is about more than just aesthetics. When a potential contributor or user lands on your repository, a clear structure is the first step for them to understand your project. More importantly, in the long run, a logically organized structure enables:

  • Reduced Cognitive Load: Lets team members quickly locate the code they need to read or modify.
  • Simplified Dependencies and Imports: Avoids complex relative import and path issues.
  • Easier Automation: Makes automated processes like testing, building, and deployment easier to configure.

2. The "Gold Standard" Project Structure Example

The following is a widely recognized project structure suitable for most small to medium-sized Python applications or libraries.

my_project/
├── .gitignore          # List of files for Git to ignore
├── docs/               # Documentation directory
│   ├── conf.py
│   └── index.rst
├── src/                # Source code directory (src-layout)
│   └── my_package/     # Your Python package
│       ├── __init__.py
│       ├── module1.py
│       └── module2.py
├── tests/              # Test directory
│   ├── test_module1.py
│   └── test_module2.py
├── LICENSE             # Project license
├── Makefile            # (Optional) Task runner
├── pyproject.toml      # The core configuration file for modern Python projects
└── README.md           # Project description

3. Detailed Breakdown of Each Part

README.md

This is the front page of your project. It should clearly explain:

  • What the project does.
  • How to install and configure it.
  • A quick-start usage example.
  • How to contribute to the project.

LICENSE

A legal document that defines how others can use, modify, and distribute your code. If you're unsure which license to use, visit choosealicense.com. Without a license, default copyright rules apply, so others have no clear permission to use, modify, or redistribute your code.

.gitignore

Tells Git which files or directories should not be included in version control. A typical Python .gitignore file would include:

  • Virtual environment directories (.venv/, env/)
  • Python cache files (__pycache__/, *.pyc)
  • IDE and OS-generated files (.idea/, .vscode/, .DS_Store)
  • Build artifacts (build/, dist/, *.egg-info)

pyproject.toml

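This is the core of a modern Python project. PEP 518 introduced the file and its [build-system] table, and PEP 621 standardized project metadata in the [project] table. Together they let a single file hold the build configuration and metadata that used to be spread across setup.py and setup.cfg. A requirements.txt can still be useful for pinning an exact environment, but it no longer needs to be the place where a package declares its dependencies.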

It should contain:

  • Project Metadata: The [project] table, including name, version, description, authors, license, etc.
  • Project Dependencies: The dependencies key inside the [project] table, an array listing the libraries required for the project to run.
  • Development Dependencies: Usually defined in a group named dev or test under [project.optional-dependencies].
  • Build System Information: The [build-system] table, specifying the tools required to build the project (e.g., poetry-core or setuptools).
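The four parts listed above can be seen together in a minimal sketch. The TOML below is embedded in a Python string and parsed with tomllib so the structure can be checked. The package name, version, and the requests, pytest, and sphinx dependencies are placeholders, and setuptools is only one possible backend.

```python
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:  # older interpreters
    import tomli as tomllib  # third-party backport, assumed to be installed

# Hypothetical pyproject.toml content; every name and version is a placeholder.
PYPROJECT = """
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "my-package"
version = "0.1.0"
description = "Example package"
requires-python = ">=3.9"
dependencies = ["requests>=2.28"]

[project.optional-dependencies]
dev = ["pytest", "sphinx"]
"""

config = tomllib.loads(PYPROJECT)

# Runtime dependencies are a key of [project], not a table of their own.
runtime_deps = config["project"]["dependencies"]
# Development dependencies live in a named group under optional-dependencies.
dev_deps = config["project"]["optional-dependencies"]["dev"]
backend = config["build-system"]["build-backend"]

print(runtime_deps, dev_deps, backend)
```

With this layout, pip install -e ".[dev]" installs the package together with the dev group.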

src/ Directory Layout (Src Layout)

This is a key practice in modern Python project structuring: placing your main source code package inside a src directory.

Why use a src layout?

  1. Avoids Accidental Imports: In a flat layout, where the package sits in the repository root, the current directory is usually on the import path when you run Python from that root, so import my_package succeeds even if the package was never installed. Packaging mistakes, such as a module missing from the built distribution, then stay hidden until someone installs the package with pip and hits an ImportError. With a src layout the package is not importable from the repository root, so you have to install the project, typically in editable mode (pip install -e .), and your tests run against an installed package much as a user's code would.
  2. Clear Separation of Concerns: It clearly separates your source code from other parts of the project like docs, tests, and configuration files.
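The accidental-import problem can be illustrated by asking the import system what it can find from a repository root. This is a small sketch under stated assumptions: the flat_project and src_project directories are hypothetical and created in a temporary directory, and PathFinder.find_spec is used to search only the given directory rather than the real sys.path.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_package(parent: Path, name: str) -> None:
    """Create an empty regular package called `name` inside `parent`."""
    package_dir = parent / name
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")


def importable_from(directory: Path, name: str) -> bool:
    """Return True if `name` can be found when only `directory` is searched."""
    return PathFinder.find_spec(name, [str(directory)]) is not None


with tempfile.TemporaryDirectory() as tmp:
    flat_root = Path(tmp) / "flat_project"  # my_package/ sits in the root
    src_root = Path(tmp) / "src_project"    # my_package/ sits under src/
    make_package(flat_root, "my_package")
    make_package(src_root / "src", "my_package")

    # Flat layout: visible from the repository root without any installation.
    flat_visible = importable_from(flat_root, "my_package")
    # Src layout: hidden from the repository root until the project is installed.
    src_visible = importable_from(src_root, "my_package")

print(flat_visible, src_visible)
```

In the src case the import only starts working once the project is installed, which is the behavior the layout is meant to enforce.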

tests/ Directory

This is where all your test code should live.

  • Separated from Source: Keeping tests in a top-level tests directory, rather than inside your package, makes it easy to keep them out of the wheel that users install.
  • Running Tests: You can use a tool like pytest to automatically discover and run all tests within the tests directory.
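As a rough illustration of what pytest discovers, a file such as tests/test_module1.py could look like the sketch below. The normalize_name function is hypothetical and defined inline only to keep the example self-contained. In a real project it would be imported from the installed package (for example, from my_package.module1).

```python
# tests/test_module1.py (illustrative sketch)


def normalize_name(name: str) -> str:
    """Stand-in for a function that would normally live in my_package/module1.py."""
    return name.strip().lower().replace(" ", "_")


# pytest collects functions whose names start with "test_" in files named test_*.py.
def test_normalize_name_lowercases_and_joins_words():
    assert normalize_name("  My Project ") == "my_project"


def test_normalize_name_keeps_clean_input():
    assert normalize_name("already_clean") == "already_clean"
```

Running pytest from the project root would collect and run both test functions.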

docs/ Directory

This is for your project's detailed documentation. It's common to use Sphinx to generate HTML documentation, which can automatically extract API references from your code's docstrings.
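Sphinx reads its settings from docs/conf.py, which is itself a Python file. A minimal sketch might look like the following. The project and author values are placeholders, and it assumes the package has been installed in the environment that builds the docs so that autodoc can import it.

```python
# docs/conf.py (minimal sketch; values are placeholders)

project = "my_project"
author = "Your Name"

extensions = [
    "sphinx.ext.autodoc",   # generate API reference pages from docstrings
    "sphinx.ext.napoleon",  # understand Google- and NumPy-style docstrings
]

html_theme = "alabaster"        # the theme Sphinx ships with by default
exclude_patterns = ["_build"]   # do not treat build output as source files
```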

Makefile (Optional)

Although make was originally designed for C projects, it's an incredibly convenient general-purpose task runner. You can use it to define a series of shortcuts for common project commands.

A simple Makefile might look like this:

makefile
.PHONY: install test docs clean

install:
	# Install development dependencies
	pip install -e ".[dev]"

test:
	# Run tests
	pytest

docs:
	# Build documentation
	sphinx-build docs/ docs/_build

clean:
	# Clean up build artifacts and bytecode caches (including nested ones)
	rm -rf build/ dist/ .eggs/
	find . -type d -name __pycache__ -prune -exec rm -rf {} +

This way, you only need to run simple commands like make install, make test, etc., without having to remember the full command lines.


4. Advanced Structure: Managing Multiple Packages in a Single Repository

As projects grow very large, you might encounter a more complex scenario: maintaining and distributing multiple installable packages from a single repository (a "Monorepo"), while they share the same src directory. For example, acme.core and acme.client.

For this situation, Python provides the Namespace Packages mechanism.

Core Concept: Namespace Packages

  • Difference from Regular Packages:

    • A Regular Package must contain an __init__.py file in its directory.
    • A Namespace Package, in contrast, is a directory that must not contain an __init__.py file at its top level.
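  • How it Works: Since Python 3.3 (PEP 420), when the import system searches for a package name and finds only matching directories that lack an __init__.py (and no regular package or module of that name), it treats those directories as portions of a single namespace package. This allows multiple physically separate directories to contribute to the same logical package name.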
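The merging behavior described above can be observed directly. The sketch below creates two hypothetical install locations in a temporary directory (dist_core and dist_client), each providing one sub-package of acme with no acme/__init__.py, and then imports both. It assumes no other package named acme is already installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_subpackage(root: Path, sub: str, body: str) -> None:
    """Create root/acme/<sub>/__init__.py without creating acme/__init__.py."""
    package_dir = root / "acme" / sub
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(body)


with tempfile.TemporaryDirectory() as tmp:
    core_root = Path(tmp) / "dist_core"      # hypothetical location of acme.core
    client_root = Path(tmp) / "dist_client"  # hypothetical location of acme.client
    write_subpackage(core_root, "core", "NAME = 'core'\n")
    write_subpackage(client_root, "client", "NAME = 'client'\n")

    added_paths = [str(core_root), str(client_root)]
    sys.path[:0] = added_paths
    importlib.invalidate_caches()
    try:
        acme = importlib.import_module("acme")
        core = importlib.import_module("acme.core")
        client = importlib.import_module("acme.client")

        # A namespace package has one __path__ entry per contributing directory.
        portions = list(acme.__path__)
        names = (core.NAME, client.NAME)
        # It has no __init__.py of its own, so __file__ is missing or None.
        has_init_file = getattr(acme, "__file__", None) is not None
    finally:
        # Undo the changes to interpreter state made for the demonstration.
        for path in added_paths:
            sys.path.remove(path)
        for module_name in [m for m in sys.modules if m.split(".")[0] == "acme"]:
            del sys.modules[module_name]

print(len(portions), names, has_init_file)
```

Both sub-packages import under the same acme name even though they live in different directories.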

Structure Example

Let's assume we have an acme namespace containing two independent sub-packages, core and client.

my_monorepo/
├── src/
│   └── acme/             # Top-level of the namespace package (NO __init__.py)
│       ├── core/         # acme.core sub-package (a regular package)
│       │   ├── __init__.py
│       │   └── logic.py
│       └── client/       # acme.client sub-package (a regular package)
│           ├── __init__.py
│           └── app.py
├── pyproject.toml        # Config file for building acme.core
└── pyproject.client.toml # Illustrative only: build tools read just pyproject.toml (see the per-package layout below)

The Build Challenge and Solution

The Challenge: The standard specifies that one pyproject.toml file defines one project. So how do we build two different distribution packages from the same source tree?

The Solution: The prevailing practice is to create separate build configurations for each distributable sub-package. While pyproject.toml itself doesn't support defining multiple projects in one file, we can leverage the flexibility of build backends to achieve this.

Method: Use a Separate pyproject.toml for Each Package

This is a clean and explicit approach. You can create a dedicated directory for each sub-package to hold its build configuration.

my_monorepo/
├── .git/
├── src/
│   └── acme/
│       ├── core/
│       │   └── ...
│       └── client/
│           └── ...
├── packages/             # Create a directory for each distributable package
│   ├── acme-core/
│   │   └── pyproject.toml
│   └── acme-client/
│       └── pyproject.toml
└── README.md

Then, in each sub-package's pyproject.toml, you need to tell the build backend (e.g., hatchling, setuptools) where to find the source code.

packages/acme-core/pyproject.toml Example (with hatchling):

toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "acme.core"
version = "0.1.0"
# ... other metadata ...

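[tool.hatch.build.targets.wheel.force-include]
# Map the shared source directory (which lives outside this project directory)
# to its path inside the wheel, so the code stays importable as acme.core.
# The "packages" option would collapse the path to its final component ("core").
"../../src/acme/core" = "acme/core"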

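This way, when you navigate into the packages/acme-core/ directory and build a wheel (e.g., python -m build --wheel), only the code under src/acme/core is packaged into the acme.core distribution. Building the wheel directly matters here: by default, python -m build first creates an sdist and then builds the wheel from it, and an sdist created in this directory would likely not contain sources that live outside it. The configuration for acme.client would be similar.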
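Building every package by hand gets tedious as the list grows. The helper below is one possible sketch of automating it: it finds each packages/*/pyproject.toml and prepares a python -m build --wheel command for that directory. The wheel_build_commands name is hypothetical, and the demonstration uses empty placeholder files in a temporary directory rather than a real repository.

```python
import sys
import tempfile
from pathlib import Path


def wheel_build_commands(repo_root: Path) -> list[list[str]]:
    """Return one `python -m build --wheel <dir>` command per package directory."""
    commands = []
    for config in sorted((repo_root / "packages").glob("*/pyproject.toml")):
        # The `build` frontend accepts the project directory as a positional argument.
        commands.append([sys.executable, "-m", "build", "--wheel", str(config.parent)])
    return commands


# Demonstration against a throwaway directory tree that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("acme-core", "acme-client"):
        package_dir = root / "packages" / name
        package_dir.mkdir(parents=True)
        (package_dir / "pyproject.toml").write_text("")  # placeholder config
    commands = wheel_build_commands(root)
    built_dirs = [Path(command[-1]).name for command in commands]

print(built_dirs)
```

In a real repository, each command could be passed to subprocess.run(command, check=True), assuming the build package is installed.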

Clarification on a "Unified Export" File

Python does not have a direct equivalent to JavaScript's index.js for a "unified export" file.

  • The __init__.py file serves as the entry point and facade for a single package, not for the entire src directory.
  • In the multi-package scenario above, src/acme/core/__init__.py defines the public API for the acme.core package, while src/acme/client/__init__.py defines the public API for the acme.client package.
  • Because the acme namespace has no __init__.py of its own, there is no single file that can export members from both acme.core and acme.client. The two packages are built and versioned independently and only come together in the shared acme namespace once they are installed in a user's environment.
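The facade role of a single package's __init__.py can be sketched as follows. The facade_demo package is hypothetical and stands in for something like acme.core. It is written to a temporary directory so the example runs on its own, and the function names are placeholders.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Implementation module: one public function and one private helper.
LOGIC_SOURCE = '''
def run():
    return _helper() + 1


def _helper():
    return 41
'''

# The package's __init__.py re-exports only what should be public.
INIT_SOURCE = '''
from .logic import run

__all__ = ["run"]
'''

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "facade_demo"
    package_dir.mkdir()
    (package_dir / "logic.py").write_text(LOGIC_SOURCE)
    (package_dir / "__init__.py").write_text(INIT_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        facade_demo = importlib.import_module("facade_demo")
        # Callers use the package-level name and never import facade_demo.logic.
        result = facade_demo.run()
        public_names = list(facade_demo.__all__)
        helper_exposed = hasattr(facade_demo, "_helper")
    finally:
        # Undo the changes to interpreter state made for the demonstration.
        sys.path.remove(tmp)
        for module_name in [m for m in sys.modules if m.split(".")[0] == "facade_demo"]:
            del sys.modules[module_name]

print(result, public_names, helper_exposed)
```

Each installable package gets its own facade of this kind, while the namespace level above it stays empty.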