A module is a file. A package is a folder of related modules — a way to group code as your project grows past a handful of files.

What makes a folder a package?

A folder becomes a package when it contains an __init__.py file:

my_project/
├── main.py
└── data_tools/
    ├── __init__.py
    ├── load.py
    └── clean.py

data_tools/ is now a package. Inside it, load.py and clean.py are modules.

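The __init__.py can be empty. Its presence alone tells Python “this is a package”. (Since Python 3.3, a folder without one can still be imported as a “namespace package”, but that is a special case; for ordinary projects, include the file.)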
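To see the marker file at work, here is a minimal sketch that builds a throwaway package in a temporary folder and imports it. The package name marker_demo and the stub load_csv body are placeholders made up for this illustration; they stand in for data_tools and its real code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project folder containing one package.
# "marker_demo" is a hypothetical name standing in for data_tools.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "marker_demo"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")  # empty marker file
(package_dir / "load.py").write_text(
    "def load_csv(path):\n"
    "    return f'would load {path}'\n"
)

# Put the project folder on the import path, which is what happens
# automatically when you run a script that lives in that folder.
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

import marker_demo
import marker_demo.load

print(Path(marker_demo.__file__).name)      # __init__.py
print(marker_demo.load.load_csv("file.csv"))  # would load file.csv
```

The package object itself is backed by the __init__.py file, which is why even an empty one matters.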

Importing from a package

from data_tools.load import load_csv
from data_tools.clean import remove_nulls

Or with the dotted form:

import data_tools.load
import data_tools.clean

data_tools.load.load_csv("file.csv")

Or with as:

import data_tools.load as loader
loader.load_csv("file.csv")

The . is how you reach into a package’s modules. data_tools.clean reads as “the clean module inside the data_tools package”.
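The three import styles differ only in the name you end up typing. One way to check this is to try them on a package that ships with Python. This sketch uses the standard library's json package and its decoder module as stand-ins for data_tools and load.

```python
# json is a standard-library package; decoder is one of its modules.
from json.decoder import JSONDecoder   # from-import: bind the name directly
import json.decoder                    # dotted form: spell out the full path
import json.decoder as dec             # alias: a shorter handle on the same module

# All three spellings reach the very same class object.
same_object = JSONDecoder is json.decoder.JSONDecoder is dec.JSONDecoder
print(same_object)  # True

# Each spelling can be used interchangeably.
print(dec.JSONDecoder().decode("[1, 2, 3]"))  # [1, 2, 3]
```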

Nested packages

Packages can contain other packages. Just put an __init__.py at every level:

my_project/
└── data_tools/
    ├── __init__.py
    ├── load.py
    └── transforms/
        ├── __init__.py
        ├── normalise.py
        └── encode.py

from data_tools.transforms.normalise import min_max_scale

This is exactly how big libraries (NumPy, PyTorch) are structured — packages inside packages, each with focused submodules.
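You can inspect this kind of nesting in a package you already have installed. In this sketch, xml is a standard-library package, xml.etree is a package inside it, and xml.etree.ElementTree is a plain module. The is_package helper is a name invented for this example; it relies on the fact that packages carry a __path__ attribute and plain modules do not.

```python
import xml
import xml.etree
import xml.etree.ElementTree


def is_package(module):
    """Return True if the imported module is a package (it has __path__)."""
    return hasattr(module, "__path__")


print(is_package(xml))                    # True: top-level package
print(is_package(xml.etree))              # True: package nested inside xml
print(is_package(xml.etree.ElementTree))  # False: a single .py module
```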

What goes in __init__.py?

For small projects: nothing — keep it empty. It’s just a marker.

For larger projects, __init__.py is often used to re-export the package’s public API:

from data_tools.load import load_csv
from data_tools.clean import remove_nulls

__all__ = ["load_csv", "remove_nulls"]

Now callers can write from data_tools import load_csv instead of from data_tools.load import load_csv. The __all__ list documents what’s “public”, and it also decides which names from data_tools import * brings in.

This is convenience, not necessity. Start with empty __init__.py files and only add re-exports when the project is big enough that callers benefit.
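The effect of a re-exporting __init__.py can be sketched with a throwaway package built in a temporary folder. The package name api_demo and the stub function bodies are assumptions made for this example; they stand in for data_tools and its real code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package; "api_demo" is a hypothetical stand-in for data_tools,
# and the function bodies are stubs.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "api_demo"
package_dir.mkdir()
(package_dir / "load.py").write_text(
    "def load_csv(path):\n"
    "    return [1, None, 2]\n"
)
(package_dir / "clean.py").write_text(
    "def remove_nulls(rows):\n"
    "    return [row for row in rows if row is not None]\n"
)
# The __init__.py re-exports the two public functions.
(package_dir / "__init__.py").write_text(
    "from api_demo.load import load_csv\n"
    "from api_demo.clean import remove_nulls\n"
    "\n"
    "__all__ = ['load_csv', 'remove_nulls']\n"
)

sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

# Callers import from the package itself, not from its submodules.
from api_demo import load_csv, remove_nulls

rows = remove_nulls(load_csv("file.csv"))
print(rows)  # [1, 2]

# A star-import brings in exactly the names listed in __all__.
namespace = {}
exec("from api_demo import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))
print(star_names)  # ['load_csv', 'remove_nulls']
```

Without the __all__ line, the star-import would also pick up the load and clean submodules, which is rarely what callers want.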

Relative imports — inside the same package

Within a package, you can import sibling modules with a leading dot:

from .load import load_csv     # same as: from data_tools.load import load_csv

. means “the current package”. .. means “the parent package”, one level up.

When to use relative vs absolute imports:

  • Absolute (from data_tools.load import ...) — clearer, works from any context. Prefer this in most code.
  • Relative (from .load import ...) — shorter, useful when you might rename the package later.

Both work. Pick one style per project.
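Here is a small sketch of both dot forms in action, again using a throwaway package written to a temporary folder. The names rel_demo, clean, and normalise, along with the stub bodies, are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package tree; "rel_demo" and the stub functions are
# hypothetical names used only for this illustration.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "rel_demo"
subpackage_dir = package_dir / "transforms"
subpackage_dir.mkdir(parents=True)

(package_dir / "__init__.py").write_text("")
(subpackage_dir / "__init__.py").write_text("")
(package_dir / "load.py").write_text(
    "def load_csv(path):\n"
    "    return f'rows from {path}'\n"
)
# One dot: a sibling module in the same package.
(package_dir / "clean.py").write_text(
    "from .load import load_csv\n"
    "\n"
    "def clean(path):\n"
    "    return 'cleaned ' + load_csv(path)\n"
)
# Two dots: step up from rel_demo.transforms to rel_demo.
(subpackage_dir / "normalise.py").write_text(
    "from ..load import load_csv\n"
    "\n"
    "def normalise(path):\n"
    "    return 'normalised ' + load_csv(path)\n"
)

sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

from rel_demo.clean import clean
from rel_demo.transforms.normalise import normalise

print(clean("file.csv"))      # cleaned rows from file.csv
print(normalise("file.csv"))  # normalised rows from file.csv
```

Neither clean.py nor normalise.py mentions the package name, so renaming rel_demo would not require touching them.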

Running a script inside a package

If your script is part of a package and uses relative imports, you can’t run it directly:

uv run python data_tools/load.py
# ImportError: attempted relative import with no known parent package

Instead, run it as a module from the project root (the folder that contains data_tools/):

uv run python -m data_tools.load

The -m flag tells Python to find load by its dotted name rather than by file path, so Python knows the module belongs to the data_tools package. Relative imports then work.
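The difference between the two invocations can be reproduced in a short, self-contained sketch. It writes a tiny data_tools package to a temporary folder and launches Python twice from that folder. It calls the current interpreter through sys.executable rather than uv run, and the stub remove_nulls function and the run helper are assumptions made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Throwaway project: data_tools/load.py uses a relative import.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "data_tools"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "clean.py").write_text(
    "def remove_nulls(rows):\n"
    "    return [row for row in rows if row is not None]\n"
)
(package_dir / "load.py").write_text(
    "from .clean import remove_nulls\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    print(remove_nulls([1, None, 2]))\n"
)


def run(args):
    """Run the current Python interpreter from the project root."""
    return subprocess.run(
        [sys.executable, *args],
        cwd=project_root,
        capture_output=True,
        text=True,
    )


# Running the file by path: Python does not know it belongs to a package.
direct = run([str(Path("data_tools") / "load.py")])
print(direct.returncode != 0)                   # True: the run failed
print(direct.stderr.strip().splitlines()[-1])   # the ImportError message

# Running it as a module: the relative import resolves.
as_module = run(["-m", "data_tools.load"])
print(as_module.stdout.strip())                 # [1, 2]
```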

A realistic package shape

my_project/
├── pyproject.toml
├── README.md
└── src/
    └── my_project/
        ├── __init__.py
        ├── main.py
        ├── data/
        │   ├── __init__.py
        │   ├── load.py
        │   └── clean.py
        └── models/
            ├── __init__.py
            └── train.py

This is the src layout — popular in modern Python projects. The package source lives in src/my_project/, separate from project files like README.md and pyproject.toml.

For learning, a flat layout is fine. The src layout pays off when you start publishing your package or running automated tests.

__pycache__

After importing a package, you’ll see __pycache__/ folders appear. These hold compiled bytecode for faster re-imports. We met them in Section 1 — they’re safe to ignore and should be added to .gitignore.

uv init adds __pycache__/ to your .gitignore for you.
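If you are curious where a given module's bytecode ends up, the standard library can tell you. This is a small sketch using importlib.util.cache_from_source; the path data_tools/load.py is just the example module from earlier and does not need to exist for the call to work.

```python
import importlib.util

# Ask Python where the cached bytecode for a source file would be stored.
# This only computes the path; it does not read or create any file.
cached_path = importlib.util.cache_from_source("data_tools/load.py")

# The exact file name depends on your Python version and platform,
# but it sits in a __pycache__ folder next to the source file.
print(cached_path)
```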

What’s next

A short tour of the standard library — the modules that come with Python and what each is good for.
