A file path is the address of a file or folder on disk: data/users.csv, /home/me/code, C:\Users\You\file.txt. Working with paths used to mean error-prone string manipulation. pathlib, part of the standard library since Python 3.4, is the modern, clean way to do it.

We’ve seen Path already in earlier lessons. This one covers what it can do.

Why not just use strings?

Strings as paths look fine — until they don’t:

# what's wrong here?
path: str = "data/" + name + ".txt"     # hard-codes "/"; mixes badly with Windows-style "\" paths
path: str = directory + "/" + filename  # forgot trailing slash? doubled slash?

pathlib solves all of this:

from pathlib import Path

path: Path = Path("data") / f"{name}.txt"     # works everywhere

The / operator joins parts of a path correctly for whatever OS the program is running on.
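To see the separator handling without switching machines, here is a small sketch using the "pure" path classes, which never touch the disk and so can imitate either OS. The value of name is a made-up example.

```python
from pathlib import PurePosixPath, PureWindowsPath

name = "report"  # hypothetical file stem, for illustration only

# Pure paths only manipulate text, so both flavours work on any OS.
posix_path = PurePosixPath("data") / f"{name}.txt"
windows_path = PureWindowsPath("data") / f"{name}.txt"

print(posix_path)    # data/report.txt
print(windows_path)  # data\report.txt
```

Plain Path picks the right flavour for the machine the program is running on, which is why the same line of code works everywhere.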

Creating a Path

from pathlib import Path

p: Path = Path("data/users.csv")
home: Path = Path.home()                    # your home directory
cwd: Path = Path.cwd()                       # current working directory

A Path is an object that represents a path. It doesn’t matter whether the file exists — Path itself is just a value.
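A quick sketch of what "just a value" means in practice. The folder and file names are invented for the demo and are assumed not to exist on your machine.

```python
from pathlib import Path

# Building a path is pure bookkeeping: nothing is created or checked.
ghost = Path("no_such_folder_for_demo") / "missing.txt"  # hypothetical name
print(ghost)

# Only methods like exists() actually ask the file system.
print(ghost.exists())

# Paths compare by value, however they were built.
same = Path("a/b") == Path("a") / "b"
print(same)  # True
```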

Joining paths

Use the / operator:

base: Path = Path("/home/me")
file: Path = base / "data" / "users.csv"
print(file)     # /home/me/data/users.csv

Mixing strings and Paths is fine: / works as long as at least one of the two operands is a Path. Two plain strings ("data" / "users.csv") raise a TypeError.
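The joining rules can be checked with a short sketch. It uses PurePosixPath so the output is the same on every OS; the paths themselves are only examples.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/me")

left = base / "data"                          # Path / str
right = "data" / PurePosixPath("users.csv")   # str / Path works too
joined = base.joinpath("data", "users.csv")   # method form of the same join

# An absolute right-hand side replaces everything before it.
reset = base / "/etc/hosts"

# With no Path on either side, / is just string division and fails.
try:
    "data" / "users.csv"
except TypeError as exc:
    error = type(exc).__name__

print(left, right, joined, reset, error)
```

The "absolute path wins" behaviour is worth remembering when the right-hand side comes from user input.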

Parts of a path

p: Path = Path("/home/me/data/users.csv")

print(p.name)          # users.csv          - filename
print(p.stem)          # users              - filename without extension
print(p.suffix)        # .csv               - extension
print(p.parent)        # /home/me/data      - containing folder
print(p.parts)         # ('/', 'home', 'me', 'data', 'users.csv')

Every property is a clean named attribute. Compare with manual string slicing — pathlib is much harder to get wrong.
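The same attributes have companions for building a modified path. A brief sketch, again with PurePosixPath so the results do not depend on your OS; the file names are illustrative only.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/home/me/data/users.csv")

backup = p.with_suffix(".bak")           # swap the extension
renamed = p.with_name("customers.csv")   # swap the whole filename
grandparent = p.parent.parent            # .parent can be chained

archive = PurePosixPath("dump.tar.gz")   # hypothetical multi-suffix file

print(backup)            # /home/me/data/users.bak
print(renamed)           # /home/me/data/customers.csv
print(grandparent)       # /home/me
print(archive.suffix)    # .gz
print(archive.suffixes)  # ['.tar', '.gz']
print(archive.stem)      # dump.tar
```

Paths are immutable, so with_suffix and with_name return new objects and leave p unchanged.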

Reading and writing — shortcuts

Path has convenience methods for whole-file reads and writes:

p: Path = Path("notes.txt")

# read
text: str = p.read_text(encoding="utf-8")
data: bytes = p.read_bytes()                 # for binary files

# write
p.write_text("Hello\n", encoding="utf-8")
p.write_bytes(b"binary data")

For one-shot reads and writes, these are shorter than open(...). For line-by-line streaming, you still want with open(p) as f:.
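Here is one way the two styles might sit side by side. The helper count_nonblank_lines and the demo file are invented for this sketch; it works in a temporary folder so nothing is left behind.

```python
import tempfile
from pathlib import Path


def count_nonblank_lines(path: Path) -> int:
    """Stream the file one line at a time instead of loading it whole."""
    count = 0
    with path.open(encoding="utf-8") as f:   # same as open(path, ...)
        for line in f:
            if line.strip():
                count += 1
    return count


with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp) / "notes.txt"          # hypothetical demo file
    notes.write_text("first\n\nsecond\n", encoding="utf-8")

    whole = notes.read_text(encoding="utf-8")   # one-shot read
    nonblank = count_nonblank_lines(notes)      # streaming read

print(repr(whole))  # 'first\n\nsecond\n'
print(nonblank)     # 2
```

Path.open() takes the same arguments as the built-in open(), so either spelling is fine.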

Checking existence and type

p: Path = Path("notes.txt")

print(p.exists())       # True / False
print(p.is_file())      # True if file
print(p.is_dir())       # True if directory
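These checks never raise for a missing path; they simply return False. A small sketch, where describe is a hypothetical helper and the files live in a temporary folder:

```python
import tempfile
from pathlib import Path


def describe(path: Path) -> str:
    """Classify a path without raising if it does not exist."""
    if not path.exists():
        return "missing"
    if path.is_dir():
        return "directory"
    if path.is_file():
        return "file"
    return "other"  # e.g. a socket or device node


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    notes = root / "notes.txt"
    notes.write_text("hi\n", encoding="utf-8")

    kinds = [describe(root), describe(notes), describe(root / "nope.txt")]

print(kinds)  # ['directory', 'file', 'missing']
```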

Creating and removing

folder: Path = Path("output/logs/2026")
folder.mkdir(parents=True, exist_ok=True)
  • parents=True — create any missing parent folders too.
  • exist_ok=True — don’t error if the folder already exists.

Removing:

file: Path = Path("output.txt")
file.unlink(missing_ok=True)        # delete a file; no error if it is already gone (Python 3.8+)

folder: Path = Path("temp_dir")
folder.rmdir()                       # delete an empty directory

To delete a non-empty directory, use shutil.rmtree(folder) from the shutil module.
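The difference between rmdir() and shutil.rmtree() shows up as soon as the folder has something in it. A sketch in a temporary folder, with made-up folder and file names:

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    output = Path(tmp) / "output"
    logs = output / "logs"

    logs.mkdir(parents=True, exist_ok=True)
    logs.mkdir(parents=True, exist_ok=True)   # second call is a no-op
    (logs / "run.log").write_text("done\n", encoding="utf-8")

    # rmdir() refuses to delete a folder that still has contents.
    try:
        logs.rmdir()
    except OSError:
        rmdir_failed = True
    else:
        rmdir_failed = False

    # rmtree() deletes the folder and everything below it.
    shutil.rmtree(output)
    gone = not output.exists()

print(rmdir_failed, gone)  # True True
```

rmtree() is permanent and skips the recycle bin, so it is worth double-checking the path before calling it.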

Listing directory contents

for item in Path("data").iterdir():
    print(item)

iterdir() yields all direct children (files and subfolders), in no guaranteed order. Wrap it in sorted(...) if the order matters.

To find specific files, use glob:

for csv_file in Path("data").glob("*.csv"):
    print(csv_file)

# recursive — every csv anywhere below this folder
for csv_file in Path("data").rglob("*.csv"):
    print(csv_file)

The * is a wildcard that matches any run of characters within a single name. rglob("*.csv") is shorthand for glob("**/*.csv"), where ** means “any folder depth”.
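A sketch comparing the three spellings on a tiny made-up folder tree. Results are sorted because glob does not promise any particular order.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    (data / "archive").mkdir(parents=True)

    # Hypothetical demo files: two CSVs at different depths, one text file.
    for rel in ("a.csv", "b.txt", "archive/c.csv"):
        (data / rel).write_text("", encoding="utf-8")

    top_level = sorted(p.name for p in data.glob("*.csv"))
    recursive = sorted(p.name for p in data.rglob("*.csv"))
    spelled_out = sorted(p.name for p in data.glob("**/*.csv"))

print(top_level)    # ['a.csv']
print(recursive)    # ['a.csv', 'c.csv']
print(spelled_out)  # ['a.csv', 'c.csv']
```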

Absolute and relative

A relative path is interpreted from the current working directory. An absolute path is unambiguous.

rel: Path = Path("data/users.csv")
abs_p: Path = rel.resolve()        # absolute, with symlinks and ".." resolved
print(abs_p)
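You can ask a path which kind it is with is_absolute(). A minimal sketch; the file does not need to exist, and the absolute result depends on where you run it, so it is not shown.

```python
from pathlib import Path

rel = Path("data") / "users.csv"
abs_p = rel.resolve()   # anchored at the current working directory

print(rel.is_absolute())     # False
print(abs_p.is_absolute())   # True
print(abs_p.name)            # users.csv
```

Because the result depends on the current working directory, running the same script from two different folders gives two different absolute paths.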

When you’re saving or loading data inside a script, prefer absolute paths constructed from a known anchor:

HERE: Path = Path(__file__).resolve().parent     # folder of this script
DATA: Path = HERE / "data"

with open(DATA / "users.csv", encoding="utf-8") as f:
    ...

__file__ is a module-level variable that Python sets to the path of the file being run or imported. Anchoring on it makes the script work no matter what directory you run it from. It is not defined in an interactive session (REPL).
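Since __file__ is missing in some environments, such as an interactive session, one possible pattern is a small helper with a fallback. The name script_dir and the choice to fall back to the working directory are assumptions of this sketch, not a pathlib feature.

```python
from pathlib import Path


def script_dir() -> Path:
    """Folder of the current script, or the working directory as a fallback."""
    try:
        return Path(__file__).resolve().parent
    except NameError:           # __file__ is not defined, e.g. in a REPL
        return Path.cwd()


HERE = script_dir()
DATA = HERE / "data"            # hypothetical data folder next to the script

print(DATA.is_absolute())       # True
```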

Walking a tree

To process every file in a folder tree:

for path in Path("data").rglob("*"):
    if path.is_file():
        print(path)

Swap "*" for a more specific glob pattern and you can find exactly the files you want in one line.
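As a slightly larger sketch of the same loop, here is one way to tally the files in a tree by extension. count_by_suffix is a hypothetical helper, and the demo tree is built in a temporary folder.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_by_suffix(root: Path) -> Counter:
    """Count regular files under root, grouped by file extension."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():                    # skip the folders themselves
            counts[path.suffix or "(none)"] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "raw" / "2026").mkdir(parents=True)

    # Hypothetical demo files at different depths.
    for rel in ("README", "raw/a.csv", "raw/2026/b.csv", "raw/2026/c.json"):
        (root / rel).write_text("", encoding="utf-8")

    counts = count_by_suffix(root)

print(dict(counts))
```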

Summary of Section 10

You can now:

  • Read text files line by line or in one shot
  • Write text files safely with with
  • Read and write CSV files using csv.DictReader and csv.DictWriter
  • Round-trip data through JSON
  • Build, inspect, and walk file paths with pathlib

Together these cover almost every “get data in, write data out” task you’ll meet in ML and data work.

What’s next

Section 11: Modules and Packages — splitting code across files and organising larger programs.
