A generator expression is great for one-liners. For anything more complicated, you write a generator function — a regular function that uses the yield keyword instead of return.

The basic idea

def count_up_to(limit: int):
    n: int = 1
    while n <= limit:
        yield n
        n += 1


for value in count_up_to(5):
    print(value)
1
2
3
4
5

The function looks normal, but the moment it contains a yield, Python treats it as a generator function. Calling it doesn’t run the body — it returns a generator object:

gen = count_up_to(5)
print(gen)               # <generator object count_up_to at 0x...>

The body only runs when you ask for a value (via for, next(), etc.).
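You can see this laziness directly with a tiny sketch (the print call inside the body is just a trace I've added to show when it runs):

```python
from collections.abc import Iterator


def noisy() -> Iterator[int]:
    print("body started")   # trace: shows when the body actually executes
    yield 1


gen = noisy()               # nothing printed yet — the body hasn't started
print("generator created")  # this line prints first
print(next(gen))            # now "body started" prints, then 1
```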

yield vs return

  • return ends the function and produces one value.
  • yield pauses the function and produces one value. The next time you ask for a value, the function picks up where it left off.

def two_values():
    yield 1
    yield 2


gen = two_values()
print(next(gen))   # 1
print(next(gen))   # 2
next(gen)          # raises StopIteration: the generator is exhausted

The first next() runs the function up to the first yield. The second next() resumes it past the first yield until it hits the next one. When the function ends, StopIteration is raised.
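If you'd rather not handle the exception, next() also accepts a default as a second argument, returned instead of raising. A small sketch:

```python
def two_values():
    yield 1
    yield 2


gen = two_values()
print(next(gen))         # 1
print(next(gen))         # 2
print(next(gen, None))   # None — returned instead of raising StopIteration
```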

Type hints for generators

Use Iterator[T] (or Generator[T, None, None] for the full form):

from collections.abc import Iterator


def count_up_to(limit: int) -> Iterator[int]:
    n: int = 1
    while n <= limit:
        yield n
        n += 1

Iterator[int] says: “this function produces integers when iterated”.
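The full Generator[YieldType, SendType, ReturnType] form matters when a generator also returns a value; the return value travels on the StopIteration exception. A minimal sketch (the function name is my own):

```python
from collections.abc import Generator


def count_and_report(limit: int) -> Generator[int, None, str]:
    n = 1
    while n <= limit:
        yield n
        n += 1
    return f"produced {limit} values"   # carried as StopIteration.value


gen = count_and_report(2)
print(next(gen))   # 1
print(next(gen))   # 2
try:
    next(gen)
except StopIteration as exc:
    print(exc.value)   # produced 2 values
```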

Why bother — vs returning a list?

Compare:

def squares_list(limit: int) -> list[int]:
    result: list[int] = []
    for n in range(1, limit + 1):
        result.append(n * n)
    return result


def squares_gen(limit: int) -> Iterator[int]:
    for n in range(1, limit + 1):
        yield n * n

Both produce the same values when looped. The differences:

  • squares_list(1_000_000) allocates a list of a million numbers.
  • squares_gen(1_000_000) allocates no list; it produces one value at a time, as you ask.

For small data both are fine. For big data, the generator wins.
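One way to make the difference concrete is sys.getsizeof: the generator object stays tiny no matter how large limit is (exact byte counts vary by Python version):

```python
import sys
from collections.abc import Iterator


def squares_list(limit: int) -> list[int]:
    return [n * n for n in range(1, limit + 1)]


def squares_gen(limit: int) -> Iterator[int]:
    for n in range(1, limit + 1):
        yield n * n


print(sys.getsizeof(squares_list(100_000)))  # hundreds of kilobytes
print(sys.getsizeof(squares_gen(100_000)))   # a couple hundred bytes, regardless of limit
```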

yield inside complex logic

The real power of yield is that you can mix it with normal control flow — if, while, for, even try. A generator function is the natural way to expose a complicated produce-values process:

def read_clean_numbers(lines: Iterator[str]) -> Iterator[int]:
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.removeprefix("-").isdigit():  # allow at most one leading minus
            continue
        yield int(stripped)


sample = iter(["10", "", " -7 ", "abc", "0", "42"])
print(list(read_clean_numbers(sample)))
# [10, -7, 0, 42]

A generator expression couldn’t carry this much logic. A generator function can.
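Because try works inside a generator, finally is a natural place for cleanup: it runs whether the generator is exhausted or closed early. A sketch (the print stands in for real cleanup like closing a file):

```python
from collections.abc import Iterator


def numbers_with_cleanup(limit: int) -> Iterator[int]:
    try:
        for n in range(1, limit + 1):
            yield n
    finally:
        print("cleanup ran")   # executes on exhaustion or on close()


gen = numbers_with_cleanup(100)
print(next(gen))   # 1
gen.close()        # raises GeneratorExit inside the generator; finally runs
```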

yield from — delegating to another generator

When a generator wants to “include” all the values from another iterable, use yield from:

def first_half(items: list[int]) -> Iterator[int]:
    half = len(items) // 2
    yield from items[:half]


def second_half(items: list[int]) -> Iterator[int]:
    half = len(items) // 2
    yield from items[half:]


def all_in_order(items: list[int]) -> Iterator[int]:
    yield from first_half(items)
    yield from second_half(items)


print(list(all_in_order([1, 2, 3, 4, 5, 6])))   # [1, 2, 3, 4, 5, 6]

Without yield from, you’d have to write for x in ...: yield x. Same result, more typing.
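yield from also composes naturally with recursion. A classic sketch, flattening arbitrarily nested lists (the function name is my own):

```python
from collections.abc import Iterator


def flatten(nested: list) -> Iterator[int]:
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to the recursive call
        else:
            yield item


print(list(flatten([1, [2, [3, 4]], 5])))   # [1, 2, 3, 4, 5]
```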

A common ML pattern — batching

Generators are perfect for splitting large data into batches:

def batched(items: list[int], size: int) -> Iterator[list[int]]:
    for i in range(0, len(items), size):
        yield items[i:i + size]


data: list[int] = list(range(1, 11))
for batch in batched(data, 3):
    print(batch)
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[10]

Python 3.12 adds this to the standard library as itertools.batched (which yields tuples rather than lists), but writing your own takes three lines.
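One limitation of the version above: it relies on len() and slicing, so it only works on sequences. A sketch using itertools.islice handles any iterable, including other generators (batched_any is my own name):

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def batched_any(items: Iterable[int], size: int) -> Iterator[list[int]]:
    it = iter(items)
    while batch := list(islice(it, size)):   # empty list means the iterator is done
        yield batch


for batch in batched_any(range(1, 11), 3):
    print(batch)
# [1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
# [10]
```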

Summary of Section 8

You can now:

  • Tell an iterable from an iterator
  • Walk an iterator one step at a time with iter() and next()
  • Write generator expressions for short lazy pipelines
  • Use yield to write generator functions for more complex production

Iterators and generators are the foundation of how Python streams data. They make data pipelines clean, memory-efficient, and easy to compose.

What’s next

Section 9: Exception Handling — how to deal with the things that go wrong while a program runs.
