collections - Python Programming - Fundamentals

The built-in list, dict, set, and tuple cover most needs. For a few specific cases, the collections module offers cleaner alternatives.

Counter — count things

Counter is a dictionary subclass that counts how often each item appears:

from collections import Counter

words: list[str] = ["python", "ml", "python", "ai", "ml", "ai", "python"]
counts: Counter[str] = Counter(words)
print(counts)
# Counter({'python': 3, 'ml': 2, 'ai': 2})

It works on any iterable of hashable items, such as lists, strings, and generators:

text: str = "mississippi"
print(Counter(text))
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

Useful methods:

counts.most_common(2)        # [('python', 3), ('ml', 2)]
counts["go"]                  # 0   (missing keys return 0, not KeyError)
counts.update(["python"])     # add more counts

Counter replaces the manual “if key in dict: dict[key] += 1, else: dict[key] = 1” pattern with a single call.
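To make the comparison concrete, here is a minimal sketch that counts the same words both ways and checks that the results match. The variable names manual, counted, extra, and combined are illustrative only. The last lines show that Counter objects can also be added together.

```python
from collections import Counter

words = ["python", "ml", "python", "ai", "ml", "ai", "python"]

# Manual version: check for the key, then initialise or increment.
manual: dict[str, int] = {}
for word in words:
    if word in manual:
        manual[word] += 1
    else:
        manual[word] = 1

# Counter version: one call does the same work.
counted = Counter(words)
print(manual == dict(counted))   # True

# Counters support arithmetic: adding two Counters sums the counts per key.
extra = Counter(["go", "python"])
combined = counted + extra
print(combined["python"])        # 4
print(combined["go"])            # 1
```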

defaultdict — auto-create missing keys

A regular dict raises KeyError when you index a missing key. A defaultdict instead creates a default value, stores it under that key, and returns it:

from collections import defaultdict

# group words by their length
words: list[str] = ["go", "python", "ai", "kotlin", "ml"]
by_length: dict[int, list[str]] = defaultdict(list)

for word in words:
    by_length[len(word)].append(word)

print(dict(by_length))
# {2: ['go', 'ai', 'ml'], 6: ['python', 'kotlin']}

The argument to defaultdict is a factory — list makes empty lists, int makes zeros, set makes empty sets:

counts = defaultdict(int)
for word in words:
    counts[word] += 1

Without defaultdict you’d have to write if word not in counts: counts[word] = 0 before each increment.
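The set factory works the same way. The sketch below uses made-up (author, tag) pairs to collect unique tags per author, and also shows a detail worth knowing: indexing a missing key with [] inserts it, while .get() does not.

```python
from collections import defaultdict

# Hypothetical sample data: (author, tag) pairs, including one duplicate.
pairs = [("ana", "python"), ("ben", "ml"), ("ana", "ai"), ("ana", "python")]

tags_by_author: defaultdict[str, set[str]] = defaultdict(set)
for author, tag in pairs:
    tags_by_author[author].add(tag)   # the set removes the duplicate

print(sorted(tags_by_author["ana"]))  # ['ai', 'python']

# Indexing a missing key creates it as a side effect.
print("cara" in tags_by_author)       # False
tags_by_author["cara"]
print("cara" in tags_by_author)       # True

# .get() behaves like a regular dict and does not create the key.
print(tags_by_author.get("dan"))      # None
print("dan" in tags_by_author)        # False
```

If stray keys would be a problem, read with .get() or an "in" check and reserve [] for writes.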

namedtuple — tuples with field names

A regular tuple has only positions. namedtuple adds names:

from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p)        # Point(x=3, y=4)
print(p.x)      # 3
print(p.y)      # 4

It behaves like a tuple (immutable, indexable, unpackable) but with friendly attribute access.
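A short sketch of those tuple behaviours, reusing the Point from above. The _replace and _asdict helpers are part of the namedtuple API; the variable name moved is just for illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)

x, y = p                  # unpacks like a tuple
print(p[0], x, y)         # 3 3 4
print(p == (3, 4))        # True: it compares equal to a plain tuple

# Fields cannot be reassigned.
try:
    p.x = 10
except AttributeError:
    print("immutable")

# _replace returns a new instance with some fields changed.
moved = p._replace(x=10)
print(moved)              # Point(x=10, y=4)
print(p._asdict())        # {'x': 3, 'y': 4} on Python 3.8+
```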

Modern alternative: typing.NamedTuple with type hints:

from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


p = Point(3, 4)
print(p.x, p.y)

For new code, prefer typing.NamedTuple or a frozen @dataclass, since both carry type annotations. Reach for collections.namedtuple mainly when you are working in older code that already uses it.
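For comparison, here is a minimal sketch of the frozen dataclass option, using the same Point fields. One difference to keep in mind: a dataclass is not a tuple, so it cannot be unpacked or compared with a plain tuple.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(3, 4)
print(p)                   # Point(x=3, y=4)

# frozen=True blocks attribute assignment at runtime.
try:
    p.x = 10
except FrozenInstanceError:
    print("frozen")

# dataclasses.replace builds a modified copy.
moved = replace(p, x=10)
print(moved)               # Point(x=10, y=4)

print(p == (3, 4))         # False: not a tuple
```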

deque — fast appends at both ends

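A deque (pronounced “deck”) is like a list but with fast inserts and removals at both ends. On a list, insert(0, ...) and pop(0) have to shift every remaining element, so they take O(n) time; the deque equivalents, appendleft() and popleft(), run in constant time.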

from collections import deque

q: deque[int] = deque([1, 2, 3])
q.append(4)              # right end
q.appendleft(0)          # left end
print(q)                 # deque([0, 1, 2, 3, 4])

print(q.pop())           # 4   — right end
print(q.popleft())       # 0   — left end

Use a deque when:

You need a FIFO queue
You need a sliding window (with maxlen)
You’re doing breadth-first search
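As an illustration of the FIFO and breadth-first cases, here is a minimal BFS sketch. The graph and the function name bfs_order are made up for this example; the queue is a deque so that popleft() stays cheap.

```python
from collections import deque

# Hypothetical graph stored as adjacency lists.
graph: dict[str, list[str]] = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


def bfs_order(graph: dict[str, list[str]], start: str) -> list[str]:
    """Return the nodes reachable from start in breadth-first order."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()            # take from the front (FIFO)
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)   # add to the back
    return order


print(bfs_order(graph, "a"))  # ['a', 'b', 'c', 'd']
```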

A sliding window of the last 5 items:

window: deque[int] = deque(maxlen=5)
for n in range(10):
    window.append(n)
    print(list(window))

[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
...

Once the deque holds maxlen items, each append drops the oldest item from the other end. This is very useful for moving averages and recent-event tracking.
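A moving average can be sketched in a few lines on top of maxlen. The function name moving_averages and the sample values are assumptions for illustration; early results average over however many items have arrived so far.

```python
from collections import deque


def moving_averages(values: list[float], size: int) -> list[float]:
    """Average of the last `size` values after each new value arrives."""
    window: deque[float] = deque(maxlen=size)
    result: list[float] = []
    for value in values:
        window.append(value)              # oldest value falls out when full
        result.append(sum(window) / len(window))
    return result


print(moving_averages([10, 20, 30, 40, 50], 3))
# [10.0, 15.0, 20.0, 30.0, 40.0]
```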

OrderedDict — ordered dict (mostly obsolete now)

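Before Python 3.7, the language did not guarantee that regular dicts preserve insertion order (CPython 3.6 did so only as an implementation detail), so OrderedDict filled that gap. Since 3.7 the guarantee is part of the language, and OrderedDict is rarely needed.

OrderedDict still offers a few things a plain dict does not: move_to_end(), popitem(last=False) to remove the oldest entry, and equality comparisons that take order into account. The first is the one you are most likely to use: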

from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2)])
d.move_to_end("a")        # move "a" to the end
print(d)                  # OrderedDict({'b': 2, 'a': 1}) on Python 3.12+; OrderedDict([('b', 2), ('a', 1)]) on earlier versions

If you don’t need this, use a regular dict.
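A typical use of move_to_end is a small least-recently-used cache. The sketch below is a hypothetical minimal version (the LRUCache class and its methods are invented for illustration); it also relies on popitem(last=False), an OrderedDict method that removes the oldest entry. For caching function results, functools.lru_cache is usually the simpler choice.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Hypothetical minimal LRU cache, for illustration only."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: OrderedDict[str, int] = OrderedDict()

    def get(self, key: str) -> Optional[int]:
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key: str, value: int) -> None:
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")             # "a" becomes the most recently used
cache.put("c", 3)          # over capacity, so "b" is evicted
print(list(cache.items))   # ['a', 'c']
```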

ChainMap — combine multiple dicts

ChainMap lets you look through several dicts in order, returning the first match:

from collections import ChainMap

defaults: dict[str, str] = {"theme": "light", "lang": "en"}
user_prefs: dict[str, str] = {"theme": "dark"}

settings = ChainMap(user_prefs, defaults)
print(settings["theme"])    # 'dark'  (from user_prefs)
print(settings["lang"])     # 'en'    (from defaults)

Niche but useful for layered configuration (CLI flags → env vars → file → defaults).
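Here is a rough sketch of that layering with three levels. The dict names and values are made up for the example. It also shows two behaviours that are easy to miss: writes go to the first dict only, and the ChainMap keeps references to the underlying dicts, so later changes to them are visible.

```python
from collections import ChainMap

# Hypothetical configuration layers, highest priority first.
cli_flags: dict[str, str] = {"theme": "dark"}
env_vars: dict[str, str] = {"lang": "de"}
defaults: dict[str, str] = {"theme": "light", "lang": "en"}

settings = ChainMap(cli_flags, env_vars, defaults)
print(settings["theme"])   # 'dark'  (from cli_flags)
print(settings["lang"])    # 'de'    (from env_vars)

# Writes only ever touch the first mapping.
settings["lang"] = "fr"
print(cli_flags)           # {'theme': 'dark', 'lang': 'fr'}
print(env_vars)            # {'lang': 'de'}

# The underlying dicts are referenced, not copied.
defaults["font"] = "mono"
print(settings["font"])    # 'mono'
```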

What’s next

The collections module rounds out Python’s data structure offering. Next — a deeper look at pathlib, beyond what Section 10 covered.