A set is an unordered collection of unique items. Two key properties:
- No duplicates. Adding the same item twice has no effect.
- No order. Sets don’t keep items in any particular sequence.
Use a set when you care about membership (“is this in the collection?”) or want to drop duplicates.
Creating a set
tags: set[str] = {"python", "ml", "ai"}
numbers: set[int] = {1, 2, 3, 4, 5}
# from another collection — duplicates are dropped
unique: set[int] = set([1, 2, 2, 3, 3, 3])
print(unique) # {1, 2, 3}
A set:
- Uses curly braces { }.
- Items are separated by commas.
- Cannot have duplicates.
- Items must be hashable (numbers, strings, and tuples of hashable items; not lists, dicts, or other sets).
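To see the hashability rule in action, here is a small sketch. The names points and rejected are made up for illustration.

```python
# Tuples of hashable items are fine as set members.
points: set[tuple[int, int]] = {(0, 0), (1, 2)}
points.add((1, 2))  # duplicate, so nothing changes

# A list is mutable and therefore unhashable, so adding one fails.
try:
    points.add([3, 4])
    rejected = False
except TypeError:
    rejected = True

print(len(points))  # 2
print(rejected)     # True
```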
The empty set is set(), not {}. The {} syntax creates an empty dictionary (next lesson).
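A quick way to check this yourself is to ask Python for the type of each. The variable names below are only for illustration.

```python
empty_set = set()
empty_dict = {}

print(type(empty_set).__name__)   # set
print(type(empty_dict).__name__)  # dict
print(len(empty_set))             # 0
```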
Adding and removing
tags: set[str] = {"python", "ml"}
tags.add("ai")
print(tags) # {'python', 'ml', 'ai'} (order may vary)
tags.add("python") # already there — no change
print(tags) # {'python', 'ml', 'ai'}
tags.remove("ml") # raises KeyError if not found
tags.discard("nope") # safe — does nothing if not found
popped = tags.pop() # removes an arbitrary item
print(popped, tags)
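The difference between remove and discard only shows up when the item is missing. The sketch below catches the error so you can see both behaviours side by side; the flag names removed and pop_failed are made up for this example.

```python
tags: set[str] = {"python", "ml"}

# remove() raises KeyError for a missing item, so catch the error or check first.
try:
    tags.remove("rust")
    removed = True
except KeyError:
    removed = False

# discard() never raises, whether or not the item is present.
tags.discard("rust")

# pop() on an empty set also raises KeyError.
empty: set[str] = set()
try:
    empty.pop()
    pop_failed = False
except KeyError:
    pop_failed = True

print(removed, pop_failed)          # False True
print(tags == {"python", "ml"})     # True
```

Reach for discard when a missing item is not a problem, and remove when a missing item means something went wrong and you want to hear about it.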
Sets are unordered
You can’t index a set:
tags[0] # TypeError: 'set' object is not subscriptable
You also can’t slice. If you need order, use a list instead.
Membership — what sets are made for
The whole point of a set is fast “is this in here?” checks:
tags: set[str] = {"python", "ml", "ai"}
print("python" in tags) # True
print("rust" in tags) # False
For a list of N items, in has to scan up to all N. For a set, in takes roughly the same time on average no matter how big the set is, because Python hashes the item and looks only where that item would be stored. The difference matters at scale: checking membership against a set of one million items is roughly as fast as checking against a set of ten.
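If you want to measure the gap yourself, here is a rough sketch using the standard timeit module. The size n and the repeat count are arbitrary choices, and the exact timings depend on your machine, so no expected numbers are shown.

```python
import timeit

n = 100_000
as_list: list[int] = list(range(n))
as_set: set[int] = set(as_list)
missing = -1  # worst case for the list: every item gets compared

# Run each membership check 100 times and record the total time in seconds.
list_seconds = timeit.timeit(lambda: missing in as_list, number=100)
set_seconds = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_seconds:.5f}s  set: {set_seconds:.5f}s")
```

On a typical machine the list figure should be far larger than the set figure, and the gap should widen as you increase n.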
Removing duplicates
The classic use case:
words: list[str] = ["python", "ml", "python", "ai", "ml", "ai"]
unique: list[str] = list(set(words))
print(unique) # ['python', 'ml', 'ai'] (order may vary)
Two notes:
- The order is not preserved by the set.
- If you need to preserve order, use list(dict.fromkeys(words)), a Python idiom that uses dictionaries (which do keep insertion order).
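Here is the order-preserving version spelled out, next to an equivalent loop that uses a set to remember what it has already seen. The names ordered_unique, seen and manual are illustrative.

```python
words: list[str] = ["python", "ml", "python", "ai", "ml", "ai"]

# dict keys are unique and keep insertion order (guaranteed since Python 3.7).
ordered_unique: list[str] = list(dict.fromkeys(words))
print(ordered_unique)  # ['python', 'ml', 'ai']

# An equivalent loop: keep a word only the first time it appears.
seen: set[str] = set()
manual: list[str] = []
for word in words:
    if word not in seen:
        seen.add(word)
        manual.append(word)
print(manual)  # ['python', 'ml', 'ai']
```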
Set operations
Sets support the standard operations from set theory:
a: set[int] = {1, 2, 3, 4}
b: set[int] = {3, 4, 5, 6}
print(a | b) # union {1, 2, 3, 4, 5, 6}
print(a & b) # intersection {3, 4}
print(a - b) # difference {1, 2}
print(a ^ b) # symmetric difference {1, 2, 5, 6}
The same operations as methods:
print(a.union(b)) # same as a | b
print(a.intersection(b)) # same as a & b
print(a.difference(b)) # same as a - b
print(a.symmetric_difference(b)) # same as a ^ b
The operator form is shorter and reads almost like maths.
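There is one practical difference between the two forms: the methods accept any iterable, while the operators need a set on both sides. The sketch below shows that, along with the in-place operators that update a set instead of building a new one. The flag operator_failed is just for illustration.

```python
a: set[int] = {1, 2, 3, 4}

# Methods accept any iterable, such as a list.
print(a.union([4, 5]) == {1, 2, 3, 4, 5})  # True

# Operators need a set on both sides.
try:
    a | [4, 5]
    operator_failed = False
except TypeError:
    operator_failed = True
print(operator_failed)  # True

# In-place forms change the set itself.
a |= {5}     # add everything from {5}
a -= {1, 2}  # drop 1 and 2
print(sorted(a))  # [3, 4, 5]
```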
Subset and superset
small: set[int] = {1, 2}
big: set[int] = {1, 2, 3, 4}
print(small <= big) # True — small is a subset of big
print(big >= small) # True — big is a superset of small
print(small < big) # True — strict subset (small != big)
print(small.isdisjoint({99, 100})) # True — no items in common
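The comparison operators also have method equivalents, issubset and issuperset. A short sketch, reusing the same two sets:

```python
small: set[int] = {1, 2}
big: set[int] = {1, 2, 3, 4}

print(small.issubset(big))        # True, same as small <= big
print(big.issuperset(small))      # True, same as big >= small

# Every set is a subset of itself, but never a strict subset of itself.
print(big <= big)                 # True
print(big < big)                  # False

# Like the other set methods, issubset accepts any iterable.
print(small.issubset([1, 2, 3]))  # True
```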
Looping over a set
for tag in {"python", "ml", "ai"}:
    print(tag)
The order isn’t guaranteed. If you need consistent order, sort first:
for tag in sorted({"python", "ml", "ai"}):
    print(tag)
Frozen sets
If you need an immutable set (one that can’t be changed), use frozenset:
permissions: frozenset[str] = frozenset({"read", "write"})
permissions.add("delete") # AttributeError
You’ll rarely need this in beginner code. It’s useful when you want to use a set as a key in a dictionary or as an item in another set (regular sets can’t be either because they’re mutable, and so not hashable).
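As a sketch of the dictionary-key case, suppose you want to store a score for a pair of players where the order of the pair doesn't matter. The data and the names scores and set_key_rejected are hypothetical.

```python
# Hypothetical data: scores for unordered pairs of players.
scores: dict[frozenset[str], int] = {
    frozenset({"ana", "ben"}): 3,
    frozenset({"ben", "cho"}): 1,
}

# The lookup works regardless of the order the names are given in.
print(scores[frozenset({"ben", "ana"})])  # 3

# A regular set is unhashable, so it cannot be a key.
try:
    bad = {{"ana", "ben"}: 3}
    set_key_rejected = False
except TypeError:
    set_key_rejected = True
print(set_key_rejected)  # True
```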
A practical example — common tags
Find which tags appear in two articles:
article_a: set[str] = {"python", "ai", "tutorial", "beginner"}
article_b: set[str] = {"python", "ml", "advanced", "tutorial"}
common: set[str] = article_a & article_b
print(common) # {'python', 'tutorial'}
unique_to_a: set[str] = article_a - article_b
print(unique_to_a) # {'ai', 'beginner'}
This is exactly the kind of work sets are made for.
What’s next
Sets handle “is it in here” and “what’s in both?”. Next, the most important data structure in Python (and in data science): dictionaries.