If a class is mostly “hold some fields together”, writing __init__, __repr__, and __eq__ by hand gets tedious. Data classes generate all that for you from type-annotated attributes.
Data classes are one of the best modern-Python features. Once you start using them, you’ll wonder how you wrote Python without them.
A regular class — lots of boilerplate
class User:
    def __init__(self, name: str, age: int, email: str) -> None:
        self.name = name
        self.age = age
        self.email = email
    def __repr__(self) -> str:
        return f"User(name={self.name!r}, age={self.age}, email={self.email!r})"
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, User):
            return NotImplemented
        return (
            self.name == other.name
            and self.age == other.age
            and self.email == other.email
        )
The same thing as a data class:
from dataclasses import dataclass
@dataclass
class User:
    name: str
    age: int
    email: str
That’s it. The @dataclass decorator generates __init__, __repr__, and __eq__ automatically.
Usage is identical:
u1 = User("Manikandan", 30, "[email protected]")
u2 = User("Manikandan", 30, "[email protected]")
print(u1) # User(name='Manikandan', age=30, email='[email protected]')
print(u1 == u2) # True
Default values
Defaults work the same as regular function arguments:
@dataclass
class Settings:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False
s = Settings()
print(s) # Settings(host='localhost', port=8080, debug=False)
s = Settings(host="api.example.com")
print(s) # Settings(host='api.example.com', port=8080, debug=False)
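One rule carries over from function signatures: a field without a default cannot follow a field that has one. The sketch below shows what happens if it does; the class name Broken is hypothetical and exists only to trigger the error.

```python
from dataclasses import dataclass

try:
    @dataclass
    class Broken:  # hypothetical class, used only to show the error
        host: str = "localhost"
        port: int  # no default, but it follows a field with a default
except TypeError as exc:
    # The decorator rejects the class while it is being defined.
    error = str(exc)

print(error)
```

The fix is to reorder the fields so that every field without a default comes first.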
Mutable defaults — use field(default_factory=...)
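The mutable-default trap applies here too. @dataclass refuses a bare list, dict, or set as a default and raises ValueError when the class is defined, so use field(default_factory=...) instead: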
from dataclasses import dataclass, field
@dataclass
class Team:
    name: str
    members: list[str] = field(default_factory=list)
field(default_factory=list) means “every new instance gets a fresh empty list”. Use this for any mutable default — list, dict, set.
a = Team("Alpha")
b = Team("Beta")
a.members.append("Alice")
print(b.members) # [] — not affected
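As a rough illustration of both sides of this rule, the sketch below first tries a bare list default, then uses a lambda as the factory to get a non-empty default. BadTeam, Roster, and the "admin" role are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# A bare mutable default is refused when the class is created.
try:
    @dataclass
    class BadTeam:  # hypothetical class, used only to show the error
        name: str
        members: list[str] = []
except ValueError as exc:
    rejected = True
    print(exc)
else:
    rejected = False

# For a non-empty default, pass any zero-argument callable as the factory.
@dataclass
class Roster:  # hypothetical class
    name: str
    roles: list[str] = field(default_factory=lambda: ["admin"])

a = Roster("Alpha")
b = Roster("Beta")
a.roles.append("guest")
print(a.roles)  # ['admin', 'guest']
print(b.roles)  # ['admin']
```

The factory runs once per instance, which is why the two rosters do not share a list.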
Frozen — make instances immutable
Pass frozen=True to make a data class behave like a tuple — you can’t change its fields after construction:
@dataclass(frozen=True)
class Point:
    x: float
    y: float
p = Point(1.0, 2.0)
p.x = 5.0 # FrozenInstanceError
With the default eq=True, a frozen data class also gets a generated __hash__, so instances can be used as set members or dict keys, provided every field value is itself hashable.
For values that represent “a thing” (a coordinate, an event, a config snapshot), frozen=True is the right default.
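A short sketch of how frozen instances behave in practice: assignment fails, dataclasses.replace builds a modified copy instead, and equal instances can serve as dict keys. The labels dictionary is a made-up example.

```python
from dataclasses import FrozenInstanceError, dataclass, replace

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)

try:
    p.x = 5.0
except FrozenInstanceError:
    print("cannot assign to a field of a frozen instance")

# replace() builds a new instance with some fields changed.
moved = replace(p, x=5.0)
print(moved)  # Point(x=5.0, y=2.0)
print(p)      # Point(x=1.0, y=2.0)

# Equal frozen instances hash the same, so they work as dict keys.
labels = {Point(0.0, 0.0): "origin"}  # hypothetical lookup table
print(labels[Point(0.0, 0.0)])  # origin
```

Because "changing" a value means creating a new one, code holding the old instance never sees it mutate.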
Comparing data classes
@dataclass generates __eq__ automatically. For ordering (<, >), pass order=True:
@dataclass(order=True)
class Person:
    age: int
    name: str
people = [Person(30, "Alice"), Person(25, "Bob"), Person(30, "Carol")]
print(sorted(people))
# [Person(age=25, name='Bob'), Person(age=30, name='Alice'), Person(age=30, name='Carol')]
Comparison goes field by field, in declaration order. Put the most important sort key first.
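When declaration order is not the order you want for one particular sort, a key function is usually simpler than rearranging the class. The sketch below also shows field(compare=False), which leaves a field out of == and < entirely; the note field is a hypothetical addition for illustration.

```python
from dataclasses import dataclass, field
from operator import attrgetter

@dataclass(order=True)
class Person:
    age: int
    name: str
    # Hypothetical extra field, excluded from == and < comparisons.
    note: str = field(default="", compare=False)

people = [Person(30, "Carol"), Person(25, "Bob"), Person(30, "Alice")]

# The generated ordering compares (age, name) as tuples.
by_default = [p.name for p in sorted(people)]
print(by_default)  # ['Bob', 'Alice', 'Carol']

# A key function sorts by another field without changing the class.
by_name = [p.name for p in sorted(people, key=attrgetter("name"))]
print(by_name)  # ['Alice', 'Bob', 'Carol']

# note is ignored, so these two compare equal.
same = Person(30, "Alice", note="a") == Person(30, "Alice", note="b")
print(same)  # True
```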
Methods on data classes
You can still add methods. Type-annotated attributes become constructor parameters; methods are defined exactly as in a regular class:
@dataclass
class Rectangle:
    width: float
    height: float
    def area(self) -> float:
        return self.width * self.height
    def is_square(self) -> bool:
        return self.width == self.height
r = Rectangle(3.0, 3.0)
print(r.area()) # 9.0
print(r.is_square()) # True
A real-world example — config
from dataclasses import dataclass
@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int = 5432
    user: str = "postgres"
    timeout_seconds: float = 30.0
@dataclass(frozen=True)
class AppConfig:
    debug: bool
    database: DatabaseConfig
    log_level: str = "INFO"
config = AppConfig(
    debug=True,
    database=DatabaseConfig(host="localhost"),
)
print(config)
This is a clean way to structure configuration: frozen, typed, and free of boilerplate. Note that the annotations are checked by editors and static type checkers, not enforced at runtime. Compared with a nested dictionary, you gain autocomplete, type checking, and a readable repr.
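Settings often arrive as a nested dictionary (from a file or environment loader, say). One possible way to move between the two shapes is sketched below, using ** unpacking to build the nested data class and dataclasses.asdict to go back. The raw dictionary and its values are hypothetical.

```python
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int = 5432
    user: str = "postgres"
    timeout_seconds: float = 30.0

@dataclass(frozen=True)
class AppConfig:
    debug: bool
    database: DatabaseConfig
    log_level: str = "INFO"

# Hypothetical settings, as they might come out of a parsed config file.
raw = {"debug": False, "database": {"host": "db.internal", "port": 6432}}

config = AppConfig(
    debug=raw["debug"],
    database=DatabaseConfig(**raw["database"]),  # keys must match field names
)
print(config.database.port)  # 6432
print(config.database.user)  # postgres (default filled in)

# asdict() recurses into nested data classes, e.g. for logging or JSON.
as_dict = asdict(config)
print(as_dict)
```

An unexpected key in the inner dictionary raises TypeError at construction, which catches misspelled setting names early.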
Pydantic — when data classes aren’t enough
For data that comes from outside (JSON from an API, user input), you usually want validation — making sure types and values are actually correct, not just declared. That’s what Pydantic does. It looks like a data class but validates every field at construction time.
Pydantic is a third-party library — install with uv add pydantic. We won’t cover it in this course, but if you’re heading into web development or any work involving external data, it’s the natural next step from data classes.
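To see the gap a validation library fills, here is a small standard-library sketch showing that a plain data class accepts a value of the wrong type without complaint. The Signup class is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Signup:  # hypothetical class for illustration
    name: str
    age: int

# Annotations are not enforced at runtime: this constructs without error.
s = Signup(name="Alice", age="thirty")
print(type(s.age).__name__)  # str
```

A static type checker would flag that call, but only for code it can see; data arriving at runtime from outside the program needs a runtime check.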
Summary of Section 12
You can now:
- Define classes with __init__ and methods
- Distinguish instance attributes from class attributes
- Use @property for computed attributes and validation
- Inherit from a parent class with super()
- Use @dataclass to skip boilerplate
You have the OOP basics. A lot of real-world Python code uses data classes for data-holding objects and reaches for hand-written classes only when behaviour gets complex.
What’s next
Section 13: Type System and Protocols — running a real type checker over your code and writing structural types.