If a class is mostly “hold some fields together”, writing __init__, __repr__, and __eq__ by hand gets tedious. Data classes generate all that for you from type-annotated attributes.

Data classes are one of the most useful features of modern Python. Once you start using them, you’ll wonder how you wrote Python without them.

A regular class — lots of boilerplate

class User:
    def __init__(self, name: str, age: int, email: str) -> None:
        self.name = name
        self.age = age
        self.email = email

    def __repr__(self) -> str:
        return f"User(name={self.name!r}, age={self.age}, email={self.email!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, User):
            return NotImplemented
        return (
            self.name == other.name
            and self.age == other.age
            and self.email == other.email
        )

The same thing as a data class:

from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int
    email: str

That’s it. The @dataclass decorator generates __init__, __repr__, and __eq__ automatically.

Usage is identical:

u1 = User("Manikandan", 30, "[email protected]")
u2 = User("Manikandan", 30, "[email protected]")

print(u1)              # User(name='Manikandan', age=30, email='[email protected]')
print(u1 == u2)         # True

Default values

Defaults work the same way as default values for function arguments, including the rule that fields without a default must come before fields with one:

@dataclass
class Settings:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False


s = Settings()
print(s)   # Settings(host='localhost', port=8080, debug=False)

s = Settings(host="api.example.com")
print(s)   # Settings(host='api.example.com', port=8080, debug=False)
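One way to check that field defaults really become ordinary argument defaults is to inspect the generated __init__. The sketch below repeats Settings so it runs on its own. The Broken class is a hypothetical example added only to trigger the ordering error.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Settings:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False


# The generated __init__ has one parameter per field, with the same defaults.
sig = inspect.signature(Settings)
defaults = {name: param.default for name, param in sig.parameters.items()}
print(defaults)   # {'host': 'localhost', 'port': 8080, 'debug': False}

# Fields can be passed positionally or by keyword, like any function call.
s = Settings("api.example.com", debug=True)
print(s)   # Settings(host='api.example.com', port=8080, debug=True)

# A field without a default may not follow a field that has one.
ordering_error = None
try:
    @dataclass
    class Broken:  # hypothetical class, for illustration only
        host: str = "localhost"
        port: int
except TypeError as exc:
    ordering_error = exc
    print(exc)
```

The error is raised when the class is defined, not when an instance is created, so a misordered field shows up as soon as the module is imported.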

Mutable defaults — use field(default_factory=...)

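The mutable-default trap applies here too. A data class refuses a plain list, dict, or set as a default (members: list[str] = [] raises ValueError when the class is defined), so use field(default_factory=...) instead: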

from dataclasses import dataclass, field


@dataclass
class Team:
    name: str
    members: list[str] = field(default_factory=list)

field(default_factory=list) means “every new instance gets a fresh empty list”. Use this for any mutable default — list, dict, set.

a = Team("Alpha")
b = Team("Beta")
a.members.append("Alice")
print(b.members)   # [] — not affected
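To see both sides of this, the sketch below first tries a plain list default, which @dataclass rejects, and then uses default_factory with a few different callables. BadTeam and Inventory are hypothetical classes made up for the illustration.

```python
from dataclasses import dataclass, field

# A literal list as a default is rejected when the class is defined.
error_message = None
try:
    @dataclass
    class BadTeam:  # hypothetical class, for illustration only
        name: str
        members: list[str] = []
except ValueError as exc:
    error_message = str(exc)
    print(error_message)


@dataclass
class Inventory:  # hypothetical class, for illustration only
    # default_factory accepts any zero-argument callable.
    counts: dict[str, int] = field(default_factory=dict)
    tags: set[str] = field(default_factory=set)
    # Use a lambda when the default should not start out empty.
    sizes: list[int] = field(default_factory=lambda: [1, 2, 3])


first = Inventory()
second = Inventory()
first.sizes.append(4)
print(first.sizes)    # [1, 2, 3, 4]
print(second.sizes)   # [1, 2, 3]
```

The factory is called once per new instance, which is why first and second never share a list.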

Frozen — make instances immutable

Pass frozen=True to make a data class behave like a tuple — you can’t change its fields after construction:

@dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)
p.x = 5.0   # raises dataclasses.FrozenInstanceError

Frozen data classes are also hashable (as long as every field value is itself hashable), so they can be used as set members or dict keys.

For value objects, where the data itself is the identity (a coordinate, an event, a config snapshot), frozen=True is a sensible default.
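A short sketch of what freezing buys you in practice: assignment fails, equal points hash the same, and dataclasses.replace gives you a modified copy when you need a "changed" value. Point is repeated here so the block runs on its own.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)

# Assigning to a field raises instead of silently mutating the instance.
try:
    p.x = 5.0
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Equal frozen instances hash the same, so they work as dict keys and set members.
labels = {Point(0.0, 0.0): "origin"}
print(labels[Point(0.0, 0.0)])       # origin
print(len({p, Point(1.0, 2.0)}))     # 1

# replace() builds a new instance with some fields changed; the original is untouched.
moved = replace(p, x=5.0)
print(moved)   # Point(x=5.0, y=2.0)
print(p)       # Point(x=1.0, y=2.0)
```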

Comparing data classes

@dataclass generates __eq__ automatically. For ordering (<, >), pass order=True:

@dataclass(order=True)
class Person:
    age: int
    name: str


people = [Person(30, "Alice"), Person(25, "Bob"), Person(30, "Carol")]
print(sorted(people))
# [Person(age=25, name='Bob'), Person(age=30, name='Alice'), Person(age=30, name='Carol')]

Comparison goes field by field, in declaration order. Put the most important sort key first.
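The field-by-field rule is easiest to read as "compare the instances as if they were tuples of their fields". The sketch below shows that, plus one option for keeping a field out of comparisons with field(compare=False). Task is a hypothetical class used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Person:
    age: int
    name: str


# Compared like the tuples (25, "Zed") and (30, "Alice"): age decides first.
print(Person(25, "Zed") < Person(30, "Alice"))     # True
# Ages tie, so the next field (name) breaks the tie.
print(Person(30, "Alice") < Person(30, "Carol"))   # True


@dataclass(order=True)
class Task:  # hypothetical class, for illustration only
    priority: int
    title: str = field(compare=False)  # ignored by ==, <, > and friends


print(Task(1, "write docs") == Task(1, "fix bug"))   # True
tasks = sorted([Task(2, "b"), Task(1, "a")])
print(tasks)   # [Task(priority=1, title='a'), Task(priority=2, title='b')]
```

If you only need a one-off ordering, sorted(people, key=lambda p: p.name) works on any data class and does not require order=True.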

Methods on data classes

You can still add methods. The type-annotated fields become the constructor parameters; methods are defined exactly as in any other class:

@dataclass
class Rectangle:
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height

    def is_square(self) -> bool:
        return self.width == self.height


r = Rectangle(3.0, 3.0)
print(r.area())        # 9.0
print(r.is_square())   # True
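The other method-like tools from regular classes work on data classes as well. As a sketch, here is a variant of Rectangle that exposes area as a @property and adds a classmethod constructor. The square constructor is an assumption added for illustration, not part of the example above.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    width: float
    height: float

    @property
    def area(self) -> float:
        # Computed on access; not a field, so it stays out of __init__ and __repr__.
        return self.width * self.height

    @classmethod
    def square(cls, side: float) -> "Rectangle":
        # Hypothetical alternative constructor, for illustration only.
        return cls(side, side)


r = Rectangle.square(3.0)
print(r)        # Rectangle(width=3.0, height=3.0)
print(r.area)   # 9.0
```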

A real-world example — config

from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int = 5432
    user: str = "postgres"
    timeout_seconds: float = 30.0


@dataclass(frozen=True)
class AppConfig:
    debug: bool
    database: DatabaseConfig
    log_level: str = "INFO"


config = AppConfig(
    debug=True,
    database=DatabaseConfig(host="localhost"),
)

print(config)

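This is a clean way to structure configuration: frozen, checkable by a static type checker, and free of boilerplate. Note that the annotations are not enforced at runtime, so the type safety comes from your editor or type checker rather than from the data class itself. With a nested dictionary you would lose autocomplete, type checking, and the readable repr.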
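As a sketch of how this config behaves in practice, the block below repeats the two classes, reads a nested value by attribute, converts the whole thing to a plain dict with dataclasses.asdict, and shows that a wrongly typed value is accepted at runtime. The "not-a-port" value is deliberately wrong and only there for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int = 5432
    user: str = "postgres"
    timeout_seconds: float = 30.0


@dataclass(frozen=True)
class AppConfig:
    debug: bool
    database: DatabaseConfig
    log_level: str = "INFO"


config = AppConfig(debug=True, database=DatabaseConfig(host="localhost"))

# Nested values are plain attribute access, which editors can autocomplete.
print(config.database.port)   # 5432

# asdict() converts nested data classes recursively, e.g. for logging or JSON.
as_dict = asdict(config)
print(as_dict["database"]["host"])   # localhost

# Annotations are not checked at runtime: this runs, but a type checker
# would be expected to flag it.
odd = DatabaseConfig(host="localhost", port="not-a-port")
print(type(odd.port).__name__)   # str
```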

Pydantic — when data classes aren’t enough

For data that comes from outside (JSON from an API, user input), you usually want validation — making sure types and values are actually correct, not just declared. That’s what Pydantic does. It looks like a data class but validates every field at construction time.

Pydantic is a third-party library — install with uv add pydantic. We won’t cover it in this course, but if you’re heading into web development or any work involving external data, it’s the natural next step from data classes.
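To make "validation at construction time" concrete without installing anything, here is a standard-library-only sketch that checks fields by hand in __post_init__. This is not Pydantic's API. It is the kind of hand-written checking that a validation library takes off your hands. Signup, its fields, and the specific rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signup:  # hypothetical class, for illustration only
    email: str
    age: int

    def __post_init__(self) -> None:
        # The generated __init__ calls this right after assigning the fields.
        if not isinstance(self.age, int) or isinstance(self.age, bool):
            raise TypeError(f"age must be an int, got {type(self.age).__name__}")
        if self.age < 0:
            raise ValueError("age must not be negative")
        if "@" not in self.email:
            raise ValueError("email must contain '@'")


ok = Signup(email="a@example.com", age=30)
print(ok)   # Signup(email='a@example.com', age=30)

# Data as it might arrive from JSON: the age is a string, so construction fails.
payload = {"email": "b@example.com", "age": "30"}
try:
    Signup(**payload)
except TypeError as exc:
    print(exc)   # age must be an int, got str
```

Writing these checks for every field gets repetitive quickly, which is the gap a validation library is meant to fill.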

Summary of Section 12

You can now:

  • Define classes with __init__ and methods
  • Distinguish instance attributes from class attributes
  • Use @property for computed attributes and validation
  • Inherit from a parent class with super()
  • Use @dataclass to skip boilerplate

You have the OOP basics. A lot of real-world Python code uses data classes for objects that mainly hold data, and reaches for full hand-written classes only when the behaviour gets complex.

What’s next

Section 13: Type System and Protocols — running a real type checker over your code and writing structural types.
