Skip to content

Data Modeling Guidelines

Logical Model

  • Align the model with domain language (DDD): aggregates, entities, value objects.
  • An aggregate is a consistency boundary — enforce invariants within it, never across aggregates.
  • Value objects are immutable and identified by their value, not an ID.
  • Entities have identity that persists across state changes.
  • Model behavior, not just data: place domain logic inside aggregates, not in services.
  • Aggregates are pure domain objects — no I/O, no persistence, no external calls. Repositories (infra layer) handle load/save.
  • Avoid anemic models (plain data bags with no behavior).
  • Name types, fields, and relations using the ubiquitous language of the domain — no technical abbreviations.

Physical Model

Relational (SQL)

  • Every table has a surrogate primary key (id); use UUIDs for distributed systems, auto-increment otherwise.
  • Name tables as singular nouns (order, customer), columns as snake_case.
  • Add a created_at / updated_at timestamp to every mutable table.
  • Normalize to 3NF by default; denormalize only when read performance requires it and you can justify it.
  • Declare foreign keys and let the DB enforce referential integrity.
  • Index every foreign key and every column used in WHERE, ORDER BY, or JOIN predicates.
  • Avoid NULL where possible — NULL means "unknown", not "empty" or "zero".
  • Never store multiple values in a single column (no comma-separated lists).
  • Migrations are forward-only; never modify existing migration files.

Document Stores

  • Embed data that is always read together and only written by one aggregate.
  • Reference (by ID) data that is shared across aggregates or grows unbounded.
  • Design documents around read patterns, not write patterns.
  • Keep documents small — avoid unbounded arrays inside a document.
  • Include a schema_version field for forward-compatible migrations.

Key/Value Stores

  • Design keys to be self-describing and hierarchical: <namespace>:<entity>:<id> (e.g., session:user:42).
  • Set a TTL for all ephemeral data (sessions, caches, rate-limit counters).
  • Never store relational data in a key/value store; use it for caching, sessions, and simple counters.
  • Document the key schema in the codebase alongside the access code.