Repository Pattern: Implementation Guide

How do I implement the repository pattern?

TL;DR

Bottom line: Abstract data access behind a clean interface so domain logic stays independent of the database, enabling testability and swappable persistence.
Key tool/command: interface UserRepository { findById(id): User; save(user): void } + concrete implementation per data store.
Watch out for: Generic repositories that expose ORM internals (IQueryable, QuerySet) -- this defeats the entire purpose of the abstraction.
Works with: Any language/framework. Most valuable with DDD, hexagonal architecture, and projects requiring extensive unit testing.

Constraints

One repository per aggregate root -- never create a repository for every database table. [src1]
Repository interfaces belong in the domain layer; implementations belong in the infrastructure layer. [src3]
Never return ORM-specific types (IQueryable, QuerySet, Session) from repository methods -- always return materialized domain objects. [src4]
Repository methods must be persistence-agnostic in their signatures -- no SQL fragments, no ORM filter objects in parameters.
Keep repositories focused on collection-like operations (add, remove, find) -- no business logic, no email sending, no cross-aggregate queries. [src2]
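
The first and last constraints can be sketched as follows. This is a minimal illustration, not a prescribed design: the Order aggregate, its OrderLine child, and the in-memory store are hypothetical names chosen for the example and do not come from the sources cited above.

```typescript
// Hypothetical aggregate: Order is the root, OrderLine is a child entity.
class OrderLine {
  constructor(
    readonly sku: string,
    readonly quantity: number,
    readonly unitPrice: number
  ) {}
}

class Order {
  private lines: OrderLine[] = [];

  constructor(readonly id: string, readonly customerId: string) {}

  // Invariants are enforced by the aggregate, not by the repository.
  addLine(sku: string, quantity: number, unitPrice: number): void {
    if (quantity <= 0) throw new Error('Quantity must be positive');
    this.lines.push(new OrderLine(sku, quantity, unitPrice));
  }

  get total(): number {
    return this.lines.reduce((sum, l) => sum + l.quantity * l.unitPrice, 0);
  }

  get lineCount(): number {
    return this.lines.length;
  }
}

// One repository for the aggregate root. There is deliberately no
// OrderLineRepository: lines are only reachable through their Order.
interface OrderRepository {
  findById(id: string): Promise<Order | null>;
  save(order: Order): Promise<void>;
}

// In-memory stand-in for a real data store, used only to keep the sketch runnable.
class InMemoryOrderRepository implements OrderRepository {
  private orders = new Map<string, Order>();

  async findById(id: string): Promise<Order | null> {
    return this.orders.get(id) ?? null;
  }

  async save(order: Order): Promise<void> {
    this.orders.set(order.id, order); // root and lines are stored as one unit
  }
}
```

The interface exposes only collection-like operations on the root; a SQL-backed implementation would write the order row and its line rows together inside save().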

Quick Reference

Variant	Data Access	Unit of Work	Testability	Complexity	Best For
Basic Repository	Interface + one impl per aggregate	Manual or none	High (mock interface)	Low	Small-medium projects
Generic Repository	Base class with CRUD generics	Manual or none	Medium (often leaks ORM)	Low-Medium	Boilerplate reduction
Repository + Unit of Work	Repository delegates save to UoW	Explicit UoW class	High	Medium	Transaction coordination
Specification Pattern	Repository accepts Spec objects	Compatible with UoW	High	Medium-High	Complex query composition
CQRS Split	Repository for writes, thin reads bypass	Write-side UoW	High (write side)	High	Read-heavy, complex domains
Query Objects	Separate class per query	None needed	High	Medium	Many distinct read paths
Repository + Mediator	Repository behind command/query handlers	Handler-scoped UoW	High	High	Event-driven architectures
Active Record (for contrast)	Entity IS the repository	Built into entity	Low (tightly coupled)	Low	Simple CRUD; avoid in DDD contexts
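
To make the Specification Pattern row concrete, here is a minimal sketch. All names (Specification, ActiveUser, EmailDomain, AllOf, the `active` field) are assumptions for illustration. The in-memory repository simply filters an array; a database-backed implementation would instead have to translate each specification into a query, which is where most of the pattern's complexity lies.

```typescript
// Hypothetical domain type for the sketch.
interface User {
  id: string;
  email: string;
  active: boolean;
}

// A specification is a named, composable predicate over a domain object.
interface Specification<T> {
  isSatisfiedBy(candidate: T): boolean;
}

class ActiveUser implements Specification<User> {
  isSatisfiedBy(u: User): boolean {
    return u.active;
  }
}

class EmailDomain implements Specification<User> {
  constructor(private readonly domain: string) {}

  isSatisfiedBy(u: User): boolean {
    return u.email.endsWith('@' + this.domain);
  }
}

// Combinator: satisfied only when every inner specification is satisfied.
class AllOf<T> implements Specification<T> {
  private readonly specs: Specification<T>[];

  constructor(...specs: Specification<T>[]) {
    this.specs = specs;
  }

  isSatisfiedBy(candidate: T): boolean {
    return this.specs.every(s => s.isSatisfiedBy(candidate));
  }
}

// The repository signature mentions only domain concepts, no ORM filter types.
interface UserRepository {
  find(spec: Specification<User>): Promise<User[]>;
}

class InMemoryUserRepository implements UserRepository {
  constructor(private readonly users: User[]) {}

  async find(spec: Specification<User>): Promise<User[]> {
    return this.users.filter(u => spec.isSatisfiedBy(u));
  }
}
```

A caller would compose criteria such as `new AllOf<User>(new ActiveUser(), new EmailDomain('example.com'))` and pass the result to find().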

Decision Tree

START
├── Is your domain logic complex (>10 business rules, aggregates)?
│   ├── YES → Use specific repositories per aggregate root
│   │   ├── Need complex query composition?
│   │   │   ├── YES → Add Specification pattern
│   │   │   └── NO → Basic repository is sufficient
│   │   ├── Need separate read/write optimization?
│   │   │   ├── YES → Consider CQRS (repository for writes, direct queries for reads)
│   │   │   └── NO → Standard repository
│   │   └── Multiple data sources in one transaction?
│   │       ├── YES → Add Unit of Work pattern
│   │       └── NO → Repository handles its own persistence
│   └── NO ↓
├── Is testability with mock data stores a primary concern?
│   ├── YES → Use repository interfaces even for simple CRUD
│   └── NO ↓
├── Is this a simple CRUD app with <5 entities?
│   ├── YES → Skip repository pattern, use ORM directly
│   └── NO ↓
└── DEFAULT → Start with basic specific repositories, add complexity only when needed
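
The same tree can be read as a small function, which makes the branch order explicit. This is only a restatement of the tree above; the ProjectTraits field names and the recommend() function are hypothetical.

```typescript
// Answers to the questions in the decision tree (field names are assumptions).
interface ProjectTraits {
  complexDomain: boolean;           // >10 business rules, aggregates
  complexQueries: boolean;          // need complex query composition
  separateReadWrite: boolean;       // need separate read/write optimization
  multiSourceTransactions: boolean; // multiple data sources in one transaction
  testabilityIsPrimary: boolean;    // mock data stores are a primary concern
  simpleCrud: boolean;              // simple CRUD app with <5 entities
}

function recommend(p: ProjectTraits): string[] {
  if (p.complexDomain) {
    // All three sub-questions are evaluated; their answers are additive.
    const out = ['Specific repositories per aggregate root'];
    if (p.complexQueries) out.push('Specification pattern');
    if (p.separateReadWrite) out.push('CQRS');
    if (p.multiSourceTransactions) out.push('Unit of Work');
    return out;
  }
  if (p.testabilityIsPrimary) {
    return ['Repository interfaces, even for simple CRUD'];
  }
  if (p.simpleCrud) {
    return ['Skip the repository pattern; use the ORM directly'];
  }
  return ['Basic specific repositories; add complexity only when needed'];
}
```

Note that testability is checked before the simple-CRUD question, so a small CRUD app that must be unit tested without a database still gets repository interfaces.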

Step-by-Step Guide

1. Define the repository interface in the domain layer

The interface describes what the domain needs from persistence, not how persistence works. Keep method signatures in terms of domain objects only. [src1]

// domain/repositories/UserRepository.ts
export interface UserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
  delete(id: string): Promise<void>;
}

Verify: The interface imports only domain types -- no ORM, no database driver, no SQL.

2. Create the concrete implementation in the infrastructure layer

The implementation translates domain operations into database calls. All ORM/SQL details live here. [src7]

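// infrastructure/repositories/PostgresUserRepository.ts
import { Pool } from 'pg';
import { User } from '../../domain/entities/User';
import { UserRepository } from '../../domain/repositories/UserRepository';

// Columns of the users table that this repository reads.
type UserRow = { id: string; email: string; name: string };

export class PostgresUserRepository implements UserRepository {
  constructor(private readonly pool: Pool) {}

  async findById(id: string): Promise<User | null> {
    const { rows } = await this.pool.query(
      'SELECT id, email, name FROM users WHERE id = $1', [id]
    );
    return rows[0] ? this.toDomain(rows[0]) : null;
  }

  async findByEmail(email: string): Promise<User | null> {
    const { rows } = await this.pool.query(
      'SELECT id, email, name FROM users WHERE email = $1', [email]
    );
    return rows[0] ? this.toDomain(rows[0]) : null;
  }

  async save(user: User): Promise<void> {
    // Upsert: insert a new row, or update the existing row with the same id.
    await this.pool.query(
      `INSERT INTO users (id, email, name) VALUES ($1, $2, $3)
       ON CONFLICT (id) DO UPDATE SET email = $2, name = $3`,
      [user.id, user.email, user.name]
    );
  }

  async delete(id: string): Promise<void> {
    await this.pool.query('DELETE FROM users WHERE id = $1', [id]);
  }

  // Map a database row to a domain entity.
  private toDomain(row: UserRow): User {
    return new User(row.id, row.email, row.name);
  }
}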

Verify: The implementation class imports the interface and implements every method. Domain layer has zero dependency on this file.

3. Wire up via dependency injection

Inject the repository interface into the application and domain services that need it. The composition root (main/startup) decides which implementation to use. [src1]

// application/services/UserService.ts
import { User } from '../../domain/entities/User';
import { UserRepository } from '../../domain/repositories/UserRepository';

export class UserService {
  constructor(private readonly userRepo: UserRepository) {} // interface, not impl

  async registerUser(email: string, name: string): Promise<User> {
    const existing = await this.userRepo.findByEmail(email);
    if (existing) throw new Error('Email already registered');
    const user = User.create(email, name);
    await this.userRepo.save(user);
    return user;
  }
}

Verify: UserService constructor accepts the interface type, not PostgresUserRepository.
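
The step above mentions a composition root without showing one. The following is a minimal sketch under stated assumptions: AppConfig, buildUserService(), and InMemoryUserRepository are hypothetical names, the domain types are reduced stand-ins so the block runs on its own, and the Postgres branch is left unwired because it needs a real connection pool.

```typescript
// Reduced stand-ins for the domain types used in the steps above.
class User {
  constructor(
    readonly id: string,
    readonly email: string,
    readonly name: string
  ) {}
}

interface UserRepository {
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
}

class UserService {
  constructor(private readonly userRepo: UserRepository) {}

  async registerUser(email: string, name: string): Promise<User> {
    if (await this.userRepo.findByEmail(email)) {
      throw new Error('Email already registered');
    }
    const user = new User(`user:${email}`, email, name); // placeholder id scheme
    await this.userRepo.save(user);
    return user;
  }
}

class InMemoryUserRepository implements UserRepository {
  private readonly byEmail = new Map<string, User>();

  async findByEmail(email: string): Promise<User | null> {
    return this.byEmail.get(email) ?? null;
  }

  async save(user: User): Promise<void> {
    this.byEmail.set(user.email, user);
  }
}

// Hypothetical application configuration.
interface AppConfig {
  storage: 'memory' | 'postgres';
}

// Composition root: the only place that names concrete repository classes.
function buildUserService(config: AppConfig): UserService {
  if (config.storage === 'memory') {
    return new UserService(new InMemoryUserRepository());
  }
  // A real application would construct the Postgres-backed repository here
  // and pass it to UserService in exactly the same way.
  throw new Error('Postgres wiring is omitted from this sketch');
}
```

Because UserService only sees the interface, switching the storage setting changes nothing in the service itself.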

4. Create a test double for unit testing

Because the service depends on an interface, swap in a fake for tests with zero database setup. [src6]

// tests/FakeUserRepository.ts
import { User } from '../domain/entities/User';
import { UserRepository } from '../domain/repositories/UserRepository';

// In-memory fake: behaves like a repository but keeps users in a Map.
export class FakeUserRepository implements UserRepository {
  private users: Map<string, User> = new Map();

  async findById(id: string): Promise<User | null> {
    return this.users.get(id) ?? null;
  }
  async findByEmail(email: string): Promise<User | null> {
    return [...this.users.values()].find(u => u.email === email) ?? null;
  }
  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
  async delete(id: string): Promise<void> {
    this.users.delete(id);
  }
}

Verify: Tests run without any database connection: const repo = new FakeUserRepository(); const svc = new UserService(repo);

Code Examples

TypeScript: Repository with Prisma ORM

// Interface -- domain layer (no ORM imports)
interface OrderRepository {
  findById(id: string): Promise<Order | null>;
  findByCustomer(customerId: string): Promise<Order[]>;
  save(order: Order): Promise<void>;
}

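// Implementation -- infrastructure layer
// OrderMapper (not shown) converts between Prisma rows and the Order aggregate.
import { PrismaClient } from '@prisma/client';

class PrismaOrderRepository implements OrderRepository {
  constructor(private prisma: PrismaClient) {}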

  async findById(id: string): Promise<Order | null> {
    const row = await this.prisma.order.findUnique({
      where: { id }, include: { items: true }
    });
    return row ? OrderMapper.toDomain(row) : null;
  }

  async findByCustomer(customerId: string): Promise<Order[]> {
    const rows = await this.prisma.order.findMany({
      where: { customerId }, include: { items: true }
    });
    return rows.map(row => OrderMapper.toDomain(row));
  }

  async save(order: Order): Promise<void> {
    await this.prisma.order.upsert({
      where: { id: order.id },
      create: OrderMapper.toPersistence(order),
      update: OrderMapper.toPersistence(order),
    });
  }
}

Python: Repository with SQLAlchemy

# domain/repositories.py -- abstract interface
from abc import ABC, abstractmethod
from domain.entities import User

class UserRepository(ABC):
    @abstractmethod
    def find_by_id(self, user_id: str) -> User | None: ...

    @abstractmethod
    def find_by_email(self, email: str) -> User | None: ...

    @abstractmethod
    def save(self, user: User) -> None: ...

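# infrastructure/sql_user_repository.py
from sqlalchemy.orm import Session

from domain.entities import User
from domain.repositories import UserRepository
from infrastructure.models import UserModel  # ORM-mapped class; module path is an assumption

class SqlUserRepository(UserRepository):
    def __init__(self, session: Session):
        self._session = session

    def find_by_id(self, user_id: str) -> User | None:
        row = self._session.get(UserModel, user_id)
        return self._to_domain(row) if row else None

    def find_by_email(self, email: str) -> User | None:
        row = self._session.query(UserModel).filter_by(
            email=email
        ).first()
        return self._to_domain(row) if row else None

    def save(self, user: User) -> None:
        model = self._to_model(user)
        self._session.merge(model)
        self._session.flush()  # emit SQL now; the caller decides when to commit

    # Mappers assume User and UserModel both carry id, email, and name.
    @staticmethod
    def _to_domain(row: UserModel) -> User:
        return User(id=row.id, email=row.email, name=row.name)

    @staticmethod
    def _to_model(user: User) -> UserModel:
        return UserModel(id=user.id, email=user.email, name=user.name)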

Java: Repository with Spring Data JPA

// domain/repository/OrderRepository.java -- domain interface
public interface OrderRepository {
    Optional<Order> findById(String id);
    List<Order> findByCustomerId(String customerId);
    void save(Order order);
    void delete(String id);
}

// infrastructure/persistence/JpaOrderRepository.java
@Repository
public class JpaOrderRepository implements OrderRepository {
    private final SpringDataOrderRepo springRepo;
    private final OrderMapper mapper;

    public JpaOrderRepository(SpringDataOrderRepo springRepo,
                               OrderMapper mapper) {
        this.springRepo = springRepo;
        this.mapper = mapper;
    }

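    @Override
    public Optional<Order> findById(String id) {
        return springRepo.findById(id).map(mapper::toDomain);
    }

    @Override
    public List<Order> findByCustomerId(String customerId) {
        // Assumes SpringDataOrderRepo declares a derived query findByCustomerId.
        return springRepo.findByCustomerId(customerId).stream()
                .map(mapper::toDomain)
                .collect(Collectors.toList());
    }

    @Override
    public void save(Order order) {
        springRepo.save(mapper.toEntity(order));
    }

    @Override
    public void delete(String id) {
        springRepo.deleteById(id);
    }
}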

Go: Repository with sqlc

// domain/repository.go -- interface
type UserRepository interface {
    FindByID(ctx context.Context, id string) (*User, error)
    FindByEmail(ctx context.Context, email string) (*User, error)
    Save(ctx context.Context, user *User) error
    Delete(ctx context.Context, id string) error
}

// infrastructure/postgres_user_repo.go
type PostgresUserRepo struct {
    queries *sqlcgen.Queries  // generated by sqlc
}

func NewPostgresUserRepo(db *sql.DB) *PostgresUserRepo {
    return &PostgresUserRepo{queries: sqlcgen.New(db)}
}

func (r *PostgresUserRepo) FindByID(
    ctx context.Context, id string,
) (*User, error) {
    row, err := r.queries.GetUserByID(ctx, id)
    if err != nil {
        if errors.Is(err, sql.ErrNoRows) {
            return nil, nil
        }
        return nil, fmt.Errorf("find user by id: %w", err)
    }
    return toDomainUser(row), nil
}

func (r *PostgresUserRepo) Save(
    ctx context.Context, user *User,
) error {
    return r.queries.UpsertUser(ctx, sqlcgen.UpsertUserParams{
        ID: user.ID, Email: user.Email, Name: user.Name,
    })
}

// FindByEmail and Delete follow the same shape and are omitted here.
// PostgresUserRepo satisfies UserRepository only once all four methods exist.

Anti-Patterns

Wrong: Generic repository exposing IQueryable/QuerySet

// BAD -- leaks ORM query builder through the interface
public interface IRepository<T> {
    IQueryable<T> GetAll();  // Callers build arbitrary queries
    T GetById(int id);
    void Add(T entity);
}
// Service code now contains ORM-specific LINQ:
var users = await repo.GetAll()
    .Where(u => u.IsActive)
    .Include(u => u.Orders)  // EF-specific!
    .ToListAsync();          // EF-specific as well

Correct: Domain-specific methods with materialized results

// GOOD -- interface exposes domain operations only
public interface IUserRepository {
    Task<User?> FindByIdAsync(int id);
    Task<IReadOnlyList<User>> FindActiveUsersAsync();
    Task SaveAsync(User user);
}
// Implementation handles all ORM details internally

Wrong: Repository wrapping another repository without adding behavior

// BAD -- wrapper is named for caching but only forwards calls
class CachedUserRepository implements UserRepository {
  constructor(
    private innerRepo: UserRepository,
    private otherRepo: UserRepository // never used
  ) {}
  async findById(id: string) {
    return this.innerRepo.findById(id); // Just passes through
  }
}

Correct: Decorator with caching adds real value

// GOOD -- caching decorator adds genuine cross-cutting concern
class CachedUserRepository implements UserRepository {
  constructor(private delegate: UserRepository, private cache: Cache) {}
  async findById(id: string): Promise<User | null> {
    const cached = await this.cache.get(`user:${id}`);
    if (cached) return cached;
    const user = await this.delegate.findById(id);
    if (user) await this.cache.set(`user:${id}`, user, 300); // 300 is presumably a TTL
    return user;
  }
  // save() and delete() should delegate, then evict `user:${id}` to avoid stale reads.
}

Wrong: Fat repository with business logic

# BAD -- repository validates, calculates, and sends emails
class OrderRepository:
    def place_order(self, order):
        if order.total < 0:
            raise ValueError("Invalid total")  # Business rule!
        order.tax = order.total * 0.2           # Calculation!
        self.session.add(order)
        self.session.commit()
        send_confirmation_email(order)          # Side effect!

Correct: Repository only persists, service handles logic

# GOOD -- repository does one thing: persist
class SqlOrderRepository:
    def save(self, order: Order) -> None:
        model = self._to_model(order)
        self._session.merge(model)
        self._session.flush()

# Business logic lives in the domain/application layer
class OrderService:
    def place_order(self, order: Order) -> None:
        order.validate()               # Domain logic
        order.calculate_tax()           # Domain logic
        self.order_repo.save(order)     # Persistence only
        self.email_service.send(order)  # Separate concern

Common Pitfalls

One repository per table instead of per aggregate: Creates tight coupling to the database schema and breaks aggregate boundaries. Fix: identify aggregate roots in your domain model and create one repository per root. [src1]
Returning ORM entities instead of domain objects: Services become coupled to the ORM. If you switch databases, every service changes. Fix: add a toDomain() / toModel() mapper in the repository implementation. [src6]
Generic repository with unused methods: IRepository<T> forces Delete() on entities that should never be deleted, violating Interface Segregation. Fix: define specific interfaces per aggregate with only the methods that aggregate needs. [src4]
Putting Unit of Work logic inside the repository: Repositories should not call commit() / SaveChanges(). Fix: let the application service or a separate Unit of Work coordinate transactions. [src3]
Testing against the database through the repository: The whole point of the pattern is to enable in-memory fakes. Fix: test domain logic with fake/stub repositories, and reserve integration tests for the real implementation. [src6]
Repository becoming a query service: Adding dozens of findByX methods for reporting queries. Fix: use CQRS -- repositories for writes, lightweight query objects or direct SQL for reads. [src2]
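
The Unit of Work pitfall can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: the Account type, the UnitOfWork interface, and the transfer() use case. The point it tries to show is that the repository only stages changes, while the calling use case decides when to commit or roll back.

```typescript
interface Account {
  id: string;
  balance: number;
}

// The repository stages changes; it never commits on its own.
interface AccountRepository {
  findById(id: string): Promise<Account | null>;
  save(account: Account): Promise<void>;
}

interface UnitOfWork {
  accounts: AccountRepository;
  commit(): Promise<void>;
  rollback(): Promise<void>;
}

class InMemoryUnitOfWork implements UnitOfWork {
  private staged = new Map<string, Account>();
  readonly accounts: AccountRepository;

  constructor(private readonly store: Map<string, Account>) {
    this.accounts = {
      // Reads see staged changes first, then committed data; copies are returned.
      findById: async (id) => {
        const found = this.staged.get(id) ?? this.store.get(id);
        return found ? { ...found } : null;
      },
      save: async (account) => {
        this.staged.set(account.id, { ...account });
      },
    };
  }

  async commit(): Promise<void> {
    this.staged.forEach((account, id) => this.store.set(id, account));
    this.staged.clear();
  }

  async rollback(): Promise<void> {
    this.staged.clear();
  }
}

// Application-level use case: it owns the transaction boundary.
async function transfer(
  uow: UnitOfWork, fromId: string, toId: string, amount: number
): Promise<void> {
  try {
    const from = await uow.accounts.findById(fromId);
    if (!from) throw new Error('Source account not found');
    if (from.balance < amount) throw new Error('Insufficient funds');
    from.balance -= amount;
    await uow.accounts.save(from); // staged only, not yet in the store

    const to = await uow.accounts.findById(toId);
    if (!to) throw new Error('Destination account not found');
    to.balance += amount;
    await uow.accounts.save(to);

    await uow.commit(); // both changes become visible together
  } catch (err) {
    await uow.rollback(); // discard the staged debit
    throw err;
  }
}
```

With an ORM, the session or DbContext usually plays the UnitOfWork role, so a hand-written class like this is mainly useful for tests or for stores that have no built-in transaction object.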

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Domain has complex business rules and aggregates	Simple CRUD with <5 entities and no domain logic	ORM directly (Active Record or Data Mapper)
You need to unit test domain logic without a database	Rapid prototyping or throwaway code	Direct database queries
Multiple data source backends are realistic (SQL, NoSQL, API)	Single database that will never change	ORM-provided data access (e.g., Spring Data repositories, Django ORM managers)
Team follows DDD or hexagonal architecture	Read-heavy analytics/reporting queries	Query objects or CQRS read side
Aggregate boundaries need strict enforcement	Microservice with a single entity and no joins	Thin data access functions
You need a caching decorator or audit logging layer	Framework already provides testable abstractions	Framework-provided test utilities

Important Caveats

The repository pattern adds an abstraction layer -- in simple apps this is unnecessary complexity. If your ORM already provides a clean, testable interface, adding repositories may be over-engineering. [src4]
Generic repositories (IRepository<T>) almost always become a leaky abstraction. Prefer specific repository interfaces per aggregate root. [src4]
In CQRS architectures, repositories are typically used only on the write/command side. The read/query side usually bypasses repositories and queries the store directly, which avoids loading full aggregates just to display data. [src1]
The pattern was catalogued in Martin Fowler's Patterns of Enterprise Application Architecture (2002) and popularized by Eric Evans' Domain-Driven Design (2003). Its value is highest when your domain model diverges significantly from your database schema.
When using ORMs like Entity Framework or SQLAlchemy, the ORM session/DbContext already implements a Unit of Work. Adding another UoW layer on top can create confusion about who owns the transaction.
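
The CQRS caveat above can be sketched as follows. This is an illustrative example only: OrderRow, the Order aggregate, and listOpenOrders() are hypothetical, and a Map stands in for a database table shared by both sides.

```typescript
// Stand-in for a database table; both sides share the same storage.
type OrderStatus = 'open' | 'shipped';
type OrderRow = { id: string; customerId: string; total: number; status: OrderStatus };

// --- Write side: aggregate + repository ---
class Order {
  constructor(
    readonly id: string,
    readonly customerId: string,
    readonly total: number,
    private _status: OrderStatus
  ) {}

  get status(): OrderStatus {
    return this._status;
  }

  // State changes go through the aggregate so its rules are enforced.
  ship(): void {
    if (this._status === 'shipped') throw new Error('Order already shipped');
    this._status = 'shipped';
  }
}

interface OrderRepository {
  findById(id: string): Promise<Order | null>;
  save(order: Order): Promise<void>;
}

class InMemoryOrderRepository implements OrderRepository {
  constructor(private readonly table: Map<string, OrderRow>) {}

  async findById(id: string): Promise<Order | null> {
    const row = this.table.get(id);
    return row ? new Order(row.id, row.customerId, row.total, row.status) : null;
  }

  async save(order: Order): Promise<void> {
    this.table.set(order.id, {
      id: order.id,
      customerId: order.customerId,
      total: order.total,
      status: order.status,
    });
  }
}

// --- Read side: flat DTOs straight from storage, no aggregates ---
type OpenOrderSummary = { id: string; total: number };

async function listOpenOrders(
  table: Map<string, OrderRow>, customerId: string
): Promise<OpenOrderSummary[]> {
  return Array.from(table.values())
    .filter(r => r.customerId === customerId && r.status === 'open')
    .map(r => ({ id: r.id, total: r.total }));
}
```

Commands such as shipping an order load and save the aggregate through the repository, while listOpenOrders() returns a display-shaped result without constructing any Order objects.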