Repository Pattern: Implementation Guide
How do I implement the repository pattern?
TL;DR
- Bottom line: Abstract data access behind a clean interface so domain logic stays independent of the database, enabling testability and swappable persistence.
- Key tool/command: interface UserRepository { findById(id): User; save(user): void } + concrete implementation per data store.
- Watch out for: Generic repositories that expose ORM internals (IQueryable, QuerySet) -- this defeats the entire purpose of the abstraction.
- Works with: Any language/framework. Most valuable with DDD, hexagonal architecture, and projects requiring extensive unit testing.
Constraints
- One repository per aggregate root -- never create a repository for every database table. [src1]
- Repository interfaces belong in the domain layer; implementations belong in the infrastructure layer. [src3]
- Never return ORM-specific types (IQueryable, QuerySet, Session) from repository methods -- always return materialized domain objects. [src4]
- Repository methods must be persistence-agnostic in their signatures -- no SQL fragments, no ORM filter objects in parameters.
- Keep repositories focused on collection-like operations (add, remove, find) -- no business logic, no email sending, no cross-aggregate queries. [src2]
Quick Reference
| Variant | Data Access | Unit of Work | Testability | Complexity | Best For |
|---|---|---|---|---|---|
| Basic Repository | Interface + one impl per aggregate | Manual or none | High (mock interface) | Low | Small-medium projects |
| Generic Repository | Base class with CRUD generics | Manual or none | Medium (often leaks ORM) | Low-Medium | Boilerplate reduction |
| Repository + Unit of Work | Repository delegates save to UoW | Explicit UoW class | High | Medium | Transaction coordination |
| Specification Pattern | Repository accepts Spec objects | Compatible with UoW | High | Medium-High | Complex query composition |
| CQRS Split | Repository for writes, thin reads bypass | Write-side UoW | High (write side) | High | Read-heavy, complex domains |
| Query Objects | Separate class per query | None needed | High | Medium | Many distinct read paths |
| Repository + Mediator | Repository behind command/query handlers | Handler-scoped UoW | High | High | Event-driven architectures |
| Active Record (anti-pattern) | Entity IS the repository | Built into entity | Low (tightly coupled) | Low | Avoid in DDD contexts |
Decision Tree
START
├── Is your domain logic complex (>10 business rules, aggregates)?
│   ├── YES → Use specific repositories per aggregate root
│   │   ├── Need complex query composition?
│   │   │   ├── YES → Add Specification pattern
│   │   │   └── NO → Basic repository is sufficient
│   │   ├── Need separate read/write optimization?
│   │   │   ├── YES → Consider CQRS (repository for writes, direct queries for reads)
│   │   │   └── NO → Standard repository
│   │   └── Multiple data sources in one transaction?
│   │       ├── YES → Add Unit of Work pattern
│   │       └── NO → Repository handles its own persistence
│   └── NO ↓
├── Is testability with mock data stores a primary concern?
│   ├── YES → Use repository interfaces even for simple CRUD
│   └── NO ↓
├── Is this a simple CRUD app with <5 entities?
│   ├── YES → Skip repository pattern, use ORM directly
│   └── NO ↓
└── DEFAULT → Start with basic specific repositories, add complexity only when needed
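The "Add Specification pattern" branch can be sketched as follows. This is a minimal in-memory illustration rather than a prescribed design: `Specification`, `and`, `not`, the `Order` shape, and `InMemoryOrderRepository.findMatching` are all hypothetical names introduced here. A SQL-backed repository would more likely translate each specification into a query than filter in memory.

```typescript
// Hypothetical order shape, used only for this sketch.
interface Order {
  id: string;
  customerId: string;
  total: number;
  shipped: boolean;
}

// A specification is a named, composable predicate over a domain object.
interface Specification<T> {
  isSatisfiedBy(candidate: T): boolean;
}

// Combinators build larger specifications from smaller ones.
function and<T>(...specs: Specification<T>[]): Specification<T> {
  return { isSatisfiedBy: (c) => specs.every((s) => s.isSatisfiedBy(c)) };
}

function not<T>(spec: Specification<T>): Specification<T> {
  return { isSatisfiedBy: (c) => !spec.isSatisfiedBy(c) };
}

const forCustomer = (customerId: string): Specification<Order> => ({
  isSatisfiedBy: (o) => o.customerId === customerId,
});

const totalAtLeast = (min: number): Specification<Order> => ({
  isSatisfiedBy: (o) => o.total >= min,
});

const isShipped: Specification<Order> = { isSatisfiedBy: (o) => o.shipped };

// The repository accepts specifications instead of ORM filter objects,
// so its signature stays persistence-agnostic.
class InMemoryOrderRepository {
  constructor(private readonly orders: Order[]) {}

  async findMatching(spec: Specification<Order>): Promise<Order[]> {
    return this.orders.filter((o) => spec.isSatisfiedBy(o));
  }
}
```

A caller composes the query from domain terms, for example `repo.findMatching(and(forCustomer('c1'), totalAtLeast(100), not(isShipped)))`, without knowing how the repository evaluates it.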
Step-by-Step Guide
1. Define the repository interface in the domain layer
The interface describes what the domain needs from persistence, not how persistence works. Keep method signatures in terms of domain objects only. [src1]
// domain/repositories/UserRepository.ts
import { User } from '../entities/User';

export interface UserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
  delete(id: string): Promise<void>;
}
Verify: The interface imports only domain types -- no ORM, no database driver, no SQL.
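The snippets in this guide use a `User` entity with `id`, `email`, and `name`, plus a `User.create` factory, but never show it. A minimal sketch of what such an entity could look like follows. The email check, the normalization, and the `newId` helper are assumptions made for illustration, not part of the original guide.

```typescript
// Placeholder id generator for this sketch; a real project would more
// likely use a UUID library or a database-generated key.
function newId(): string {
  const random = Math.random().toString(36).slice(2, 10);
  return `${Date.now().toString(36)}-${random}`;
}

class User {
  constructor(
    public readonly id: string,
    public readonly email: string,
    public readonly name: string,
  ) {}

  // Factory for brand-new users: normalizes input and assigns an identity.
  // Rehydrating an existing user from storage goes through the constructor.
  static create(email: string, name: string): User {
    const normalized = email.trim().toLowerCase();
    if (!normalized.includes('@')) {
      throw new Error(`Invalid email: ${email}`);
    }
    return new User(newId(), normalized, name.trim());
  }
}
```

Note that the entity has no knowledge of repositories or databases, which is what lets the interface above depend on it without pulling in infrastructure code.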
2. Create the concrete implementation in the infrastructure layer
The implementation translates domain operations into database calls. All ORM/SQL details live here. [src7]
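// infrastructure/repositories/PostgresUserRepository.ts
import { Pool } from 'pg';
import { User } from '../../domain/entities/User';
import { UserRepository } from '../../domain/repositories/UserRepository';

export class PostgresUserRepository implements UserRepository {
  constructor(private pool: Pool) {}
  async findById(id: string): Promise<User | null> {
    const { rows } = await this.pool.query(
      'SELECT * FROM users WHERE id = $1', [id]
    );
    return rows[0] ? this.toDomain(rows[0]) : null;
  }
  async findByEmail(email: string): Promise<User | null> {
    const { rows } = await this.pool.query(
      'SELECT * FROM users WHERE email = $1', [email]
    );
    return rows[0] ? this.toDomain(rows[0]) : null;
  }
  // Upsert: insert a new row, or update the existing one with the same id
  async save(user: User): Promise<void> {
    await this.pool.query(
      `INSERT INTO users (id, email, name) VALUES ($1, $2, $3)
       ON CONFLICT (id) DO UPDATE SET email = $2, name = $3`,
      [user.id, user.email, user.name]
    );
  }
  async delete(id: string): Promise<void> {
    await this.pool.query('DELETE FROM users WHERE id = $1', [id]);
  }
  // Maps a raw database row to a domain entity
  private toDomain(row: any): User {
    return new User(row.id, row.email, row.name);
  }
}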
Verify: The implementation class imports the interface and implements every method. Domain layer has zero dependency on this file.
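The `toDomain(row: any)` mapper above trusts whatever the driver returns. One way to tighten this is to validate the row shape before constructing the entity, sketched here under the assumption that the `users` table has text columns `id`, `email`, and `name`. `UserRow`, `isUserRow`, and `toDomainUser` are hypothetical helpers, and `User` is re-declared so the block stands alone.

```typescript
class User {
  constructor(
    public readonly id: string,
    public readonly email: string,
    public readonly name: string,
  ) {}
}

// Assumed shape of one row from the users table.
interface UserRow {
  id: string;
  email: string;
  name: string;
}

// Type guard: checks at runtime that an unknown value looks like a UserRow.
function isUserRow(row: unknown): row is UserRow {
  if (typeof row !== 'object' || row === null) return false;
  const r = row as Record<string, unknown>;
  return (
    typeof r['id'] === 'string' &&
    typeof r['email'] === 'string' &&
    typeof r['name'] === 'string'
  );
}

// Maps a raw row to a domain object, failing loudly if the schema drifted.
function toDomainUser(row: unknown): User {
  if (!isUserRow(row)) {
    throw new Error('Unexpected users row shape');
  }
  return new User(row.id, row.email, row.name);
}
```

The trade-off is a little more code per aggregate in exchange for errors that surface at the persistence boundary instead of deep inside domain logic.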
3. Wire up via dependency injection
Inject the repository interface into domain services. The composition root (main/startup) decides which implementation to use. [src1]
// application/services/UserService.ts
import { User } from '../../domain/entities/User';
import { UserRepository } from '../../domain/repositories/UserRepository';

export class UserService {
  constructor(private userRepo: UserRepository) {} // interface, not impl
  async registerUser(email: string, name: string): Promise<User> {
    const existing = await this.userRepo.findByEmail(email);
    if (existing) throw new Error('Email already registered');
    const user = User.create(email, name);
    await this.userRepo.save(user);
    return user;
  }
}
Verify: UserService constructor accepts the interface type, not PostgresUserRepository.
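A composition root for this wiring might look like the sketch below. It is self-contained, so it re-declares trimmed-down versions of `User`, `UserRepository`, and `UserService`. `InMemoryUserRepository`, `AppConfig`, and `buildUserService` are hypothetical names, and the `postgres` branch is deliberately left unimplemented because constructing a real connection pool is outside the scope of the example.

```typescript
class User {
  constructor(
    public readonly id: string,
    public readonly email: string,
    public readonly name: string,
  ) {}
}

interface UserRepository {
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
}

class UserService {
  constructor(private readonly userRepo: UserRepository) {}

  async registerUser(email: string, name: string): Promise<User> {
    if (await this.userRepo.findByEmail(email)) {
      throw new Error('Email already registered');
    }
    const user = new User(`user-${email}`, email, name); // simplified id
    await this.userRepo.save(user);
    return user;
  }
}

class InMemoryUserRepository implements UserRepository {
  private readonly users = new Map<string, User>();

  async findByEmail(email: string): Promise<User | null> {
    const all = Array.from(this.users.values());
    return all.find((u) => u.email === email) ?? null;
  }

  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
}

// Hypothetical configuration: which persistence backend to wire in.
type AppConfig = { persistence: 'memory' | 'postgres' };

// The only place in the application that names a concrete repository class.
function buildUserService(config: AppConfig): UserService {
  if (config.persistence === 'memory') {
    return new UserService(new InMemoryUserRepository());
  }
  // A real composition root would construct PostgresUserRepository here.
  throw new Error('postgres wiring is omitted from this sketch');
}
```

Because the choice is made in one function, switching backends is a configuration change and `UserService` itself never needs to be edited.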
4. Create a test double for unit testing
Because the service depends on an interface, swap in a fake for tests with zero database setup. [src6]
// tests/FakeUserRepository.ts
// Import paths assume tests/ sits next to domain/ at the project root.
import { User } from '../domain/entities/User';
import { UserRepository } from '../domain/repositories/UserRepository';

export class FakeUserRepository implements UserRepository {
  private users: Map<string, User> = new Map();
  async findById(id: string): Promise<User | null> {
    return this.users.get(id) ?? null;
  }
  async findByEmail(email: string): Promise<User | null> {
    return [...this.users.values()].find(u => u.email === email) ?? null;
  }
  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
  async delete(id: string): Promise<void> {
    this.users.delete(id);
  }
}
Verify: Tests run without any database connection: const repo = new FakeUserRepository(); const svc = new UserService(repo);
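A fake is only useful if it behaves like the real implementation. One common way to keep the two aligned is a shared contract test that runs the same checks against any `UserRepository`. The sketch below assumes no particular test runner: `check` and `verifyUserRepositoryContract` are hypothetical helpers, and `User`, the interface, and the fake are re-declared so the block runs on its own.

```typescript
class User {
  constructor(
    public readonly id: string,
    public readonly email: string,
    public readonly name: string,
  ) {}
}

interface UserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  save(user: User): Promise<void>;
  delete(id: string): Promise<void>;
}

class FakeUserRepository implements UserRepository {
  private readonly users = new Map<string, User>();
  async findById(id: string): Promise<User | null> {
    return this.users.get(id) ?? null;
  }
  async findByEmail(email: string): Promise<User | null> {
    const all = Array.from(this.users.values());
    return all.find((u) => u.email === email) ?? null;
  }
  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
  async delete(id: string): Promise<void> {
    this.users.delete(id);
  }
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Contract violated: ${message}`);
}

// Runs the same behavioural checks against any implementation.
async function verifyUserRepositoryContract(
  makeRepo: () => UserRepository,
): Promise<void> {
  const repo = makeRepo();
  const ada = new User('u1', 'ada@example.com', 'Ada');

  check((await repo.findById('u1')) === null, 'unknown id yields null');

  await repo.save(ada);
  check((await repo.findById('u1'))?.email === ada.email, 'found by id');
  check((await repo.findByEmail(ada.email))?.id === 'u1', 'found by email');

  await repo.save(new User('u1', 'ada@example.com', 'Ada L.'));
  check((await repo.findById('u1'))?.name === 'Ada L.', 'save overwrites');

  await repo.delete('u1');
  check((await repo.findById('u1')) === null, 'deleted user is gone');
}
```

Calling `verifyUserRepositoryContract(() => new FakeUserRepository())` exercises the fake. Passing a factory that builds the real repository against a test database would run the identical checks as an integration test.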
Code Examples
TypeScript: Repository with Prisma ORM
// Interface -- domain layer (no ORM imports)
// Order (domain entity) and OrderMapper (row <-> entity mapper) are assumed to exist elsewhere.
interface OrderRepository {
  findById(id: string): Promise<Order | null>;
  findByCustomer(customerId: string): Promise<Order[]>;
  save(order: Order): Promise<void>;
}
// Implementation -- infrastructure layer
import { PrismaClient } from '@prisma/client';

class PrismaOrderRepository implements OrderRepository {
  constructor(private prisma: PrismaClient) {}
  async findById(id: string): Promise<Order | null> {
    const row = await this.prisma.order.findUnique({
      where: { id }, include: { items: true }
    });
    return row ? OrderMapper.toDomain(row) : null;
  }
  async findByCustomer(customerId: string): Promise<Order[]> {
    const rows = await this.prisma.order.findMany({
      where: { customerId }, include: { items: true }
    });
    // Explicit arrow avoids passing map's index/array arguments to the mapper
    return rows.map((row) => OrderMapper.toDomain(row));
  }
  async save(order: Order): Promise<void> {
    await this.prisma.order.upsert({
      where: { id: order.id },
      create: OrderMapper.toPersistence(order),
      update: OrderMapper.toPersistence(order),
    });
  }
}
Python: Repository with SQLAlchemy
# domain/repositories.py -- abstract interface
from abc import ABC, abstractmethod
from domain.entities import User

class UserRepository(ABC):
    @abstractmethod
    def find_by_id(self, user_id: str) -> User | None: ...
    @abstractmethod
    def find_by_email(self, email: str) -> User | None: ...
    @abstractmethod
    def save(self, user: User) -> None: ...

# infrastructure/sql_user_repository.py
from sqlalchemy.orm import Session
from domain.entities import User
from domain.repositories import UserRepository
from infrastructure.models import UserModel  # assumed location of the ORM model

class SqlUserRepository(UserRepository):
    def __init__(self, session: Session):
        self._session = session

    def find_by_id(self, user_id: str) -> User | None:
        row = self._session.get(UserModel, user_id)
        return self._to_domain(row) if row else None

    def find_by_email(self, email: str) -> User | None:
        row = self._session.query(UserModel).filter_by(
            email=email
        ).first()
        return self._to_domain(row) if row else None

    def save(self, user: User) -> None:
        model = self._to_model(user)
        self._session.merge(model)
        # flush, not commit: the transaction stays with the caller
        self._session.flush()

    # _to_domain / _to_model (UserModel <-> User mapping) are omitted here
Java: Repository with Spring Data JPA
// domain/repository/OrderRepository.java -- domain interface
public interface OrderRepository {
    Optional<Order> findById(String id);
    List<Order> findByCustomerId(String customerId);
    void save(Order order);
    void delete(String id);
}
// infrastructure/persistence/JpaOrderRepository.java
@Repository
public class JpaOrderRepository implements OrderRepository {
    private final SpringDataOrderRepo springRepo;
    private final OrderMapper mapper;
    public JpaOrderRepository(SpringDataOrderRepo springRepo,
                              OrderMapper mapper) {
        this.springRepo = springRepo;
        this.mapper = mapper;
    }
    @Override
    public Optional<Order> findById(String id) {
        return springRepo.findById(id).map(mapper::toDomain);
    }
    @Override
    public List<Order> findByCustomerId(String customerId) {
        // Assumes SpringDataOrderRepo declares a findByCustomerId derived query
        return springRepo.findByCustomerId(customerId).stream()
            .map(mapper::toDomain)
            .toList(); // Java 16+
    }
    @Override
    public void save(Order order) {
        springRepo.save(mapper.toEntity(order));
    }
    @Override
    public void delete(String id) {
        springRepo.deleteById(id);
    }
}
Go: Repository with sqlc
// domain/repository.go -- interface
type UserRepository interface {
    FindByID(ctx context.Context, id string) (*User, error)
    FindByEmail(ctx context.Context, email string) (*User, error)
    Save(ctx context.Context, user *User) error
    Delete(ctx context.Context, id string) error
}
// infrastructure/postgres_user_repo.go
type PostgresUserRepo struct {
    queries *sqlcgen.Queries // generated by sqlc
}
func NewPostgresUserRepo(db *sql.DB) *PostgresUserRepo {
    return &PostgresUserRepo{queries: sqlcgen.New(db)}
}
func (r *PostgresUserRepo) FindByID(
    ctx context.Context, id string,
) (*User, error) {
    row, err := r.queries.GetUserByID(ctx, id)
    if err != nil {
        if errors.Is(err, sql.ErrNoRows) {
            return nil, nil
        }
        return nil, fmt.Errorf("find user by id: %w", err)
    }
    return toDomainUser(row), nil
}
func (r *PostgresUserRepo) Save(
    ctx context.Context, user *User,
) error {
    return r.queries.UpsertUser(ctx, sqlcgen.UpsertUserParams{
        ID: user.ID, Email: user.Email, Name: user.Name,
    })
}
Anti-Patterns
Wrong: Generic repository exposing IQueryable/QuerySet
// BAD -- leaks ORM query builder through the interface
public interface IRepository<T> {
    IQueryable<T> GetAll();  // Callers build arbitrary queries
    T GetById(int id);
    void Add(T entity);
}
// Service code now contains ORM-specific LINQ:
var users = await repo.GetAll()
    .Where(u => u.IsActive)
    .Include(u => u.Orders)  // EF-specific!
    .ToListAsync();
Correct: Domain-specific methods with materialized results
// GOOD -- interface exposes domain operations only
public interface IUserRepository {
    Task<User?> FindByIdAsync(int id);
    Task<IReadOnlyList<User>> FindActiveUsersAsync();
    Task SaveAsync(User user);
}
// Implementation handles all ORM details internally
Wrong: Repository wrapping another repository
// BAD -- repository delegates to another repository with no added value
class CachedUserRepository implements UserRepository {
  constructor(
    private innerRepo: UserRepository,
    private otherRepo: UserRepository
  ) {}
  async findById(id: string) {
    return this.innerRepo.findById(id); // Just passes through
  }
}
Correct: Decorator with caching adds real value
// GOOD -- caching decorator adds genuine cross-cutting concern
class CachedUserRepository implements UserRepository {
  constructor(private delegate: UserRepository, private cache: Cache) {}
  async findById(id: string): Promise<User | null> {
    const cached = await this.cache.get(`user:${id}`);
    if (cached) return cached;
    const user = await this.delegate.findById(id);
    if (user) await this.cache.set(`user:${id}`, user, 300);
    return user;
  }
}
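The decorator above only shows the read path. If writes also go through the decorator, stale entries can be avoided by evicting on `save` and `delete`. The sketch below assumes a minimal `UserCache` interface with `get`, `set`, and `delete` and ignores TTLs for brevity. `UserCache`, `MapUserCache`, `InMemoryUserRepository`, and `CachingUserRepository` are hypothetical names, and the `reads` counter exists only to make the caching observable.

```typescript
class User {
  constructor(
    public readonly id: string,
    public readonly email: string,
    public readonly name: string,
  ) {}
}

interface UserRepository {
  findById(id: string): Promise<User | null>;
  save(user: User): Promise<void>;
  delete(id: string): Promise<void>;
}

// Hypothetical cache abstraction; a real one would likely accept a TTL.
interface UserCache {
  get(key: string): Promise<User | undefined>;
  set(key: string, value: User): Promise<void>;
  delete(key: string): Promise<void>;
}

class MapUserCache implements UserCache {
  private readonly entries = new Map<string, User>();
  async get(key: string): Promise<User | undefined> {
    return this.entries.get(key);
  }
  async set(key: string, value: User): Promise<void> {
    this.entries.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

class InMemoryUserRepository implements UserRepository {
  private readonly users = new Map<string, User>();
  public reads = 0; // counts lookups that reached the underlying store

  async findById(id: string): Promise<User | null> {
    this.reads++;
    return this.users.get(id) ?? null;
  }
  async save(user: User): Promise<void> {
    this.users.set(user.id, user);
  }
  async delete(id: string): Promise<void> {
    this.users.delete(id);
  }
}

class CachingUserRepository implements UserRepository {
  constructor(
    private readonly inner: UserRepository,
    private readonly cache: UserCache,
  ) {}

  async findById(id: string): Promise<User | null> {
    const cached = await this.cache.get(`user:${id}`);
    if (cached) return cached;
    const user = await this.inner.findById(id);
    if (user) await this.cache.set(`user:${id}`, user);
    return user;
  }

  async save(user: User): Promise<void> {
    await this.inner.save(user);
    await this.cache.delete(`user:${user.id}`); // evict so the next read is fresh
  }

  async delete(id: string): Promise<void> {
    await this.inner.delete(id);
    await this.cache.delete(`user:${id}`);
  }
}
```

Evicting after the write, instead of updating the cached value in place, keeps the decorator simple at the cost of one extra read from the underlying store after each write.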
Wrong: Fat repository with business logic
# BAD -- repository validates, calculates, and sends emails
class OrderRepository:
    def place_order(self, order):
        if order.total < 0:
            raise ValueError("Invalid total")  # Business rule!
        order.tax = order.total * 0.2  # Calculation!
        self.session.add(order)
        self.session.commit()
        send_confirmation_email(order)  # Side effect!
Correct: Repository only persists, service handles logic
# GOOD -- repository does one thing: persist
class SqlOrderRepository:
    def save(self, order: Order) -> None:
        model = self._to_model(order)
        self._session.merge(model)
        self._session.flush()

# Business logic lives in the domain/application layer
class OrderService:
    def __init__(self, order_repo, email_service):
        self.order_repo = order_repo
        self.email_service = email_service

    def place_order(self, order: Order) -> None:
        order.validate()                # Domain logic
        order.calculate_tax()           # Domain logic
        self.order_repo.save(order)     # Persistence only
        self.email_service.send(order)  # Separate concern
Common Pitfalls
- One repository per table instead of per aggregate: Creates tight coupling to the database schema and breaks aggregate boundaries. Fix: identify aggregate roots in your domain model and create one repository per root. [src1]
- Returning ORM entities instead of domain objects: Services become coupled to the ORM. If you switch databases, every service changes. Fix: add a toDomain()/toModel() mapper in the repository implementation. [src6]
- Generic repository with unused methods: IRepository<T> forces Delete() on entities that should never be deleted, violating Interface Segregation. Fix: define specific interfaces per aggregate with only the methods that aggregate needs. [src4]
- Putting Unit of Work logic inside the repository: Repositories should not call commit()/SaveChanges(). Fix: let the application service or a separate Unit of Work coordinate transactions. [src3]
- Testing against the database through the repository: The whole point is to enable in-memory fakes. Fix: always write domain logic tests with fake/stub repositories, reserve integration tests for the real implementation. [src6]
- Repository becoming a query service: Adding dozens of findByX methods for reporting queries. Fix: use CQRS -- repositories for writes, lightweight query objects or direct SQL for reads. [src2]
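One way to keep `commit()` out of repositories, as the Unit of Work pitfall recommends, is to let the application service drive an explicit Unit of Work. The sketch below is an in-memory illustration only: `Account`, `AccountRepository`, `UnitOfWork`, `InMemoryUnitOfWork`, and `TransferService` are hypothetical names. With a real ORM, `commit` and `rollback` would more likely delegate to the session or transaction object.

```typescript
interface Account {
  id: string;
  balance: number;
}

interface AccountRepository {
  findById(id: string): Promise<Account | null>;
  save(account: Account): Promise<void>; // stages the change, never commits
}

interface UnitOfWork {
  accounts: AccountRepository;
  commit(): Promise<void>;
  rollback(): Promise<void>;
}

class InMemoryUnitOfWork implements UnitOfWork {
  private staged = new Map<string, Account>();
  readonly accounts: AccountRepository;

  constructor(private readonly store: Map<string, Account>) {
    this.accounts = {
      // Reads see staged changes first, then the committed store.
      findById: async (id) => {
        const found = this.staged.get(id) ?? this.store.get(id);
        return found ? { ...found } : null;
      },
      save: async (account) => {
        this.staged.set(account.id, { ...account });
      },
    };
  }

  // Applies all staged changes to the store in one step.
  async commit(): Promise<void> {
    this.staged.forEach((account, id) => this.store.set(id, account));
    this.staged.clear();
  }

  // Discards staged changes; the store is left untouched.
  async rollback(): Promise<void> {
    this.staged.clear();
  }
}

class TransferService {
  constructor(private readonly newUnitOfWork: () => UnitOfWork) {}

  async transfer(fromId: string, toId: string, amount: number): Promise<void> {
    const uow = this.newUnitOfWork();
    try {
      const from = await uow.accounts.findById(fromId);
      if (!from || from.balance < amount) throw new Error('Insufficient funds');
      await uow.accounts.save({ ...from, balance: from.balance - amount });

      const to = await uow.accounts.findById(toId);
      if (!to) throw new Error('Unknown destination account');
      await uow.accounts.save({ ...to, balance: to.balance + amount });

      await uow.commit(); // the service, not the repository, decides when
    } catch (err) {
      await uow.rollback();
      throw err;
    }
  }
}
```

If the destination account is missing, the already staged debit is discarded by `rollback`, so the store never reflects a half-finished transfer.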
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Domain has complex business rules and aggregates | Simple CRUD with <5 entities and no domain logic | ORM directly (Active Record or Data Mapper) |
| You need to unit test domain logic without a database | Rapid prototyping or throwaway code | Direct database queries |
| Multiple data source backends are realistic (SQL, NoSQL, API) | Single database that will never change | ORM repository (e.g., Spring Data, Django ORM) |
| Team follows DDD or hexagonal architecture | Read-heavy analytics/reporting queries | Query objects or CQRS read side |
| Aggregate boundaries need strict enforcement | Microservice with a single entity and no joins | Thin data access functions |
| You need a caching decorator or audit logging layer | Framework already provides testable abstractions | Framework-provided test utilities |
Important Caveats
- The repository pattern adds an abstraction layer -- in simple apps this is unnecessary complexity. If your ORM already provides a clean, testable interface, adding repositories may be over-engineering. [src4]
- Generic repositories (IRepository<T>) almost always become a leaky abstraction. Prefer specific repository interfaces per aggregate root. [src4]
- In CQRS architectures, repositories are typically only used on the write/command side. The read/query side bypasses repositories entirely for performance. [src1]
- The pattern was popularized by Eric Evans' Domain-Driven Design (2003) and also appears in Martin Fowler's Patterns of Enterprise Application Architecture (2002). Its value is highest when your domain model diverges significantly from your database schema.
- When using ORMs like Entity Framework or SQLAlchemy, the ORM session/DbContext already implements a Unit of Work. Adding another UoW layer on top can create confusion about who owns the transaction.