Why I love the repository pattern

What quirks have I found in using it?

Firstly, you may have noticed that I spoke of the domain and command model specifically when talking about the pattern.

Repositories are not intended for adding more query or batch methods. If your repository starts to look like this:

// Avoid this!
interface AccountRepository {
    public function getById(string $id): Account;
    public function getByRegistrationDate(DateTimeInterface $date): array;
    public function save(Account ...$accounts): void;
    public function deleteRegisteredBeforeDate(DateTimeInterface $date): void;
}

then you’re heading for a bad place. You need to separate your command model from your query model (read: CQRS), and the repositories are designed to be used by the domain and command models. You’ll find that the repository interface remains pretty stable and doesn’t need much adjustment as time goes on, but the query model will change much more frequently because there are far more ways that clients/customers will want to query their data than save their data.

You don’t need to go so far as having separate databases for reads and writes, however. You just need to ensure that:

  1. Your query (and batch) methods sit on a different interface (or interfaces) from the repository
  2. Your query-specific methods do not use the domain entities, but have their own view models (which are purely data structures: they have no service methods and cannot modify application state)

So an example would be:

namespace Query;

// This Account class lives in the Query namespace and is just for a presentational view in the search results
// e.g. it has no withdraw() method to change its state
class Account
{
    // ...
}

interface SearchAccounts
{
    /**
     * @return Account[]
     */
    public function findByRegistrationDate(DateTimeInterface $date): array;
}

You can even let your concrete implementations implement the query interfaces if you want:

class PostgresAccountRepository implements AccountRepository, SearchAccounts
{
    // ...
}
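
To make the split a bit more concrete, here is a minimal TypeScript sketch of the same idea. The names AccountSummary, registeredOn and InMemoryAccounts are hypothetical, and an in-memory Map stands in for Postgres; the only point is that one class can satisfy both interfaces while each caller sees just the one it asks for.

```typescript
// Command side: the domain entity, which can change state.
class Account {
    constructor(
        public readonly id: string,
        public readonly registeredOn: Date,
        private balance: number = 0,
    ) {}

    public deposit(amount: number): void {
        this.balance += amount;
    }

    public currentBalance(): number {
        return this.balance;
    }
}

interface AccountRepository {
    getById(accountId: string): Promise<Account>;
    save(account: Account): Promise<void>;
}

// Query side: a read-only view model with no behaviour (hypothetical shape).
interface AccountSummary {
    readonly id: string;
    readonly registeredOn: string; // UTC calendar day, e.g. "2024-01-31"
}

interface SearchAccounts {
    findByRegistrationDate(date: Date): Promise<AccountSummary[]>;
}

// One concrete class implementing both interfaces (in-memory stand-in for a real database).
class InMemoryAccounts implements AccountRepository, SearchAccounts {
    private accounts: Map<string, Account> = new Map<string, Account>();

    public async getById(accountId: string): Promise<Account> {
        const account = this.accounts.get(accountId);
        if (account === undefined) {
            throw new Error("account not found");
        }
        return account;
    }

    public async save(account: Account): Promise<void> {
        this.accounts.set(account.id, account);
    }

    public async findByRegistrationDate(date: Date): Promise<AccountSummary[]> {
        // Compare on the UTC calendar day only.
        const day = date.toISOString().slice(0, 10);
        return Array.from(this.accounts.values())
            .filter((a) => a.registeredOn.toISOString().slice(0, 10) === day)
            .map((a) => ({ id: a.id, registeredOn: day }));
    }
}
```

A command handler would then be given an AccountRepository and a search endpoint a SearchAccounts, so neither can reach the other's methods even when both receive the same object.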

Secondly, languages that use promises for modelling asynchronous behaviour tend to force the interface to expose that asynchrony in the method signature. So if the language uses promises to facilitate asynchronous communication with the database, you can’t really create an interface that hides this implementation detail. For example, with TypeScript we would have:

interface AccountRepository {
    getById(accountId: string): Promise<Account>;
    save(account: Account): Promise<void>;
}

which isn’t all that bad, but is still a quirk to be aware of. The "test" implementations can use promises that resolve immediately after finding the account within a Map:

class InMemoryAccountRepository implements AccountRepository {
    private accounts: Map<string, Account> = new Map<string, Account>();

    public getById(accountId: string): Promise<Account> {
        const account = this.accounts.get(accountId);

        if (account !== undefined) {
            return Promise.resolve(account);
        } else {
            return Promise.reject(new Error("account not found"));
        }
    }

    public save(account: Account): Promise<void> {
        this.accounts.set(account.id, account);
        return Promise.resolve();
    }
}

Note: with async notation it is almost identical:

class InMemoryAccountRepository implements AccountRepository {
    private accounts: Map<string, Account> = new Map<string, Account>();

    public async getById(accountId: string): Promise<Account> {
        const account = this.accounts.get(accountId);

        if (account !== undefined) {
            return account;
        } else {
            throw new Error("account not found");
        }
    }

    public async save(account: Account): Promise<void> {
        this.accounts.set(account.id, account);
    }
}

Lastly, we have naming conventions. This is entirely personal, and mainly comes up as a disagreement on shared codebases over how an instance variable holding a repository should be named.

You’ve probably noticed above that in WithdrawFunds I named the instance variable as though it were just a collection. So I don’t have $accountsRepository, but just $accounts. I do this because it’s in the definition of the repository pattern that it gives the illusion of a collection, and suffixing it -Repo or -Repository kinda ruins that illusion. I like it when names of classes, interfaces, methods and variables communicate their original intent where possible.
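
As a rough TypeScript illustration of the naming point, here is how a handler might read with the collection-style name. This WithdrawFunds is a hypothetical reconstruction, with an assumed handle() method and assumed withdraw() and currentBalance() methods on Account, not the earlier listing verbatim.

```typescript
class Account {
    constructor(public readonly id: string, private balance: number) {}

    public withdraw(amount: number): void {
        if (amount > this.balance) {
            throw new Error("insufficient funds");
        }
        this.balance -= amount;
    }

    public currentBalance(): number {
        return this.balance;
    }
}

interface AccountRepository {
    getById(accountId: string): Promise<Account>;
    save(account: Account): Promise<void>;
}

class WithdrawFunds {
    // Named `accounts`, not `accountRepository`: it reads like a collection.
    constructor(private readonly accounts: AccountRepository) {}

    public async handle(accountId: string, amount: number): Promise<void> {
        const account = await this.accounts.getById(accountId);
        account.withdraw(amount);
        await this.accounts.save(account);
    }
}
```

Read aloud, the body of handle() is "get the account from the accounts, withdraw, save it back to the accounts", which is the illusion the pattern is going for.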

Understandably, you may be thinking: but what if you’re dealing with an explicitly in-memory collection? If I have a class like AccountCollection, which is meant to deal with operations on in-memory accounts, won’t that cause confusion with AccountRepository? Especially when I have a method that uses both the collection and the repository!

I would say in response:

  • Most of the time, the need for a -Collection class is born from the language itself not supporting generics. If the language supports them, a generic Collection class gives you the same functionality by instantiating a Collection<Account>. It is unfortunate for the languages that don’t yet support generics (e.g. PHP), but it is what it is.
  • Assuming that you have modelled your entities using well-designed aggregates, you are very unlikely to use both a -Repository and a -Collection for the same entity type in the same method. You should not need to search for (and persist) several entities of the same kind within the same action (the obvious exception being batch update operations, and that is a meatier topic for another day!). The -Collection will most likely be used in the query model instead, since that is where you search and filter for multiple entities of the same kind.
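
To illustrate the first point, a generic collection in a language with generics might look like the TypeScript sketch below. Collection, its methods and the AccountView shape are hypothetical names chosen for the example, and it sits on the query side, in line with the second point.

```typescript
// A small immutable, generic in-memory collection (hypothetical helper).
class Collection<T> {
    private readonly items: readonly T[];

    constructor(items: readonly T[] = []) {
        // Copy so later changes to the caller's array do not leak in.
        this.items = [...items];
    }

    public filter(predicate: (item: T) => boolean): Collection<T> {
        return new Collection(this.items.filter(predicate));
    }

    public map<U>(transform: (item: T) => U): Collection<U> {
        return new Collection(this.items.map(transform));
    }

    public count(): number {
        return this.items.length;
    }

    public toArray(): T[] {
        return [...this.items];
    }
}

// Query-side view model: plain data, no behaviour.
interface AccountView {
    readonly id: string;
    readonly registeredYear: number;
}

// No dedicated AccountCollection class is needed.
const results = new Collection<AccountView>([
    { id: "a1", registeredYear: 2023 },
    { id: "a2", registeredYear: 2024 },
    { id: "a3", registeredYear: 2024 },
]);

const registeredIn2024 = results.filter((a) => a.registeredYear === 2024);
```

Because the element type is a parameter, the same class serves any view model, which is exactly what a hand-written AccountCollection has to duplicate per entity in a language without generics.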

Hopefully you’ve enjoyed this look into the repository pattern, and I hope it helps you with your own projects!