A Tale of Three Repositories

Martin Fowler describes a repository as an object that…

In other words, it’s a higher-level abstraction that gives its clients an object into which entities can be put and from which entities can be retrieved. We’re using entity in the domain-driven design sense here: entities are objects that have an identity beyond the values of their attributes.

We’re going to look at three different repositories here, all implementing the same interface and all using different storage mechanisms based on relational databases. We’ll talk about why we focused on RDBMSs at the end. We’ll also discuss the pros and cons of each repository. The goal here is not to make judgements; it’s to provide examples and discussion.

All the code for this tutorial is on GitHub, along with some tests that show you how to set up and use the repositories.

Our repository is going to deal with an Article, something with a title, body, publish year, and an identifier. We’ll define our article with a header interface.

Article.php

<?php

interface Article
{
    public function getIdentifier();
    public function getTitle();
    public function setTitle($title);
    public function getBody();
    public function setBody($body);
    public function getYear();
    public function setYear($year);
}
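
The post never shows a concrete Article, although the PDO example later on builds one from a row with new SimpleArticle($row['id']). As a minimal sketch, assuming only that constructor shape and nothing else about the real class, an implementation might look something like this. The interface is repeated so the snippet runs on its own.

```php
<?php

// Repeated from Article.php above so this sketch runs on its own.
interface Article
{
    public function getIdentifier();
    public function getTitle();
    public function setTitle($title);
    public function getBody();
    public function setBody($body);
    public function getYear();
    public function setYear($year);
}

// Assumed implementation, not the class from the GitHub repository.
// The identifier is fixed at construction and has no setter;
// every other attribute is free to change.
final class SimpleArticle implements Article
{
    private $id;
    private $title;
    private $body;
    private $year;

    public function __construct($id = null)
    {
        $this->id = $id;
    }

    public function getIdentifier() { return $this->id; }
    public function getTitle() { return $this->title; }
    public function setTitle($title) { $this->title = $title; }
    public function getBody() { return $this->body; }
    public function setBody($body) { $this->body = $body; }
    public function getYear() { return $this->year; }
    public function setYear($year) { $this->year = $year; }
}

$first = new SimpleArticle(1);
$first->setTitle('Hello');

$second = new SimpleArticle(1);
$second->setTitle('Changed');

// Different attribute values, same identity: in the entity sense
// these two objects describe the same article.
var_dump($first->getIdentifier() === $second->getIdentifier()); // bool(true)
var_dump($first->getTitle() === $second->getTitle());           // bool(false)
```

Leaving out a setter for the identifier is a design choice in this sketch: the title, body, and year can all change without the article becoming a different article.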

And here’s the repository interface.

ArticleRepository.php

<?php

interface ArticleRepository
{
    public function find($id);

    public function findAll();

    public function findByYear($year);

    public function add(Article $article);

    public function remove($article);
}

There are ways to fetch all articles, single articles, and articles for a given year. There are also methods to add and remove articles from the repository.

Despite its features, there’s no hint given about how the repository actually stores things or whether it does at all. That’s the fun part.
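
To make the “whether it does at all” point concrete, here is a minimal sketch of a repository that keeps everything in a PHP array and never touches a database. ArrayArticle, its assignIdentifier() method, and InMemoryArticleRepository are made up for illustration and are not part of the code on GitHub; the two interfaces are repeated so the snippet runs on its own.

```php
<?php

// Both interfaces are repeated from the post so the sketch runs on its own.
interface Article
{
    public function getIdentifier();
    public function getTitle();
    public function setTitle($title);
    public function getBody();
    public function setBody($body);
    public function getYear();
    public function setYear($year);
}

interface ArticleRepository
{
    public function find($id);
    public function findAll();
    public function findByYear($year);
    public function add(Article $article);
    public function remove($article);
}

// Hypothetical article; assignIdentifier() is not part of the Article interface.
final class ArrayArticle implements Article
{
    private $id;
    private $title;
    private $body;
    private $year;

    public function assignIdentifier($id) { $this->id = $id; }
    public function getIdentifier() { return $this->id; }
    public function getTitle() { return $this->title; }
    public function setTitle($title) { $this->title = $title; }
    public function getBody() { return $this->body; }
    public function setBody($body) { $this->body = $body; }
    public function getYear() { return $this->year; }
    public function setYear($year) { $this->year = $year; }
}

// Hypothetical repository: same contract, no database anywhere.
final class InMemoryArticleRepository implements ArticleRepository
{
    private $articles = [];
    private $nextId = 1;

    public function find($id)
    {
        return isset($this->articles[$id]) ? $this->articles[$id] : null;
    }

    public function findAll()
    {
        return array_values($this->articles);
    }

    public function findByYear($year)
    {
        return array_values(array_filter(
            $this->articles,
            function (Article $article) use ($year) {
                return (int) $article->getYear() === (int) $year;
            }
        ));
    }

    public function add(Article $article)
    {
        if (!$article instanceof ArrayArticle) {
            throw new \InvalidArgumentException('expected an ArrayArticle');
        }
        if (null === $article->getIdentifier()) {
            $article->assignIdentifier($this->nextId++);
        }
        $this->articles[$article->getIdentifier()] = $article;

        return $article->getIdentifier();
    }

    public function remove($article)
    {
        unset($this->articles[$article->getIdentifier()]);
    }
}

$repo = new InMemoryArticleRepository();
$article = new ArrayArticle();
$article->setTitle('Hello');
$article->setYear(2015);
$id = $repo->add($article);

var_dump($repo->find($id) === $article);  // bool(true)
var_dump(count($repo->findByYear(2014))); // int(0)
```

A client holding an ArticleRepository can’t tell this apart from a database-backed one, which is the whole point of the interface.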

Our first example is a repository backed by PDO directly. This means the repository implementation will have some SQL embedded in it.

Is that bad? In a larger system, probably yes. You’d likely want more of a data mapping layer in between your repository and the database, but for simpler things it’s not such a big deal. This repository blurs the lines a bit between a data mapper and a repository. The full code is on GitHub; I’ll only show part of the object here.

PdoArticleRepository.php

<?php
final class PdoArticleRepository implements ArticleRepository
{
    const TABLE = 'articles';

    /**
     * @var PDO
     */
    private $conn;

    public function __construct(\PDO $conn)
    {
        $this->conn = $conn;
    }

    public function findAll()
    {
        $stm = $this->conn->query(
            $this->getSelect().' ORDER BY publish_year DESC, title ASC'
        );
        $out = $this->statementToObjects($stm);
        $stm->closeCursor();

        return $out;
    }

    public function add(Article $article)
    {
        $id = $article->getIdentifier();
        $params = [
            ':title'    => $article->getTitle(),
            ':body'     => $article->getBody(),
            ':year'     => $article->getYear(),
        ];
        $bind = [
            ':year'     => \PDO::PARAM_INT,
        ];

        if ($id) {
            $sql = 'UPDATE '.self::TABLE.' SET title = :title, body = :body, publish_year = :year WHERE id = :id';
            $params[':id'] = $id;
            $bind[':id'] = \PDO::PARAM_INT;
        } else {
            $sql = 'INSERT INTO '.self::TABLE.' (title, body, publish_year) VALUES (:title, :body, :year)';
        }

        $stm = $this->conn->prepare($sql);
        foreach ($params as $name => $val) {
            $stm->bindValue(
                $name,
                $val,
                isset($bind[$name]) ? $bind[$name] : \PDO::PARAM_STR
            );
        }
        $stm->execute();

        return $id ? $id : intval($this->conn->lastInsertId());
    }

    // may make more sense to use a factory object here, but
    // let's keep this somewhat simple.
    private function toObject(array $row)
    {
        $article = new SimpleArticle($row['id']);
        $article->setTitle($row['title']);
        $article->setBody($row['body']);
        $article->setYear($row['publish_year']);

        return $article;
    }

    private function statementToObjects(\PDOStatement $stm)
    {
        // could very easily use an iterator or a generator here
        // with big data sets, again, let's keep it simple.
        $stm->setFetchMode(\PDO::FETCH_ASSOC);
        return array_map([$this, 'toObject'], $stm->fetchAll());
    }

    private function getSelect()
    {
        return 'SELECT id, title, body, publish_year FROM '.self::TABLE;
    }
}

As you can see, the code is pretty far from concise. SQL-based repositories like this get a lot prettier if they use a slightly higher-level database abstraction layer like Doctrine DBAL.

There’s also a hidden coupling between the database platform in use (I tested this with SQLite) and the repository. While the SQL may work on multiple database platforms, there’s a good chance it won’t depending on the complexity. A better name for this would be PdoSqliteArticleRepository.

On the flip side, this abstraction is relatively easy to understand. PDO isn’t complicated, just verbose. Simple SQL-backed repositories also give you a lot of flexibility to query other tables, do joins, or whatever you like. Patterns like this also scale up really well as the dataset gets bigger, especially using things like generators or custom iterators for collections.
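
As a rough sketch of the generator idea, the helper below yields one row at a time instead of building the whole array with fetchAll(). The iterateRows() function is hypothetical, the table schema is only inferred from the SQL in the repository above, and the snippet assumes the pdo_sqlite extension is available.

```php
<?php

// Hypothetical helper: streams rows from a statement one at a time,
// so only the current row is held in PHP memory.
function iterateRows(\PDOStatement $stm)
{
    $stm->setFetchMode(\PDO::FETCH_ASSOC);
    try {
        while (false !== ($row = $stm->fetch())) {
            yield $row;
        }
    } finally {
        // Runs when the loop finishes or the generator is destroyed early.
        $stm->closeCursor();
    }
}

// In-memory SQLite database; the schema is an assumption based on the
// column names used in PdoArticleRepository.
$conn = new \PDO('sqlite::memory:');
$conn->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);
$conn->exec(
    'CREATE TABLE articles ('
    .'id INTEGER PRIMARY KEY, title TEXT, body TEXT, publish_year INTEGER)'
);
$conn->exec("INSERT INTO articles (title, body, publish_year) VALUES ('Hello', 'World', 2015)");
$conn->exec("INSERT INTO articles (title, body, publish_year) VALUES ('Older', 'Post', 2014)");

$stm = $conn->query(
    'SELECT id, title, body, publish_year FROM articles'
    .' ORDER BY publish_year DESC, title ASC'
);

foreach (iterateRows($stm) as $row) {
    echo $row['publish_year'], ' ', $row['title'], PHP_EOL;
}
// prints:
// 2015 Hello
// 2014 Older
```

In the repository itself, statementToObjects() could yield $this->toObject($row) in the same way. The trade-off is that callers get a one-pass iterator rather than an array, so things like count() no longer work directly on the result.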

Active Record is a design pattern in which a single object is tied to a database table. This includes all the storage mechanisms for that object.

The beautiful part about active record is that it’s simple. Need to save an object? Just call $obj->save(). Delete is just $obj->delete(). Querying is just calling a static method. Unfortunately that gets kind of hard to test. You’re stuck mocking real objects, rather than domain interfaces (unless you want your domain interfaces to include save etc.). Or you’re stuck doing integration testing only on any part of your application that uses the active record objects.

That said, active record’s testing limitations can be mostly mitigated using a repository. We’ll use Laravel’s Eloquent for this example.

I didn’t show the Article implementation in the last example, but with Eloquent it matters: it has to extend Illuminate\Database\Eloquent\Model.

EloquentArticle.php

<?php
use Illuminate\Database\Eloquent\Model;

class EloquentArticle extends Model implements Article
{
    // configures some stuff for Eloquent
    public $timestamps = false;
    protected $table = 'articles';

    // The getters and setters required by Article are left out of this excerpt.
}

And here’s the repository. Again, the full code is on GitHub.

EloquentArticleRepository.php

<?php
final class EloquentArticleRepository implements ArticleRepository
{
    public function findAll()
    {
        return EloquentArticle::query()
            ->orderBy('publish_year', 'DESC')
            ->orderBy('title', 'ASC')
            ->get();
    }

    public function add(Article $article)
    {
        if (!$article instanceof EloquentArticle) {
            throw new \InvalidArgumentException(sprintf(
                '%s expects an instance of %s, got "%s"',
                __METHOD__,
                EloquentArticle::class,
                get_class($article)
            ));
        }

        $article->save();

        return $article->id;
    }
}

A ton more concise than the PDO version of things. It also does a good job of hiding away our static method calls, so clients of the repository can mock the interface in tests and not rely on global state. There’s also no coupling to a specific database platform.
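
Here is a minimal sketch of what that buys a client in a unit test. YearArchive, StubArticle, and StubArticleRepository are hypothetical names made up for this example; the two interfaces are repeated so the snippet runs on its own. Nothing in it touches Eloquent, static methods, or a database.

```php
<?php

// Both interfaces are repeated from the post so the sketch runs on its own.
interface Article
{
    public function getIdentifier();
    public function getTitle();
    public function setTitle($title);
    public function getBody();
    public function setBody($body);
    public function getYear();
    public function setYear($year);
}

interface ArticleRepository
{
    public function find($id);
    public function findAll();
    public function findByYear($year);
    public function add(Article $article);
    public function remove($article);
}

// Hypothetical client: it only knows about the ArticleRepository interface.
final class YearArchive
{
    private $articles;

    public function __construct(ArticleRepository $articles)
    {
        $this->articles = $articles;
    }

    public function titlesFor($year)
    {
        $titles = [];
        foreach ($this->articles->findByYear($year) as $article) {
            $titles[] = $article->getTitle();
        }

        return $titles;
    }
}

// Hand-rolled test doubles; a mocking library would work just as well.
final class StubArticle implements Article
{
    private $title;

    public function __construct($title) { $this->title = $title; }
    public function getIdentifier() { return null; }
    public function getTitle() { return $this->title; }
    public function setTitle($title) { $this->title = $title; }
    public function getBody() { return null; }
    public function setBody($body) {}
    public function getYear() { return null; }
    public function setYear($year) {}
}

final class StubArticleRepository implements ArticleRepository
{
    private $byYear;

    public function __construct(array $byYear) { $this->byYear = $byYear; }
    public function find($id) { return null; }
    public function findAll() { return []; }
    public function findByYear($year)
    {
        return isset($this->byYear[$year]) ? $this->byYear[$year] : [];
    }
    public function add(Article $article) { return null; }
    public function remove($article) {}
}

$archive = new YearArchive(new StubArticleRepository([
    2015 => [new StubArticle('Hello'), new StubArticle('World')],
]));

echo implode(', ', $archive->titlesFor(2015)), PHP_EOL; // Hello, World
echo count($archive->titlesFor(2014)), PHP_EOL;         // 0
```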

The goal of this thing is to hide some of the “bad” design of active record and make testing a bit easier. With most applications that’s probably not going to be a huge concern and a repository like this might be overkill. Like everything else in software, it depends.

The repository itself is not very testable; it’s something that likely needs integration tests only. That’s 100% okay: I would only do integration tests with the PDO version as well.

Unlike Eloquent, Doctrine ORM uses the repository pattern internally. Creating a custom repository is a matter of some configuration and extending Doctrine\ORM\EntityRepository. We don’t need to do anything special with our Article entity object, just a plain old PHP object is fine.

Doctrine’s repository objects are read-only by default, and you use the entity manager itself to persist and delete things. Our custom implementation breaks that rule a bit.

Here’s the implementation. The full code is on GitHub.

DoctrineArticleRepository.php

<?php
use Doctrine\ORM\EntityRepository;

final class DoctrineArticleRepository extends EntityRepository implements ArticleRepository
{
    // EntityRepository provides this method
    // but we override it to get ordering
    public function findAll()
    {
        return $this->findBy([], [
            'year'  => 'DESC',
            'title' => 'ASC',
        ]);
    }

    public function add(Article $article)
    {
        $em = $this->getEntityManager();
        $em->persist($article);
        $em->flush(); // probably not a good idea in a larger app

        return $article->getIdentifier();
    }
}

Doctrine entities need to be configured with XML, YAML, or Docblock Annotations. I used XML.

PMG.ThreeRepositories.SimpleArticle.dcm.xml

<doctrine-mapping xmlns="http://doctrine-project.org/schemas/orm/doctrine-mapping"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://doctrine-project.org/schemas/orm/doctrine-mapping
                    http://raw.github.com/doctrine/doctrine2/master/doctrine-mapping.xsd">

    <entity name="PMG\ThreeRepositories\SimpleArticle" table="articles" repository-class="PMG\ThreeRepositories\DoctrineArticleRepository">
        <id name="id" type="integer">
            <generator strategy="AUTO" />
        </id>
        <field name="title" type="text" />
        <field name="body" type="text" />
        <field name="year" column="publish_year" type="integer" />
    </entity>

</doctrine-mapping>

Again, this one is pretty concise and should work with multiple database platforms. Doctrine itself encourages the use of repositories already, so we’re not adding too much to what Doctrine does out of the box. The important thing here is that we’re implementing an interface we own, giving us the freedom to swap out implementations without depending on third-party code.

Like the other two repositories, this one isn’t very testable. We’d probably want to do integration tests only.

The most important thing about any repository implementation is that it hides the details of how things get done behind an interface. Clients of the repository don’t and shouldn’t care if it’s backed by Eloquent, Doctrine, PDO, or some NoSQL backend. That’s why we focused on RDBMSs here: because it doesn’t matter.

The best example I can give of this is the test case I used for these three repositories.

TestCase.php

<?php

abstract class TestCase extends \PHPUnit_Framework_TestCase
{
    protected $repo;

    public function testArticlesCanBePersistedUpdatedFetchedAndRemoved()
    {
        $this->assertEmpty($this->repo->findAll());
        $this->assertEmpty($this->repo->findByYear(2015));

        $article = $this->createArticle();
        $article->setTitle('Hello');
        $article->setBody('World');
        $article->setYear(2015);

        $id = $this->repo->add($article);

        $this->assertCount(1, $this->repo->findAll());
        $this->assertEmpty($this->repo->findByYear(2014));
        $this->assertCount(1, $this->repo->findByYear(2015));

        $article = $this->repo->find($id);
        $this->assertInstanceOf(Article::class, $article);

        $article->setTitle('changed');
        $this->repo->add($article);

        $article = $this->repo->find($id);
        $this->assertInstanceOf(Article::class, $article);
        $this->assertEquals('changed', $article->getTitle());

        $this->repo->remove($article);
        $this->assertNull($this->repo->find($id));
    }

    abstract protected function createArticle();
}

It’s the same integration test for each. Just the setup varies. The test doesn’t care about the Article or ArticleRepository implementation, just that it can do the things the ArticleRepository contract says it should. That’s the power of the abstraction and why you should think about using a repository to hide the details of your storage system even with an ORM already in place.


Christopher Davis
