Decoupling models from the database: Data Access Object pattern in PHP

January 5, 2009 – 1:07 pm Tags: , , , ,

Nowadays it’s a quite common approach to have models that essentially just represent database tables, and may support saving the model instance right to the database. While the ActiveRecord pattern and things like Doctrine are useful, you may sometimes want to decouple the actual storage mechanism from the rest of the code.

This is where the Data Access Object pattern comes in. It’s a pattern which abstracts the details of the storage mechanism – be it a relational database, OO database, an XML file or whatever. The advantage of this is that you can easily implement different methods to persist objects without having to rewrite parts of your code.

I’m again going to use the programming language quiz game I wrote as an example. Since I initially wrote it to use Doctrine ORM directly, and both the old and new code are available, you can easily see how the code was improved.

A look at the pattern


The basic idea is that you have a Data Access Object (DAO), which abstracts the storage mechanism. It should implement an interface, so if you want to add another way of accessing data, you simply write a new set of DAO’s that implement the interface.

While you could use the pattern alone, it’s probably best to use it together with the factory pattern. That way, we have a single class which handles the creation of the DAO’s – the DAO factory. We could even have multiple factories to handle creation of different kinds of DAO’s, and use the abstract factory pattern to hide the implementation details.

The image on the right displays how the abstract factory version of this could work. Basically, you have an interface HelloDao, which is then implemented by your concrete DAO objects. You also have an interface for the factory, and the concrete factory/factories implement it.

The example later in this post, however, uses just a single factory, as we only implement a single set of DAO’s.

The models

The models in the quiz are relatively simple. We have Question, Answer and QuestionAnswer. At first, these were just Doctrine’s models. However, for implementing the DAO stuff, we’ll create some new model classes of our own.

Here’s the Quiz_Question model class:

<?php
class Quiz_Question
{
	private $_id;
	private $_question;
 
	public function getId()
	{
		return $this->_id;
	}
 
	public function setId($id)
	{
		if(!is_numeric($id))
			throw new InvalidArgumentException('ID can not be non-numeric');
 
		$this->_id = $id;
	}
 
	public function getQuestion()
	{
		return $this->_question;
	}
 
	public function setQuestion($question)
	{
		if(empty($question))
			throw new InvalidArgumentException('Question can not be empty');
 
		$this->_question = $question;
	}
}

Quite simple. It just stores an ID and the question’s text, and provides getters and setters. The setters contain a bit of validation logic that throws an exception if the data is bad. There is absolutely no persistence-related logic here.

The other model classes follow this same approach.

Creating a DAO for the question model

Here’s an interface for the DAO that will be used to deal with Questions:

<?php
interface App_Dao_Question_Interface
{
	public function insertQuestion(Quiz_Question $question);
	public function findQuestion($id);
	public function fetchRandomQuestionNotIn($questionIdList);
	public function findQuestionByQuestion($question);
}

You can probably see the basic idea there: The interface provides methods for accessing the data, as the name Data Access Object says.

Since I still want to use Doctrine, I’m going to create a DAO which uses Doctrine’s methods to implement the above interface:

<?php
class App_Dao_Doctrine_Question implements App_Dao_Question_Interface
{
	/**
	 * Insert a question into the database
	 * @param Quiz_Question $question
	 */
	public function insertQuestion(Quiz_Question $question)
	{
		//Create a Doctrine model for storing the Quiz_Question model
		$q = new Question();
		$q->question = $question->getQuestion();
 
		try
		{
			$q->save();
			$question->setId($q->id);
		}
		catch(Doctrine_Exception $e)
		{
			throw new App_Dao_Exception('Error inserting question', $e);
		}
	}
 
	/**
	 * Find a question by ID
	 * @param int $id
	 * @return Quiz_Question
	 */
	public function findQuestion($id)
	{
		try
		{
			$data = Doctrine::getTable('Question')->find($id);
		}
		catch(Doctrine_Exception $e) 
		{
			throw new App_Dao_Exception('Error querying the database', $e);
		}
 
		if(!$data)
			return null;
 
		$question = new Quiz_Question();
		$question->setId($data->id);
		$question->setQuestion($data->question);
 
		return $question;
	}
 
	/* Rest omitted to save space */
}

Here are two methods from the interface implemented in the DAO. You can see how the methods basically hide away the fact that it’s Doctrine behind the scenes. Particularily, note the try-catch blocks – they are used to catch any implementation-specific exceptions from Doctrine, which get wrapped into App_Dao_Exceptions. This way you can easily catch exceptions from the DAO’s, without knowing what it’s doing behind the scenes.

Creating the factory

Now we have some models, and DAOs that interact with the storage (the database in this case). Next, we need the DaoFactory.

Strictly speaking, this class isn’t absolutely necessary. We could create new DAO’s in our code without the factory, but it might not be a very good idea. If some DAO’s need more initialization, for example passing in a database connection or something, it would be problematic.

Also, if we wanted to change the DAOs, we would need to go around our code and modify it to create other kinds of instances. With the factory, we could simply modify (or even replace) the factory and all code will follow.

class App_DaoFactory
{
	private static $_instance;
 
	public function __construct()
	{
	}
 
	/**
	 * Set the factory instance
	 * @param App_DaoFactory $f
	 */
	public static function setFactory(App_DaoFactory $f)
	{
		self::$_instance = $f;
	}
 
	/**
	 * Get a factory instance. 
	 * @return App_DaoFactory
	 */
	public static function getFactory()
	{
		if(!self::$_instance)
			self::$_instance = new self;
 
		return self::$_instance;
	}
 
	/**
	 * Get a Question DAO
	 * @return App_Dao_Question_Interface
	 */
	public function getQuestionDao()
	{
		return new App_Dao_Doctrine_Question();
	}
}

The factory is set up a bit similar to the singleton: it has a static method for getting an instance of the factory. This is so we can easily obtain a factory instance in our code, but we also provide a setter method, so if needed, we can replace the factory with something else – for example a mock factory for testing code, as can be seen in the new unit tests for App_QuizHandler (scroll down in the code).

This could be easily changed into an abstract factory too. The getFactory method could return PdoDaoFactories, DoctrineDaoFactories or something else – perhaps based on a parameter that’s passed in, or a configuration value or something. But since the abstract factory isn’t necessary (we could just change the get*Dao methods), I have left it out.

DAO in action

With all the classes done, we can now move to actually using them in code. Here’s a very simple example:

$questionDao = App_DaoFactory::getFactory()->getQuestionDao();
 
$question = $questionDao->findQuestion(10);

Relatively simple, no?

Let’s check out a few examples from the quiz on how the code was changed. For example, if we look at the App_Questioner class, methods getRandomQuestion or getBestAnswer, we can see that there’s a bunch of Doctrine-related code there. Looking at the refactored App_Questioner, we see it’s mostly replaced by calls to the DAO’s. Also, if you check out the DAO’s code, you can see the code was basically just moved to the DAO method with minor changes.

Summing it up

While going straight Doctrine models may be quick, DAO’s provide flexibility where it’s needed, and you could easily write some tools to generate the DAO classes for you. Of course, it’s a layer of complexity, but it’s good to know different ways of solving problems.

What comes to the model class shown, and the fact that there is no database related logic… It could contain some additional methods, for example getAnswers, which could return relevant answers. If we provided a such method, it should use the DAO for the answers to fetch them.

The approach with DAO’s is a bit similar to what Matthew Weier O’Phinney shows in his model infrastructure blogpost. I recommend checking it out, as it also contains some more in-depth things about models themselves.

The Core J2EE patterns: Data Access Object is also a good read on the topic.

The source code for the quiz can be found in SVN, as follows:
Version 1.0 of the quiz
Version 1.1 of the quiz (with DAO’s and some other refactoring)

Further reading:
Another idea for using models with forms
The problems faced by a common model interface in frameworks
Food for thought: utilizing models in MVC

Share this:
  1. 34 Responses to “Decoupling models from the database: Data Access Object pattern in PHP”

  2. Very interesting stuff. I’ve been experimenting with something similar, specifically to try and move more functionality out of the Controllers in my applications and into more of a Model.

    One thing I’m quite interested in is subsequently mocking the DAOs for unit testing, i.e. where your controller may do App_DaoFactory::getFactory(), my controllers would have a factory instance injected into them.

    Have you done anything like that?

    By Ciaran McNulty on Jan 5, 2009

  3. Injecting the factory is one possible future improvement in the quiz code.

    For now, the unit tests just mock the factory, and use setFactory. See QuizHandlerTest.php

    What comes to dependency injection, you might want to check out Sphicy by Benjamin Eberlei. It seems like a good approach to dealing with DI.

    By Jani Hartikainen on Jan 5, 2009

  4. Since I switch from php 4 to php 5, I keep wondering why one wants to push a dynamic language to behave as a static one?

    Is it for performance? no, cause as more classes, interfaces, … more includes == slower.

    So why?

    By PHP Programmer on Jan 5, 2009

  5. I know you’re not touching on MVC in this post, but one option I’ve played with recently in ZF is injecting the DAO factory (which I tend to call a MapperFactory) into the Controllers using a Controller Plugin.

    I’ll try and solidify it into some coherent thoughts / concrete code / a blog post at some point.

    By Ciaran McNulty on Jan 5, 2009

  6. PHP Programmer,

    Because it’s possible? ;)

    You could probably throw away the interfaces and just use duck typing, but when the objects need to have a certain set of methods for others to be able to use them, I think an interface is the answer.

    The code is easier to write/maintain/test too.

    The performance hit isn’t that high if you use opcode caches. You can also combine multiple files into one to reduce the overhead of disk IO

    By Jani Hartikainen on Jan 5, 2009

  7. Hi Jani,

    nice post! I think DAO is the most natural way of using OOP.

    Looking at your Quiz_Question class I noticed a lot of redundant code there. You have to implement an accessor method for every field in the class and the validation logic will be duplicated across objects either. I think it will be interesting to use some magic methods in the concrete objects or even AOP to eliminate this duplication.

    Yours,
    sas171

    By sas171 on Jan 5, 2009

  8. Good point. Though depending on the models, your end result in code-duplication department may vary. I think mainly the IDs are what could be validated with the same logic. More complex models might require more model-specific validation rules.

    Anyways, the getters and setters were generated by NetBeans, so I didn’t really write any of it myself (except the validation) ;)

    By Jani Hartikainen on Jan 5, 2009

  9. Hi

    Can you tell me why DAO?
    It boosts decoupling between models, which is terrible approach.

    Look at any UML diagrams. In most cases there is no Peer or DAO – just domain objects.
    So BlogModel will provide methods to retrieve (and populate if necessary) ArticleModels. ArticleModel can initialize (populate) itself or obtain CommentModels and so on.
    What else? You can create sophisticated SearchModel or MenuModel which are totally decoupled from persistence mechanism and schema. Models should reflect application logic.
    IMO DAO is bad idea because it violates internal flow.

    Cheers, Alan

    By Alan on Jan 5, 2009

  10. It depends on the approach taken. You could create an all-powerful domain object, which hides the DAOs – If you take a look at the J2EE pattern page, the diagram there shows a business object using the DAO. In essence, the QuizHandler and Questioner classes in the Quiz are business objects, and they contain most of the DAO logic.

    The only “bad” things about the DAO pattern (in my opinion) is that it adds some complexity, and requires you to write (or generate, if you have a tool) some extra code.

    By Jani Hartikainen on Jan 5, 2009

  11. You make your models anemic. Consider this as smasher :D

    By Alan on Jan 5, 2009

  12. @Ciaran McNulty

    I worked with Alan Brown in 2005 :)The same year you won a book at the PHP conference.

    Sorry Jani for hijacking your post.

    By Federico on Jan 5, 2009

  13. Hi, I like what you are suggesting here. But I would suggest to refactor the App_DaoFactory to several factories cause now it is coupled to all other dao implementations. This will make it more more reusable by other applications.

    Instead I would create QuestionDaoFactory, MyOtherDaoFactory and so on. And simply use it like QuestionDaoFactory::getQuestionDao(). Also it would be nice to add a setter to the factory for the QuestionDao to make it easy to have it replace in unit test and so.

    Thanks for a good post.

    By Johan Nilsson on Jan 8, 2009

  14. This is an old post, but I noticed that you have a couple dependencies in

    class App_Dao_Doctrine_Question implements App_Dao_Question_Interface

    Would you still do this the same way?

    By Stephane on Jul 17, 2012

  15. @Stephane It’s not something that’s absolutely necessary, but I think it’s helpful in that it allows you to use parameter typehints or other things like that and tells you the generic interface the class will have.

    By Jani Hartikainen on Jul 18, 2012

  16. Hi,

    I seen the you use Quiz_Question as a data transfer object, my question is why didn’t you use doctrine model (eg. $question=new Question() as DTO);

    For example I if i have two tables: users and orders (1 to many ) , doctrine entity User would contain all his orders(lazy load).

    In this case what should return $UserDatao->getUserByPk(1), a custom DTO like (Quiz_question) or doctrine user object?

    I you say custom DTO, in this case i need to map doctrine object to my custom DTO?

    thanks

    By Catalin on Jul 14, 2013

  17. @Catalin

    When using Doctrine, you probably won’t need to use this complex pattern in addition to it in most scenarios. I certainly wouldn’t :)

    By Jani Hartikainen on Jul 14, 2013

  18. @Jani, I don’t really understand your answer. You are saying that if i use doctrin orm, dao is pointless? I m not totally agree with that. For example if first time I use doctrine and after that i want to switch to pdo, with DAO design pattern this is very easy, cause my business logic interacts with doctrine or pdo through DAO layer.

    I don’t know exactly what type of object should doctrine dao return, orm object aka entity (more powerfull), or map orm ombject to a simple domain object or dto.

    By Catalin on Jul 14, 2013

  19. Well if you really want to do it, then you probably should have the DAO hide away the entitymanager, DQL and repository actions.

    What you could return from it is doctrine’s entities directly, since they are pretty much just standard PHP objects. In this case, if you wanted to migrate to plain PDO in the future, you would simply need to write some code to fill an entity from your PDO-fetched dataset.

    By Jani Hartikainen on Jul 14, 2013

  20. I agree that dao should return doctrine entities cause they are similar to simple php object, would be ponitless to create a dto in which i copy doctrine entity.

    But I would like to know how you would handle the following problem.

    For example if I have 3 tables Users -> Orders -> OrdersProducts. (1 to many , 1 to many )

    If I use userDao to get user entity, theb i can do:
    $userOrdersEntity=$userEntity->getOrders ():
    $ordersProductsEntity= $userOrsersEntity-> getOrdersProducta ();
    I can do this thanks to doctrine lazy load functions.

    So if i switch to pdo then my business logic will be affected due to lack of join objects.

    With pdo my first idea is to manually populate join objects, but this would be a pain when having many joins.

    What you suggest ?

    By catalin on Jul 14, 2013

  21. Well in order to get the object to use the relation in the first place, you would need to have an ID. So, either pass the object (containing the ID) or simply the ID itself to the function as a parameter. You should be able to use that to fetch any related records.

    By Jani Hartikainen on Jul 15, 2013

  22. I have been using the idea of one class per database table for over a decade, and I have often been told that “this is not the way it’s done by proper OO programmers”. Your opening comment says that it is now quite common, so it’s good to see that this approach is not so outrageous after all.

    Like you I also perform all communication with the underlying database in a separate Data Access Object, but with some differences:

    - You seem to advocate a separate DAO for each database table whereas I have a single DAO for each DBMS (MySQL, PostgreSQL, Oracle, SQL Server).

    - You have a separate finder method for each way of identifying what data is to be retrieved whereas I have a single getData($where) method where the single argument is a string which can be used as the WHERE clause in an sql SELECT statement. I find this more flexible as I can specify whatever selection criteria I like without having to create a specific method.

    This one-class-per-database-table approach allows me to generate my model classes from the database schema. Each of these classes is quite small as a huge amount of generic code is inherited from an abstract table class.

    I discuss this topic in more detail in http://www.tonymarston.net/php-mysql/table-oriented-programming.html

    I have built an open source framework which implements these concepts – see http://www.radicore.org

    By Tony Marston on Sep 5, 2013

  23. I was mostly using Doctrine 2 with my PHP stuff. I think what’s most important is that the architecture you choose is something that suits the application you’re developing.

    For example, if you know for sure you’re just building something simple, you might not need to use a DAO layer at all, as there would be little benefit from it I think. But on the other hand, if you’re building a large program which may be maintained for years to come, it is certainly possible that having the layer could be handy.

    By Jani Hartikainen on Sep 5, 2013

  24. The only saving you get from the effort of writing reusable code comes from the number of times that you reuse it. My largest PHP application has over 250 database tables, over 450 relationships, and over 2,000 user transactions. I wrote this using my framework which also contains 40+ reusable Controllers, a single View object which uses XSL transformations from a reusable library of 12+ XSL stylesheets, and one reusable DAO for each DBMS engine.

    It took me a while to write all that reusable code – which has been available in an open source framework called RADICORE at http://www.radicore.org since 2006 – but it now means that I can create new user transactions in any database application at a very rapid rate.

    By Tony Marston on Sep 5, 2013

  25. Great post! Only it seems that the downloadable examples are gone, I would love to have them as a reference.

    Keep up the good work!

    By Stefan on Jun 3, 2014

  1. 10 Trackback(s)

  2. Jan 11, 2009: » Models im Zend Framework (I) - comfuzius - Fachchinesisch für Entwickler
  3. Jan 15, 2009: Maks All Read - PHP « Umumi Hela ve Özgürlük
  4. Jan 23, 2009: Three PHP mistakes that will cause you debugging nightmares | CodeUtopia
  5. Mar 4, 2009: Creating a simple abstract model to reduce boilerplate code | CodeUtopia
  6. May 7, 2009: Validating Zend_Forms using model objects | CodeUtopia
  7. Jun 26, 2009: Decoupling models from the database: Data Access Object pattern in PHP « Web Under Construction
  8. Jun 26, 2009: ZF Tutorial 3 – The Model | codemusings.com
  9. Jul 4, 2009: Unit testing 4: Mock objects and testing code which uses the database | CodeUtopia
  10. Apr 18, 2014: Models im Zend Framework › Qblog
  11. Jul 30, 2014: Must Read Links : Architecture and Design Patterns | PHP, MySQL, Javascript, AJAX, HTML and CSS

Post a Comment

You can use some HTML (a, em, strong, etc.). If you want to post code, use <pre lang="PHP">code here</pre> (you can replace PHP with the language you are posting)