Saturday, November 7, 2015

Doctrine tutorials & Doctrine performance: Eager loading vs custom DQL

Some good articles to understand Doctrine, starting with relations between objects to performance (lazy loading vs eager loading), good practices.

http://www.krueckeberg.org/notes/relationships.html   - Doctrine's objects relationships explained
http://www.krueckeberg.org/notes/d2.html     - other important aspects
http://www.uvd.co.uk/blog/some-doctrine-2-best-practices/
https://tideways.io/profiler/blog/5-doctrine-orm-performance-traps-you-should-avoid


OK, now let's see how my classifieds bundle  can benefit from all this knowledge :)

Requirement:  display all ads.

Ads table contains fields like title, description price, a FK pointing to the currency table, FK for category, FK for SpecificItems.
SpecificItems table- contains the Value(a value can pe 1992 for Type "Year of manufacture") and a FK pointing to SpecificItemsTypes
SpecificItemsTypes table- contains the SpecificItemType name

 Example of specific items types: cars ads should have as specific item type "year of manufacture", "color", "cylindrical capacity" etcetera.

In controller I retrieve all data from Ads and SpecificItems using built in findAll() method from Doctrine.

public function showAllAction()
    {
        $em = $this->getDoctrine()->getEntityManager();
        $allAds = $em->getRepository('CPANAClassifiedsBundle:Ads')->findAll();
        $allSpecificItems = $em->getRepository('CPANAClassifiedsBundle:SpecificItems')->findAll();

        if (!$allAds) {
            throw $this->createNotFoundException('Unable to find categories.');
        }
       
        return $this->render('CPANAClassifiedsBundle:Pub\Ads:show_all.html.twig', array( 'all_ads' => $allAds, 'all_items' => $allSpecificItems ) );
    }


In the view  I am joining the information from  2 tables (Ads and SpecificItems) by the common Ads.id value. For the  moment the scope here is just to see some data on the screen using Doctrine generated entities.

view:

    {% for ad in all_ads %}
        <tr><hr></tr>
        <tr><h1> {{ ad.getTitle() }}</h1></tr>
        <tr><p><b>Posted by: </b> {{ ad.getIdUser().getName() }}</p></tr>
        <tr><p><b>Category: </b> {{ ad.getIdCategory().getCategoryName() }}</p></tr>
        <tr><p><b>Area: </b> {{ ad.getIdCityArea().getCityAreaName() }}</p></tr>
        <tr><p><b>City: </b> {{ ad.getIdCityArea().getIdCity.getCityName() }}</p></tr>
        <tr><h2>Description</h2></tr>
        <tr><p>{{ ad.getDescription() }} </p></tr>
        {% for item in all_items %}
            {% if item.getIdSpecificItem() == ad.getIdAds() %}
                <tr><p><b>{{ item.getIdSpecificItemType().getItemTypeName() }} </b> -- {{ item.getItemValue() }} </p></tr>
               
            {% endif %}
        {% endfor %}
       
        <tr><p><b>Price: </b> {{ ad.getPrice() }} -- <b> {{ ad.getIdCurrency().getCurrencyName() }} </b> </p></tr>
    {% endfor %}

-------------------------------------------------



It works! Just that the smart profiler from Symfony is showing to me that I did 13 queries in order to display my 2 ads from Database!!!!Weird, I thought I did only 2 queries????

Click on the icon and Symfony is giving more details:


Let's see what happened:

I queried first the "ads" table and secondly the "specific_items" as I was expecting from doing it in the controller.
Next I queried the User table. Well not me, Doctrine was smart enough to do that by lazy loading the User object from the Database when I asked  in the view for "ad.getIdUser().getName()". Doctrine is not loading by default the information from dependent tables because sometimes it will result in some massive data load, slowing down the application. On the other hand doing 13 separate queries is also bad for performance.
 

If I delete this line "ad.getIdUser().getName()" from view we will see that Doctrine will not query any more the table Users for getting the associated values.
From documentation on how to query objects:

7.8.4. By Eager Loading

Whenever you query for an entity that has persistent associations and these associations are mapped as EAGER, they will automatically be loaded together with the entity being queried and is thus immediately available to your application.

7.8.5. By Lazy Loading

Whenever you have a managed entity instance at hand, you can traverse and use any associations of that entity that are configured LAZY as if they were in-memory already. Doctrine will automatically load the associated objects on demand through the concept of lazy-loading.

Possible solution 1:  change fetch mode to EAGER in many-to-one unidirectional relationships

From documentation:
http://doctrine-orm.readthedocs.org/en/latest/reference/annotations-reference.html#manytoone

    Defines that the annotated instance variable holds a reference that describes a many-to-one
    relationship between two entities.

    Required attributes:

        targetEntity: FQCN of the referenced target entity. Can be the unqualified class name if both classes are in the same namespace. IMPORTANT: No leading backslash!

    Optional attributes:

        cascade: Cascade Option
        fetch: One of LAZY or EAGER
        inversedBy - The inversedBy attribute designates the field in the entity that is the inverse side of the relationship.

    Example:

    <?php
    /**
     * @ManyToOne(targetEntity="Cart", cascade={"all"}, fetch="EAGER")
     */
    private $cart;



In my case I want to fetch eager the User object associated with the ad.

    /**
     * @var \AppBundle\Entity\User
     *
     * @ORM\ManyToOne(targetEntity="AppBundle\Entity\User", fetch="EAGER")
     * @ORM\JoinColumns({
     *   @ORM\JoinColumn(name="id_user", referencedColumnName="id")
     * })
     */
    private $idUser;

   
Load the page and magic => 11 queries executed (instead of 13).

 

 Looking into profiler details we can see doctrine generated a left join SQL query between ads and fos_user table.




adding  - fetch="EAGER" - to several more fields found in a many-to-one relation:
 -  in "Ads" entity f: idCurrency, idCityArea, idCategory
 - in "CityAreas" entity: idCity
 -  in "SpecificItems" entity:  idSpecificItemType

                       
Load the page again and great news! Only 3 queries


OK, now let's see how it scales, I copied the ads in phpMyAdmin and now I have 6 of them, when loading the page the number of queries stays constant: 3! Most of the action happens in the below query, there are another 2:  one join between city_areas and cities, and one join between specific items and specific items types.


Possible solution 2:  write custom DQL queries

First delete the fetch="EAGER" from the entities modified previously. Loading the page and we have again 13 queries.

I created  a "Repository" directory under "Entity" here I will add 2 files "AdsRepository.php"  and "SpecificItemsRepository.php"
Do not forget to mention in each entity class  to use the repository:

"Ads.php"
/**
 * Ads
 *
 * @ORM\Table(name="ads", indexes={@ORM\Index(name="fk_id_user", columns={"id_user"}), @ORM\Index(name="fk_id_city_area", columns={"id_city_area"}), @ORM\Index(name="fk_id_currency", columns={"id_currency"})})
 * @ORM\Entity(repositoryClass="CPANA\ClassifiedsBundle\Entity\Repository\AdsRepository")
 */

OK, now let's write the query for retrieving information from Ads object and it's associated objects:

CPANA/ClassifiedsBundle/Entity/Repository/AdsRepository.php
    public function findAllAdsCustom()
    {
        return $this->getEntityManager()
            ->createQuery(
                'SELECT  a.idAds, a.title, a.description, a.price
                ,u.name
                ,cat.categoryName
                ,c.currencyName
                ,ca.cityAreaName
                ,city.cityName
                FROM CPANAClassifiedsBundle:Ads a
                JOIN a.idUser u
                JOIN a.idCategory cat
                JOIN a.idCurrency c
                JOIN a.idCityArea ca
                JOIN ca.idCity city'
               
            )
            ->getResult();
    }


Second query for retrieving information from SpecificItems object:

CPANA/ClassifiedsBundle/Entity/Repository/SpecificItemsRepository.php
    public function findAllItemsCustom()
    {
        return $this->getEntityManager()
            ->createQuery(
                'SELECT  si.itemValue
                        ,IDENTITY(si.idAds)
                        ,ty.itemTypeName
                FROM CPANAClassifiedsBundle:SpecificItems si
                JOIN si.idSpecificItemType ty'
               
            )
            ->getResult();
    }

I am using IDENTITY function for not loading the associated object, just the foreign key value. Initially I used just  si.idAds which resulted in an error "...Error: Invalid PathExpression. Must be a StateFieldPathExpression..."

When I try to load the page, other error:

"Key "idAds" for array with keys "itemValue, 1, itemTypeName" does not exist in CPANAClassifiedsBundle:Pub\Ads:show_all.html.twig at line 16 "

So there is no key "idAds" using IDENTITY it returns the value at key "1".


The view looks like this now:

<table style="width:100%">
    {% for ad in all_ads %}
        <tr><hr></tr>
        <tr><h1> {{ ad['title'] }}</h1></tr>
        <tr><p><b>Posted by: </b> {{ ad['name'] }}</p></tr>
        <tr><p><b>Category: </b> {{ ad['categoryName'] }}</p></tr>
        <tr><p><b>Area: </b> {{ ad['cityAreaName'] }}</p></tr>
        <tr><p><b>City: </b> {{ ad['cityName'] }}</p></tr>
        <tr><h2>Description</h2></tr>
        <tr><p>{{ ad['description'] }} </p></tr>
        {% for item in all_items %}
            {% if item[1] == ad['idAds'] %}
                <tr><p><b>{{ item['itemTypeName'] }} </b> -- {{ item['itemValue'] }} </p></tr>
               
            {% endif %}
        {% endfor %}
       
        <tr><p><b>Price: </b> {{ ad['price'] }} -- <b> {{ ad['currencyName'] }} </b> </p></tr>
    {% endfor %}
</table>
And in the controller I am calling the 2 methods from repositories:

public function showAllAction()
{
        $em = $this->getDoctrine()->getEntityManager();
        $allAds = $em->getRepository('CPANAClassifiedsBundle:Ads')->findAllAdsCustom();
       
        $allSpecificItems = $em->getRepository('CPANAClassifiedsBundle:SpecificItems')->findAllItemsCustom();
                     
        if (!$allAds) {
            throw $this->createNotFoundException('Unable to find categories.');
        }
       
        return $this->render('CPANAClassifiedsBundle:Pub\Ads:show_all.html.twig', array( 'all_ads' => $allAds, 'all_items' => $allSpecificItems ) );
      
    }
When I load the page again I can see that only 2 SQL queries were executed:


3. Conclusion:  


Writing custom  DQL seems to be the  the best way to use Doctrine, it takes a little bit more time to implement  but you are controlling better how Doctrine behaves.

No comments:

Post a Comment