FacebookLinkedInShare

Motivation:

Performance improvements are always welcomed, but more often than not, they are very difficult to achieve. Discovering the root of performance issues, simply by reviewing the code, can be a challenging task for every developer.

Profilers can help you pinpoint a lot of the problems, by providing information on time and memory consumption.

 

Profiler Types:

Flat Profiler –  computes the amount of time spent in each function.

Call Graph Profiler – computes a (you guessed it…) call graph, on top of the Flat Profiler features. The call graph information includes the chain of calls leading to the callee function.

 

Introduction:

In Natural Intelligence, we faced performance issues, and wanted to understand, map, and prioritise them.

From what we discovered, the easiest type of problems to fix, were cases of DB queries that could have been cached but weren’t. Queries on data that isn’t supposed to change during the request, like getting a page name by the page id.

 

Xdebug’s profiler is a Call Graph Profiler, and we will use these features to improve our demo app’s performance.

 

Demo App:

Our demo app will have 2 entities – kids and colors.

Our goal is to echo the kids names alongside their favourite colors.

To put some pressure on the application, I’ll add a loop to do this task 1,000 times, so performance issues will be easier to spot.

This will help us see the benefits of profiling php applications.

 

DB tables:

Kids – id, name, favourite_color

Colors – id, name

 

For demo purposes, I filled the table ‘kids’ with 20 rows, and the table ‘colors’ with 10 rows.

Keep in mind that we’re not going to talk about best practices, so the solution will not be something to aspire to in your apps (to say the least), but rather a very basic example of how you can find performance issues using a code profiler.

 

KidsTable

ColorsTable

 

Installation & Configuration:

We will use kcachegrind as our call graph viewer.

 

$ sudo apt-get install kcachegrind

 

Add xdebug configuration to your php.ini file:

[xdebug]

xdebug.remote_enable = 1

xdebug.remote_port=9000

xdebug.profiler_enable=1

xdebug.profiler_output_dir = “/tmp”

 

Demo App code:

 

 

<?php

require 'config.php';
// $mysqli is defined in config.php

function getAllKidIds(){
    global $mysqli;
    $result = $mysqli->query('SELECT id FROM kids');
    while ($kid = $result->fetch_row()){
        $arr[]=$kid[0];
    }
    return $arr;
}
 

function getKidNameById($id){
    global $mysqli;
    $result = $mysqli->query("SELECT name FROM kids where id = {$id}");
    if ($kid = $result->fetch_row()){
        return $kid[0];
    }
    return null;
}

 

function getFavouriteColorByKidId($id){
    global $mysqli;
    $result = $mysqli->query("SELECT favourite_color FROM kids where id = {$id}");
    if ($kid = $result->fetch_row()){
        return $kid[0];
    }
    return null;
}

 

function getColorNameById($id){
    global $mysqli;
    $result = $mysqli->query("SELECT name FROM colors where id = {$id}");
    if ($color = $result->fetch_row()){
        return $color[0];
    }
    return null;
}

 

echo nl2br('Kids favourite colors:' . PHP_EOL . PHP_EOL);

$kidIds = getAllKidIds();

for ($i = 0; $i<1000; $i++) {
    foreach ($kidIds as $kidId) {
        $kid_name = getKidNameById($kidId);
        $color_id = getFavouriteColorByKidId($kidId);
        $color_name = getColorNameById($color_id);
        echo nl2br($kid_name . ' - ' . $color_name . PHP_EOL);
    }
}

 

I run the script using a browser, hence the nl2br function usage, but of course you can run it as a script and remove these function calls.

 

Restart your php to start the profiler, and run the script.

Now you’re Profiling!

Inside your ‘/tmp’ dir you will find files in the format cachegrind.out.<XXXX>.

<XXXX> will be generated automatically by Xdebug’s profiler.

 

Open the profiler output file using kcachegrind call graph viewer:

$ kcachegrind /tmp/cachegrind.out.<XXXX>

It should look something like this:

 

NoCaching

Incl. column – shows the inclusive time spent in this function.

Self column – shows the time spent in this function, without considering time spent waiting for other functions called by the current one to return.

 

You can toggle between the Relative/Absolute view to see different time representation.

You’ll notice the time sums up to over 100% percent.

It is due to 2 reasons:

  1. Like every great athlete, php gives 110% ;)
  2. The profiler rounds up the time measurements.

 

We can see that the php::mysqli->query function is being called 60,001 times!

The reason being that we call it for every time we want to know a kid’s name, kid’s favourite color, or a color’s name, by the relevant ids…

Remember, our database table only contains 20 (kids) + 10 (colors) = 30 rows!

If we were to cache the results in these very often requests to get information that we don’t expect to change from the DB, we’ll rid ourselves of all these redundant, expensive DB queries.

 

So after a small and easy refactor this is how our demo app code would look like:

 
<?php

require 'config.php';
// $mysqli is defined in config.php

function getAllKidIds(){
    global $mysqli;
    $result = $mysqli>query('SELECT id FROM kids');
    while ($kid = $result>fetch_row()){
        $arr[]=$kid[0];
    }
    return $arr;
}

 

function getKidNameById($id){
    global $mysqli;
    static $kid_names;
    if (isset($kid_names[$id])){
        return $kid_names[$id];
    }
    $result = $mysqli>query("SELECT name FROM kids where id = {$id}");
    if ($kid = $result>fetch_row()){
        $kid_names[$id] = $kid[0];
    return $kid_names[$id];
    }
    return null;
}

 

function getFavouriteColorByKidId($id){
    global $mysqli;
    static $kid_colors;
    if (isset($kid_colors[$id])){
        return $kid_colors[$id];
    }
    $result = $mysqli>query("SELECT favourite_color FROM kids where id = {$id}");
    if ($kid = $result>fetch_row()){
    $kid_colors[$id] = $kid[0];
        return $kid_colors[$id];
    }
    return null;
}

 

function getColorNameById($id){
    global $mysqli;
    static $color_names;
    if (isset($color_names[$id])){
        return $color_names[$id];
    }
    $result = $mysqli>query("SELECT name FROM colors where id = {$id}");
    if ($color = $result>fetch_row()){
        $color_names[$id] = $color[0];
        return $color_names[$id];
    }
    return null;
}

 

echo nl2br('Kids favourite colors:' . PHP_EOL . PHP_EOL);

$kidIds = getAllKidIds();

for ($i = 0; $i<1000; $i++) {
    foreach ($kidIds as $kidId) {
        $kid_name = getKidNameById($kidId);
        $color_id = getFavouriteColorByKidId($kidId);
        $color_name = getColorNameById($color_id);
        echo nl2br($kid_name . ' - ' . $color_name . PHP_EOL);
    }
}

 

 

And now let’s review our profiler’s new output:

 

CachingQueries

 

Notice how the php:mysqli->query function call count was drastically reduced to 51, simply by caching the results from the DB queries in our demo app.

 

The total runtime of our application was reduced to ⅙ the time it originally took.

 

Happy profiling!

 

Further reading:

Profiling (computer programming)

http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29

 

Flat Profiler:

https://sourceware.org/binutils/docs/gprof/Flat-Profile.html

 

KCachegrind (call graph viewer):

http://kcachegrind.sourceforge.net/html/Home.html

 

Xdebug documentation:

http://xdebug.org/docs/all

  • Rotem

    Great post!!