Internationalisation Take 2 - Zend vs Cheap-o Arrays

Saturday, June 23rd, 2007

In response to my emails to the Zend Framework I18N list and my previous post, Thomas, the author of the Zend_Translate framework items mailed back to the list here:

> 2) gettext is a more expensive version of using the arrays backend.

No… it’s a less expensive version. What takes time is reading the original
source. Your processor is always faster than your harddrive.
It is better to do some computations than reading a bigger file. And mo
files are much smaller than the same sized array files.

This still seems wrong to me, so I’ve done a bit more analysis. I have now got XDebug up and running in my portable environment, so I can really see the details of the costs. Now, to caveat all this, I’m running all this from a USB key on a laptop that’s doing a number of other background tasks, so, the performance is not isolated. Due to this I’ll be looking at percentages of time in Wincachegrind, not actual execution times.

Now, to test this I have generated two files. One of which is a .po containing 1000 phrases which I have compiled to a .mo file. The other is a PHP array in a PHP file containing the same 1000 translations. I generated this with a script, the translations are a bit simple:

From the .po file:

msgid “String 0″
msgstr “String Translated 0″

From the .php file:

‘String 0′=> ‘String Translated 0′,

I have then written a simple PHP file which translates 50 of these items. A reasonable enough test I think. Firstly, to test the translation using the fast Zend_Translate gettext options:

require_once 'Zend/Translate.php';
i = new Zend_Translate('gettext', '/development/language/test.en.mo', 'en');
 
function _($s)
{
    global $i;
    $s = $i->_($s);
    echo($s."<br/>\n");
}
 
_('String 1');
_('String 2');
...

I then ran this file and loaded the cachegrind output into WinCacheGrind. 87.88% of the execution time was spent in Zend_Translate_Adapter_Gettext->_loadTranslationData. Performing translations took 1.99% of the time.

Next I used my PHP array and the Zend_Translate array backend:

require_once 'Zend/Translate.php';
require_once '/development/language/test.en.php';
$i = new Zend_Translate('array', $LANGUAGE, en);

(The rest of the file remains the same). I then ran this and checked the output. loadTranslationData took 78.36% of the time. Performing translations took 2.86% of the time.

My third test was just to use the test.en.php file and a simple translation function:

require_once '/development/language/test.en.php';
function _($s) {
  global $LANGUAGE;
   $s = (array_key_exists($s, $LANGUAGE)) ? $LANGUAGE[$s] : $s;
  return $s;
}

The first thing to note was that the Zend_Translate items took over 20ms each. This one not using Zend at all took 2.8ms. The require_once statement took 1.83% of that time. Then it was just repetition of an un-recordably-fast translation 50 times.

So what do I draw from this? I draw from this that for simple translations, you can’t beat a very, very simple system with just an array of translations. I haven’t looked in any depth at the other services offered by Zend_Translate, but it does allow you to add multiple translations and translate in multiple languages. But, do you have a use-case for that?

If your UI needs to display in a single language, but translate that language, take the simple approach. It needs a little extension to support modular languages, but look at the PHPBB3 implementation and you can’t go far wrong. That loads modular translation files (just to keep that trivial require_once cost down) each of which array_merge’s back into a single translation array which is key’d by constants.

Fast.

My cachegrind files for your reading pleasure:
Zend_Translate - Array
Zend_Translate - Gettext
Non-Zend_Translate

Popularity: 78% [?]

Profiling PHP With XDebug - Portably!

Saturday, June 23rd, 2007

When you are working on a web site or web application, something that little thought is spared for at development time all too often is it’s performance. You need to know where your bottlenecks are and what the costs of each architectural and implementation choice you make are. You should routinely profile your application’s core functions to see how they behave. How do we do this? You could put little timers into the code and log timings to see how things work, or you could use XDebug.

XDebug is a PHP extension that provides a number of critical features to developers. It supports step through debugging (assuming your code editor can hook into it) and it supports profiling of your scripts. This gives you a detailed breakdown of every command executed in your application, how many times it was executed and what it cost.

This is invaluable when tracing your performance work. You can identify exactly which routines are costing you too much. It can give valuable insight into the performance landscape of your software. So I’m going to hook it up into my development environment on it’s USB key.

Firstly, pop over to the XDebug site and download the relevant version for your PHP version. I downloaded the Windows Binary of 2.0RC4 for PHP 5.2.1+. XDebug is a “Zend Extension”, it’s not a standard PHP Extension, it extends the Zend Engine that powers PHP. This changes how we configure it in php.ini and means it doesn’t have to go on the extensions folder of your PHP install. But, I keep it there for consistency. Once placed in that file you need to edit your php.ini, the commands can go anywhere in the file, but, I placed them after the PHP Extension loading commands, again for consistency:

zend_extension_ts=/development/php/ext/php_xdebug-2.0.0rc4-5.2.1.dll

Note that this is loaded with the zend_extension_ts command instead of the extension command (the ts denotes Thread Safe mode) . Also note that we specify the full path to the extension. The zend_extension_ts (and other zend_extension commands) need the full path as they don’t pay attention to our extension directory command.

Once this is done, go to your PHPInfo() test page and check, you should have XDebug information included:

XDebug Enabled

Ok so far so good. If you have a page which currently throws a php error, go check it now. You’ll notice that just having XDebug installed gives you much more information. XDebug makes developing easier just by being there.

Now, we’re mainly going to use it to profile performance of our applications and third party libraries, so we need to enable profiling, this is done with a couple of new entries in php.ini. I placed them just after my command to load the extension so it’s all in one place:

xdebug.profiler_enable = 1
xdebug.profiler_output_dir=/development/

Now restart Apache and hit your PHPInfo() page. In the development folder on your USB Key you will have some files called cachegrind.out.[some number]. This is the profiling information in it’s raw form and of no use to you on it’s own.

You need a cachegrind analysis program. I use Wincachegrind as I’m on windows. You can use this to open up the cachegrind file and see what took what time. Visit yourPHPInfo() page, pick up the cachegrind output and take a look. You can see a lot of detail.

CacheGrind

I don’t propose to detail how to use Wincachegrind and do a full analysis, a bit of poking should show you what’s going on. But, I’ll be using XDebug and WinCacheGrind to get under the covers of some third party libraries I’m considering for use in the development of Multiblog, so we’ll see more information then.

Popularity: 61% [?]