Archive for the 'Development' Category

Summer of Code 2008

Tuesday, March 18th, 2008

The Geeklog Project (for which I am a Core Team Member) has been confirmed as a participating organisation in the Google Summer of Code 2008. We have our ideas list up and are recruiting for students.

So, if you are a student at university and interested in a Free Google T-Shirt and Some Money for contributing to a PHP Open Source Project, then check it out. I’ve identified three projects that I will be mentoring on if suitable students are found, and there are a number of other projects available if you hate me ;-)


Faster, Easier Wordpress Upgrade

Tuesday, February 5th, 2008

In the last 38 days there have been two urgent security fix releases of Wordpress. In the last 3 and a tiny bit months there have been three maintenance releases of Wordpress. In the last 4 and a tiny bit months there have been four releases of Wordpress.

Other than the fact I’m starting to get pretty worried about the security and stability of the software in general, it’s a pain in the rear to have to keep upgrading. So I’m making it easier for me. Wordpress themselves have helped by having a decent system in place for making it easy to get the latest.

I now have the simplest of shell scripts which:

  1. Backs up my database.
  2. Backs up my Wordpress folder.
  3. Gets the latest Wordpress release.
  4. Unpacks that release.
  5. Deploys that release live.

Being nice, I’m going to share it with you:

mysqldump --host=localhost --user=wordpress --password=wordpress wordpress > wordpress.sql
tar -zcf wordpress_backup.tgz wordpress_live
wget http://wordpress.org/latest.tar.gz
tar -xzf latest.tar.gz
cp -r wordpress/* wordpress_live/
rm -r wordpress

Of course this assumes that you have a wordpress database in a localhost MySQL instance with username and password wordpress and that your live wordpress folder is wordpress_live so you can cope with a temporary wordpress folder from the unpack. It also assumes that mysqldump, tar and wget are available in your shell.

Also, I don’t just do this on live. I back up my live, put it on my portable instance and test the new version first. Then I do it on live. Then I update the versions of my plugins.

What an arse. This is why I prefer Geeklog. It’s more secure and doesn’t change at an alarming rate.

Now I can SSH into my server and type ./upgradewordpress.sh when I’m ready then hit http://inanger.com/[secretlocationofadmin]/wp-upgrade.php and finish things off. Job done. I still have a pain in the rear as I have to test the release locally first (./upgradewordpress.sh on local instance of course, after restoring a fresh backup of live into it and adjusting the config to refer to my local instance).

And I think this is less risky than tracking Wordpress via SVN on live.


Database Change Control

Monday, February 4th, 2008

K. Scott Allen recently published a five part series on the importance of version control in creating and maintaining the database behind your product. This starts with something pretty important and fundamental, three rules for database work. Rules #2 and #3 are vital. No argument there. Rule #1 I’ll come to later. Jeff Atwood has also written on the subject before, and highlighted his previous post and a few further comments in support of what K. Scott Allen had written in an apparently unscheduled post on his blog. Looking at the trackbacks and comments on the posts, this appears to have generated a lot of interest, and I feel compelled to critique the posts.

Firstly, you must get your database under version control if you ever plan on releasing more than one version. You must have an authoritative source of schema and procedures etc, you must do versioned releases. This is not in contention. But other points in K’s series and Jeff’s unquestioning support for this are.

Secondly, some background: I’m a Development Manager for a Microsoft Technology Stack web based application which is maintained and released as a product. We have a lot of tables, a lot of stored procedures and other database entities. And a lot of developers.

Taking the approaches outlined by K in his posts on Change Scripts and Views, Stored Procedures and the Like at face value, they are hopeless. They’ll get you into trouble. Fast.

K’s process is one script for the database schema and one script per stored procedure in the master definition. He then references Phil Haack’s Bullet Proof Sql Change Scripts post for an idea of how to provide better change programs than his simplistic ones.

Phil’s stuff is good in principle, making sure that your SQL Change Scripts can execute many times, but they are laborious. We’ve rolled all the checks you could possibly want into a set of functions: dbo.ColumnExists(TableName, ColumnName), dbo.IndexExists(IndexName, TableName) etc.

But with a system like ours, with thousands of stored procedures (literally), doing a re-deploy of all SPs is painful. So we script only the updates and roll them out. But that will be a lot of updates in a release designed to take any version from our oldest supported version up to our newest major release (iApplication XP Panoramic Web 2.0 Version). Executing each of these individually and tracking the results is a hassle.

The next step is to roll these scripts together into one to make a quick deployment, just get the installer/dba to execute “Update_1.1.0.1.sql” on the target database and there you go. *

Only that’s when it gets really problematic. SQL Query Optimiser pulls the rug out from under your feet. What it will do is change the order of execution to optimise it. It’ll create new tables, add rows and script stored procedures at times to suit its own optimisation desires. And then everything will fail. Columns won’t exist, so stored procedures won’t compile. Views will fail. Commands to add indexes will explode. There will be bits of schema all over the place.

So then you have to wrap every alter table command, every DROP usp_ and CREATE usp_ in sp_executeSQL. Which means you have to escape every ‘ character correctly. So then you need to write a tool to generate the change scripts cleanly.

Then you find that if you are executing hundreds of changes in batch in a single change program, it’s hard to stop a change logging itself when only part of it has fallen over. So then you have to wrap each item in a check for @@ERROR to see if anything went wrong, and wrap the whole lot in a transaction and roll that back if any errors happened at all.

Right, so now we have something approaching bullet proof. Now we need to make sure our team of developers do it right every time. We write a process document and run a training session. We explain how each development starts with getting the latest SP/Table definition from Source Control, making your changes and generating your change program. We explain that your change program must be tested and shown to execute cleanly leaving the full trace etc, and must contain the right assigned version number.

Whoever writes this change script will test it thoroughly and against a variety of test data, then commit the change script into source control. The schema change is officially published. The schema change will start to appear in developer workspaces as they update from source control, and on test machines as new builds are pushed into QA and beyond.

That just doesn’t work. You will despair as even your best guru programmers, the ones who really care about their craft, make mistakes and take short-cuts because the process is onerous and is seen as a tax. So then you need to have a nightly build process that restores a clean database, executes every change against it, checks the results and the database. That runs a parse on the change program beyond the SQL Execution to make sure the rules are adhered to and Change_1_0_1_0_2.sql actually records itself as 1.0.1.0.2 and not 2.0.1.3.4 which is its number on a different branch etc etc. K’s text is incredibly hand-wavy.

1. Never use a shared database server for development work.

The convenience of a shared database is tempting. All developers point their workstations to a single database server where they can test and make schema changes. The shared server functions as an authoritative source for the database schema, and schema changes appear immediately to all team members. The shared database also serves as a central repository for test data.

Like many conveniences in software development, a shared database is a tar pit waiting to fossilize a project. Developers overwrite each other’s changes. The changes I make on the server break the code on your development machine. Remote development is slow and difficult.

Avoid using a shared database at all costs, as they ultimately waste time and help produce bugs.

So having a single instance per developer on their local machine is a panacea for all your “shared database” problems. All those bugs created by people over-writing each other’s schema changes etc.

What about when a developer neglects to update their local instance and produces a fix based on an out-dated schema or stored procedure? Same thing. How do you ensure all your developers are keeping their local instance sufficiently up to date? Auto-update their instances with each commit as it’s stabilised? What if that wipes out their changes?

I’m not saying a shared database solves these issues, or that you won’t run into the issues he mentions on your shared database. But it’s only one point to control. We periodically re-stabilise our test and development databases where developers are patching their work for peer testing. We’re working on improving this process all the time. Some development requires an isolated environment to avoid breaking everyone’s ability to work, and they do have isolated local environments, but only for the length of that development.

Database version management is an incredibly hard problem to resolve. And although getting people started with it as K and Jeff have done is good, it’s not enough. You have to go further. And if you’ve gone further than us, please tell me where we go next!

* - (Using isql with the right voodoo to suppress all the line numbers and pointless messages just leaving us the completion state, piping the output to a text file which we can parse to ensure that everything completed AOK and the database is not left in an inconsistent state, prior to us then needing to validate that the database IS actually in a decent state and the scripts haven’t falsely reported success…)


It’s About Breadth

Thursday, January 31st, 2008

I was reading a blog entry about hot technology in Java over at Manageability.org. The second paragraph in the entry slapped me out of my non-blogging frenzy with its wrongness.

It’s been suggested that Polyglot programming be in the list. Even though I do subscribe to the notion that learning other languages are beneficial to one’s craft, it simply is not pragmatic advice. It is not practical to recommend that someone study Ruby, Groovy, Scala and who knows what other language is vying for your attention. Stick to a couple of languages and do it well. Some languages are better than others for certain tasks. However the biggest fallacy of all is that, a dynamic language is not considerably better than a static one. It’s no magic bullet.

The main thing to note is that the paragraph is not completely wrong. The author does accept “the notion that learning other languages are beneficial to one’s craft”, but unfortunately caveats that by saying it’s just not pragmatic or practical and that the reader should stick to a “couple” of languages.

I strongly disagree.

As a professional programmer, in your day job you should code in one technology set. Note I say technology set, not language. For some that may be one language. For others that may be several. For me, the last time I was a hands-on-programmer as my day job, that was Javascript, CSS, XHTML, ASP(VBScript), Visual Basic 6 and T-SQL.

You should strive for a deep and complete depth of understanding of that technology set. This should clearly start with a basic competency of the limited set of those technologies that relate to the product/project you are working on. But you should deepen and broaden this understanding as fast and well as humanly possible.

You should know how to do anything that VB6 can do, not just within the context of your web development. You will need to learn the aspects of VB6 programming that can never be used in a web context, but along the way, you will learn many things you would not have otherwise learnt. These things may be things you can directly use in a web context, or things that just improve your approach to problem solving, design and development issues, a fresh perspective on the language.

The next step from here is to take that solid grounding in your primary weapon and mature and expand it with exposure to other languages and technologies.

The development communities around each language are akin to separate nations. Sometimes diplomatic channels are open and citizens freely move between the nations. Other times there is open hostility. Each nation has its own way of life. There is always some common ground between all languages, but often they have vastly differing ways of approaching a common problem. Continual exposure to these different languages opens you up to more ways to solve the problems you are faced with. You will be able to deal with a vastly wider range of problems as a result.

And this is a critical skill to develop.

Do not let yourself become an island nation. Have that deep mastery of a key language/technology set and use it daily, but make sure you are also constantly looking around for other languages and ideas to broaden your understanding of your craft. Travel widely. Use the languages for “real” in anger development to really understand the different pain points. Ruby may solve one pain for you at the cost of other deeper pains.

Without you, and people like you, doing this language tourism, building this breadth, there won’t be a new top 5 interesting technologies in [whatever language] in 2009, because the [whatever language] community will stagnate as it examines its own navel.


Caching Using Zend

Monday, October 22nd, 2007

The Zend Framework provides an interesting set of PHP5 libraries for caching. There’s a nice architecture to it, providing a number of different backends and frontends for caching. However, I recently found it very frustrating to try and figure out a decent caching strategy for the new version of a site I was working on.

And the documentation did not help at all. So, allow me to elaborate for the benefit of the huddled masses.

The introduction in the caching section of the manual gives a decent enough overview of the basics. If I want to cache a page with a nice simple ID, such as “Page1”, with a set lifetime, I can do so in a few lines of code.

However, another page goes on to mention how you can also “tag” records with multiple tags. Another page talks about how to clear the cache by a single tag, or combination of tags.

But nothing explains the relationships between tags and ids, and how the clear works with tags or ids.

Now, I’m working on a system which has two views of a music catalogue for a radio station. There is the requests system and there is the discography system. They both present differing views of the same data. The request system filters the list of artists, albums and tracks on the station to those that can be requested and displays a “request optimised” set of screens with some of the data. The discography section shows everything about an artist, all their albums, all tracks, reviews and so forth, without the cruft needed for the request sub-system.

We cache these screens for obvious reasons. What we need is the ability to clear the cache of an artist, album or track possibly within the requests or the discography, or both. So, I figured on a system of unique keys like artist_123 and album_123 etc then to use the tags to “lump” things together. So album_123 would have the artist_123 tag in both discography and request view plus the discography tag in the discography and the requests tag in the requests view. Something like:

$cache->load("album_123", array("album_123", "artist_123", "requests", "en_GB", "album"));

I could then simply invalidate the cache of all albums, request pages, artist_123 pages or British English generated pages, or any combination of those items.

This does not work.

The first important thing is that the ID is the unique key. Not the combination of the ID and the tag(s). So if you save page1 to the cache with the tags tag1 and tag2, then try and load page1 from the cache with tags tag3 and tag4, you’ll get the result of saving with tag1 and tag2!

Insane, but true. Try it. The tag has no effect whatsoever it seems on the load code. If it can find an item by ID, then it loads it, irrespective of tags. I’m not sure if this is behaviour by design, or a bug on my system using the file backend, but it is consistent. I just think it’s mad.

To get what I desire, I’m going to have to cache with:

$cache->load("requests_artist_123_album_123_en_GB", array("requests", "artist_123", "album_123", "en_GB", "albums"));

Then I can still do a clear on the appropriate tags, if for example I want to remove all albums from the requests cache:

$cache->clean(Zend_Cache::CLEANING_MODE_MATCHING_TAG, array("requests", "albums"));

It’s a bit of a pain in the rear, but, once you’ve figured out the $tags argument on the load method is pointless, it’s fine.
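The behaviour described above can be modelled without the framework. What follows is a toy in-memory cache, not Zend_Cache itself: the ToyTagCache class and its method names are hypothetical, written only to illustrate that entries are keyed by ID alone, that tags only matter when cleaning, and that cleaning by matching tags removes entries carrying all of the tags given.

```php
<?php
// Toy stand-in for a tagged cache (hypothetical, not the Zend_Cache API).
class ToyTagCache
{
    private $entries = array(); // id => array('data' => ..., 'tags' => array(...))

    public function save($data, $id, array $tags = array())
    {
        // The ID alone is the key: saving again with the same ID overwrites.
        $this->entries[$id] = array('data' => $data, 'tags' => $tags);
    }

    public function load($id)
    {
        // Tags play no part in the lookup.
        return isset($this->entries[$id]) ? $this->entries[$id]['data'] : false;
    }

    public function cleanMatchingAllTags(array $tags)
    {
        // Remove every entry that carries all of the given tags.
        foreach ($this->entries as $id => $entry) {
            if (count(array_diff($tags, $entry['tags'])) === 0) {
                unset($this->entries[$id]);
            }
        }
    }
}

$cache = new ToyTagCache();

// Composite IDs keep the two views of the same album apart.
$cache->save('requests page', 'requests_artist_123_album_123_en_GB',
    array('requests', 'artist_123', 'album_123', 'en_GB', 'albums'));
$cache->save('discography page', 'discography_artist_123_album_123_en_GB',
    array('discography', 'artist_123', 'album_123', 'en_GB', 'albums'));

// Clear only the album pages in the requests view.
$cache->cleanMatchingAllTags(array('requests', 'albums'));

var_dump($cache->load('requests_artist_123_album_123_en_GB'));    // bool(false)
var_dump($cache->load('discography_artist_123_album_123_en_GB')); // string(16) "discography page"
```

Putting the view name into the ID is what stops the requests page and the discography page for the same album from overwriting each other.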

Of course, figuring all this out was further complicated by the fact that the examples in the manual often use load() with $key and $tags and save() with no arguments. I assumed therefore that the point of the $tags argument on load() was to set the tags that would be used auto-magically on save(). Only, if you don’t pass $tags to save() it saves with no tags. Which is also silly, since it does respect the $key used in the load() method.


The Wrong Answer to the Right Question?

Saturday, September 29th, 2007

I’m often faced with the need to do a one off crunch of data to provide answers to questions management ask about raw data. Not the kind of thing they’ll be asking for on a regular basis. Just a need to scratch an individual itch. One off reports on specific aspects of code metrics. Calculate some predictions of data growth in the application across different aspects of its user base.

On these occasions, I either turn to Query Analyser to mine our SQL Server databases directly, or use the data import tool in Excel and try and crunch the data swiftly in that. Sometimes, those needs become a long running need to manage some data, where Excel is often the preferred format, because I can do some initial crunching and manage the data in there and the rest of management can then take a copy and further manipulate it and play with it to get additional information as and when it occurs to them they need it.

The problem I face is that Excel is designed for accountants and management types with no programming knowledge to manage spreadsheets of data they understand. It’s too damn user friendly. I find it very hard sometimes to find a good way to manage my data in Excel. I often throw my hands up in despair and lash up a software tool specifically to manage the data. I’m talking about a full on database driven web application in most cases. It’s so much faster for me to work with the data that way, and I can then use the Data Import tool in Excel to shove the data in raw forms into spreadsheets if the people asking for the data want to take it away and play with it.

There has to be a better way. There has to be a more productive way for me to do this. A more developer focussed tool for doing this, that allows you to achieve with scripting/programming what you would achieve in Excel by mucking around with excessively user friendly wizards and obscure dialogue screens.

Jon Udell thinks the answer might look something like Resolver, which is a new spreadsheet application written in Python that allows you to use Python directly in cells and to have full access to .NET and IronPython through the whole application.

This just seems to be the wrong solution to me. With Excel we have a spreadsheet product that non-programmers use and love, and that is so good it’s destroyed all competition. It can be extended by programmers with add-ins and macros. You can write .NET code or VBA code (easy for non-programmers to learn) in the Macros etc. However, the formulae are restricted to the old style “icky” functions. Stick =if(condition,formulae,formulae) in, which just makes programmers recoil from the keyboard in horror.

The right answer to the question is to have a simple option to enable direct access to the .NET runtime in cells. Then people can code formulae in any .NET enabled language they choose, including IronPython.

Do not throw the baby out with the bathwater.


Internationalisation Take 2 - Zend vs Cheap-o Arrays

Saturday, June 23rd, 2007

In response to my emails to the Zend Framework I18N list and my previous post, Thomas, the author of the Zend_Translate framework items mailed back to the list here:

> 2) gettext is a more expensive version of using the arrays backend.

No… it’s a less expensive version. What takes time is reading the original
source. Your processor is always faster than your harddrive.
It is better to do some computations than reading a bigger file. And mo
files are much smaller than the same sized array files.

This still seems wrong to me, so I’ve done a bit more analysis. I have now got XDebug up and running in my portable environment, so I can really see the details of the costs. Now, to caveat all this, I’m running all this from a USB key on a laptop that’s doing a number of other background tasks, so, the performance is not isolated. Due to this I’ll be looking at percentages of time in Wincachegrind, not actual execution times.

Now, to test this I have generated two files. One of which is a .po containing 1000 phrases which I have compiled to a .mo file. The other is a PHP array in a PHP file containing the same 1000 translations. I generated this with a script, the translations are a bit simple:

From the .po file:

msgid "String 0"
msgstr "String Translated 0"

From the .php file:

'String 0' => 'String Translated 0',
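A generator along the following lines would produce both test files. This is a sketch only: the file names and the $LANGUAGE variable name follow the snippets in this post, the output directory and the buildPo() and buildPhpArray() helpers are assumptions, and the .po file still has to be compiled to a .mo file with a gettext tool afterwards.

```php
<?php
// Build the .po catalogue body: one msgid/msgstr pair per phrase.
function buildPo($count)
{
    // Minimal header entry; a real catalogue carries more metadata.
    $po = "msgid \"\"\nmsgstr \"\"\n\"Content-Type: text/plain; charset=utf-8\\n\"\n\n";
    for ($i = 0; $i < $count; $i++) {
        $po .= "msgid \"String $i\"\nmsgstr \"String Translated $i\"\n\n";
    }
    return $po;
}

// Build the PHP file that defines $LANGUAGE with the same phrases.
function buildPhpArray($count)
{
    $php = "<?php\n\$LANGUAGE = array(\n";
    for ($i = 0; $i < $count; $i++) {
        $php .= "    'String $i' => 'String Translated $i',\n";
    }
    return $php . ");\n";
}

// Assumed output location; adjust to wherever the test files should live.
$dir = sys_get_temp_dir();
file_put_contents($dir . '/test.en.po', buildPo(1000));
file_put_contents($dir . '/test.en.php', buildPhpArray(1000));
```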

I have then written a simple PHP file which translates 50 of these items. A reasonable enough test I think. Firstly, to test the translation using the fast Zend_Translate gettext options:

require_once 'Zend/Translate.php';
$i = new Zend_Translate('gettext', '/development/language/test.en.mo', 'en');
 
function _($s)
{
    global $i;
    $s = $i->_($s);
    echo($s."<br/>\n");
}
 
_('String 1');
_('String 2');
...

I then ran this file and loaded the cachegrind output into WinCacheGrind. 87.88% of the execution time was spent in Zend_Translate_Adapter_Gettext->_loadTranslationData. Performing translations took 1.99% of the time.

Next I used my PHP array and the Zend_Translate array backend:

require_once 'Zend/Translate.php';
require_once '/development/language/test.en.php';
$i = new Zend_Translate('array', $LANGUAGE, 'en');

(The rest of the file remains the same). I then ran this and checked the output. loadTranslationData took 78.36% of the time. Performing translations took 2.86% of the time.

My third test was just to use the test.en.php file and a simple translation function:

require_once '/development/language/test.en.php';
function _($s) {
  global $LANGUAGE;
   $s = (array_key_exists($s, $LANGUAGE)) ? $LANGUAGE[$s] : $s;
  return $s;
}

The first thing to note was that the Zend_Translate items took over 20ms each. This one not using Zend at all took 2.8ms. The require_once statement took 1.83% of that time. Then it was just repetition of an un-recordably-fast translation 50 times.

So what do I draw from this? I draw from this that for simple translations, you can’t beat a very, very simple system with just an array of translations. I haven’t looked in any depth at the other services offered by Zend_Translate, but it does allow you to add multiple translations and translate in multiple languages. But, do you have a use-case for that?

If your UI needs to display in a single language, but translate that language, take the simple approach. It needs a little extension to support modular languages, but look at the PHPBB3 implementation and you can’t go far wrong. That loads modular translation files (just to keep that trivial require_once cost down) each of which array_merge’s back into a single translation array which is key’d by constants.

Fast.
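The modular array approach described above can be sketched as follows. The loadLanguageModule() and translate() helpers, the file naming and the keys are all hypothetical, and the module files here return their array rather than merging into a global the way PHPBB3 does. The demo writes its own two module files to a temporary directory so that it runs on its own.

```php
<?php
// Hypothetical loader: merges one module's phrases into the shared array.
// Each module file is assumed to return a plain array of key => translation.
function loadLanguageModule(array &$language, $dir, $module, $locale)
{
    $file = $dir . '/' . $module . '.' . $locale . '.php';
    if (is_file($file)) {
        $phrases = include $file;
        $language = array_merge($language, $phrases);
    }
}

// Translate a key, falling back to the key itself when no phrase exists.
function translate(array $language, $key)
{
    return isset($language[$key]) ? $language[$key] : $key;
}

// Demo set-up only: write two tiny module files into a temp directory.
$dir = sys_get_temp_dir() . '/lang_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/core.en.php',
    "<?php return array('SAVE' => 'Save', 'CANCEL' => 'Cancel');\n");
file_put_contents($dir . '/albums.en.php',
    "<?php return array('ALBUM_TITLE' => 'Album title');\n");

$LANGUAGE = array();
loadLanguageModule($LANGUAGE, $dir, 'core', 'en');   // loaded on every page
loadLanguageModule($LANGUAGE, $dir, 'albums', 'en'); // loaded only where needed

echo translate($LANGUAGE, 'SAVE'), "\n";        // Save
echo translate($LANGUAGE, 'ALBUM_TITLE'), "\n"; // Album title
echo translate($LANGUAGE, 'UNKNOWN_KEY'), "\n"; // UNKNOWN_KEY
```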

My cachegrind files for your reading pleasure:
Zend_Translate - Array
Zend_Translate - Gettext
Non-Zend_Translate


Profiling PHP With XDebug - Portably!

Saturday, June 23rd, 2007

When you are working on a web site or web application, something that all too often gets little thought at development time is its performance. You need to know where your bottlenecks are and what the costs of each architectural and implementation choice you make are. You should routinely profile your application’s core functions to see how they behave. How do we do this? You could put little timers into the code and log timings to see how things work, or you could use XDebug.
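For comparison, the “little timers” option looks something like this. It is a minimal sketch: the timed() helper and the buildList() workload are made up for illustration and do not come from any library.

```php
<?php
// Hypothetical helper: runs $callback, records how long it took, returns its result.
function timed($label, $callback, array &$log)
{
    $start = microtime(true);                         // wall-clock seconds as a float
    $result = call_user_func($callback);
    $log[$label] = (microtime(true) - $start) * 1000; // elapsed milliseconds
    return $result;
}

// Throwaway workload standing in for a real application routine.
function buildList()
{
    $items = array();
    for ($i = 0; $i < 10000; $i++) {
        $items[] = 'Item ' . $i;
    }
    return $items;
}

$timings = array();
$list = timed('buildList', 'buildList', $timings);

foreach ($timings as $label => $ms) {
    printf("%s: %.3f ms\n", $label, $ms);
}
```

The catch is that timers like this only measure the code you thought to wrap.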

XDebug is a PHP extension that provides a number of critical features to developers. It supports step through debugging (assuming your code editor can hook into it) and it supports profiling of your scripts. This gives you a detailed breakdown of every command executed in your application, how many times it was executed and what it cost.

This is invaluable when tracing your performance work. You can identify exactly which routines are costing you too much. It can give valuable insight into the performance landscape of your software. So I’m going to hook it up into my development environment on its USB key.

Firstly, pop over to the XDebug site and download the relevant version for your PHP version. I downloaded the Windows Binary of 2.0RC4 for PHP 5.2.1+. XDebug is a “Zend Extension”, not a standard PHP Extension: it extends the Zend Engine that powers PHP. This changes how we configure it in php.ini and means it doesn’t have to go in the extensions folder of your PHP install. But I keep it there for consistency. Once it is placed in that folder you need to edit your php.ini. The commands can go anywhere in the file, but I placed them after the PHP Extension loading commands, again for consistency:

zend_extension_ts=/development/php/ext/php_xdebug-2.0.0rc4-5.2.1.dll

Note that this is loaded with the zend_extension_ts command instead of the extension command (the ts denotes Thread Safe mode). Also note that we specify the full path to the extension. The zend_extension_ts command (and the other zend_extension commands) need the full path as they don’t pay attention to our extension directory command.

Once this is done, go to your PHPInfo() test page and check, you should have XDebug information included:

[Screenshot: XDebug Enabled]

Ok so far so good. If you have a page which currently throws a php error, go check it now. You’ll notice that just having XDebug installed gives you much more information. XDebug makes developing easier just by being there.
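If you would rather not eyeball the PHPInfo() page, a tiny script can make the same check. This is a sketch using only the standard extension_loaded() and phpversion() functions; run it through the same PHP instance that serves your pages, since a command line PHP may read a different php.ini.

```php
<?php
// Report whether the XDebug extension is loaded in this PHP instance.
$loaded = extension_loaded('xdebug');
echo 'XDebug loaded: ', ($loaded ? 'yes' : 'no'), "\n";

if ($loaded) {
    // phpversion() with an extension name returns that extension's version.
    echo 'XDebug version: ', phpversion('xdebug'), "\n";
}
```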

Now, we’re mainly going to use it to profile performance of our applications and third party libraries, so we need to enable profiling, this is done with a couple of new entries in php.ini. I placed them just after my command to load the extension so it’s all in one place:

xdebug.profiler_enable = 1
xdebug.profiler_output_dir=/development/

Now restart Apache and hit your PHPInfo() page. In the development folder on your USB Key you will have some files called cachegrind.out.[some number]. This is the profiling information in its raw form and of no use to you on its own.

You need a cachegrind analysis program. I use Wincachegrind as I’m on Windows. You can use this to open up the cachegrind file and see what took what time. Visit your PHPInfo() page, pick up the cachegrind output and take a look. You can see a lot of detail.

[Screenshot: CacheGrind]

I don’t propose to detail how to use Wincachegrind and do a full analysis, a bit of poking should show you what’s going on. But, I’ll be using XDebug and WinCacheGrind to get under the covers of some third party libraries I’m considering for use in the development of Multiblog, so we’ll see more information then.


Internationalisation

Thursday, June 21st, 2007

The web is global.

Lots of websites do not cope with this. They do not provide a user setting for the language and deliver their content in that language.

Clearly, this is bad. If you are producing an application, like Multiblog, then you need to make it international. It needs the UI at least (content is a more thorny issue) to work in the user’s preferred language. Otherwise they will experience friction trying to use the confusing foreign thing.

There are a lot of ways to achieve this. Geeklog and PHPBB3 use arrays to translate content and allow the user to pick things. Drupal uses the GNU GetText system. And there are other approaches.

Choosing the right approach and using it correctly is difficult. I’m currently experimenting with approaches for Multiblog and other projects. I’m currently looking into the very interesting Zend Framework’s Zend_Translate class. This allows a number of different approaches, including both Array and GetText.

GetText appears to be the recommended choice. There are a number of free tools that can generate your translation files, as the translation files are not human readable. It’s fast and threadsafe. The Zend Framework Manual offers some advice on how to structure your translation files. There are several suggested methods, but, there is no suggestion of how to structure your translation modules.

The question I asked was “What’s the best practice?”, and no-one seems to know, so I guess I need to figure it out for myself from basic principles.

Now, GetText was written to provide internationalisation for the GNU software. Including the core of the Linux OS. Here, the GetText file is (I assume) parsed once at start up and held in memory to translate as things go. Web applications are different. Every page view is essentially a new start up. That GetText translation source is going to be loaded hundreds and thousands of times. Not just once on boot of the web server.

So, if we want to get this right, we need to know what our best bet is. Do we want a monolithic all translations file, or do we want to modularise this file and load it as needed? Does it use the file like a database and seek things out, or does it parse the whole thing every time and process it internally?

I’ve done some simple testing. I produced a basic test catalogue with poEdit and compiled a mo file from it:

msgid ""
msgstr ""
"Project-Id-Version: Test Zend GetText\n"
"POT-Creation-Date: \n"
"PO-Revision-Date: 2007-06-21 12:20-0000\n"
"Last-Translator: THEMike \n"
"Language-Team: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Poedit-Language: English\n"
"X-Poedit-Country: UNITED KINGDOM\n"
"X-Poedit-SourceCharset: utf-8\n"

msgid "This is a test."
msgstr "[Translated]This is a test.[/Translated]"

I then wrote a simple test harness PHP file which loads a Zend_Translate using gettext and translates a single line. Before performing a translation, I var_dump the Zend_Translate instance to see what’s in it:

<?php
/* Configuration: */
define('PATH_TO_ENGINE', '/development/engine/');
define('PATH_TO_LANGUAGE', '/development/language/');

/* Put engine on the include path */
$curPHPIncludePath = ini_get('include_path');
if (defined('PATH_SEPARATOR')) {
    $separator = PATH_SEPARATOR;
} else {
    // prior to PHP 4.3.0, we have to guess the correct separator ...
    $separator = ';';
    if (strpos($curPHPIncludePath, $separator) === false) {
        $separator = ':';
    }
}
if (ini_set('include_path', PATH_TO_ENGINE . $separator . $curPHPIncludePath) === false) {
    die('Buggered');
}

/* Load the gettext catalogue and dump the object before translating */
require_once 'Zend/Translate.php';
$t = new Zend_Translate('gettext', PATH_TO_LANGUAGE . 'test.en.mo', 'en');
echo('<pre>');
var_dump($t);
echo("</pre><hr/>\n");
echo($t->_('This is a test.'));
?>

The result of the var_dump being:

object(Zend_Translate)#1 (1) {
  ["_adapter:private"]=>
  object(Zend_Translate_Adapter_Gettext)#2 (6) {
    ["_bigEndian:private"]=>
    bool(false)
    ["_file:private"]=>
    resource(21) of type (stream)
    ["_locale:protected"]=>
    string(2) "en"
    ["_languages:protected"]=>
    array(1) {
      ["en"]=>
      string(2) "en"
    }
    ["_options:protected"]=>
    array(1) {
      ["clear"]=>
      bool(false)
    }
    ["_translate:protected"]=>
    array(1) {
      ["en"]=>
      array(2) {
        [""]=>
        string(339) "Project-Id-Version: Test Zend GetText
POT-Creation-Date:
PO-Revision-Date: 2007-06-21 12:20-0000
Last-Translator: THEMike
Language-Team:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Poedit-Language: English
X-Poedit-Country: UNITED KINGDOM
X-Poedit-SourceCharset: utf-8
"
        ["This is a test."]=>
        string(40) "[Translated]This is a test.[/Translated]"
      }
    }
  }
}

As you can see, before I’ve even made a translate call, the entire mo translation catalogue has been loaded into memory and parsed internally to form a PHP array, which is then used for translation.
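To give a feel for what that loading step involves, here is a rough sketch of turning a compiled catalogue into a PHP array. It is not Zend_Translate’s actual code: parseMoCatalogue() is a name made up for illustration, and it only assumes the basic layout of the GNU mo format (a magic number, a string count, and two index tables of length/offset pairs), ignoring plural forms and the hash table.

```php
<?php
/**
 * Parse the raw bytes of a compiled gettext (.mo) catalogue into an
 * array of msgid => msgstr. Illustrative sketch only.
 */
function parseMoCatalogue($bytes)
{
    // The magic number tells us which byte order the file was written in.
    $magic = substr($bytes, 0, 4);
    if ($magic === "\xde\x12\x04\x95") {
        $int = 'V'; // 32-bit little-endian
    } elseif ($magic === "\x95\x04\x12\xde") {
        $int = 'N'; // 32-bit big-endian
    } else {
        throw new InvalidArgumentException('Not a gettext .mo catalogue');
    }

    // Header fields after the magic number: revision, number of strings,
    // offset of the originals table, offset of the translations table.
    $header = unpack(
        "{$int}revision/{$int}count/{$int}originals/{$int}translations",
        substr($bytes, 4, 16)
    );

    // Every string is read on every load, whether or not it is ever used.
    $catalogue = array();
    for ($i = 0; $i < $header['count']; $i++) {
        $o = unpack("{$int}length/{$int}offset", substr($bytes, $header['originals'] + 8 * $i, 8));
        $t = unpack("{$int}length/{$int}offset", substr($bytes, $header['translations'] + 8 * $i, 8));

        $msgid  = substr($bytes, $o['offset'], $o['length']);
        $msgstr = substr($bytes, $t['offset'], $t['length']);
        $catalogue[$msgid] = $msgstr;
    }

    return $catalogue;
}
```

Calling parseMoCatalogue(file_get_contents(PATH_TO_LANGUAGE . 'test.en.mo')) on the test catalogue should give an array shaped like the one under ["en"] in the dump above. The loop touches every string in the file, so the cost per page view grows with the size of the catalogue.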

Clearly, this argues for a very modular translation system. I would want a core.lang.mo file for “common” translations used throughout the application, and then a controller.lang.mo file for each controller, holding that controller’s specific phrases and loaded only by that controller.
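As a minimal sketch of that layout, the helper below works out which catalogue files a given request would need. The function name, the “news” controller and the module.lang.mo naming are all assumptions for illustration, following the test.en.mo file used above.

```php
<?php
/**
 * List the catalogue files a request should load: the shared "core"
 * catalogue first, then the one belonging to the current controller.
 * File names follow a hypothetical <module>.<lang>.mo convention.
 */
function catalogueFilesFor($languageDir, $controller, $lang)
{
    $dir = rtrim($languageDir, '/') . '/';

    // Common phrases are needed on every page.
    $files = array($dir . 'core.' . $lang . '.mo');

    // Controller-specific phrases are only loaded for that controller.
    if ($controller !== 'core') {
        $files[] = $dir . $controller . '.' . $lang . '.mo';
    }

    return $files;
}

// Example: a request handled by a hypothetical "news" controller.
$files = catalogueFilesFor('/development/language/', 'news', 'en');
```

Each path would then be handed to the translator in turn. Zend_Translate has an addTranslation() method intended for this, but check its signature against the version of the framework you are running.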

However, note that the catalogue is converted into a PHP array before any translating happens. Essentially, it seems the gettext translator is really a front-end loader for the array translator. So why not use the array translator?

The only downside I can see is that it’s harder to get non-programmers to generate valid PHP arrays when supplying your translation. Other than that, anything that PHP does to optimise compilation and processing of PHP code will kick in and should give you significantly improved performance. Put extra things on top of that like the Zend Optimisers and so forth, and you have a compelling reason to use highly modular array-based translation.

The problem then remains getting valid PHP array files back from your translators, and frankly, that can be solved by writing a simple front end for your translators so they have a GUI to use.
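The back end of such a tool can be very small. The sketch below assumes the front end has already collected the phrases as msgid => msgstr pairs; writeTranslationFile() and translate() are hypothetical names, not part of Zend Framework, and the plain include stands in for whatever loader you actually use.

```php
<?php
/**
 * Write msgid => msgstr pairs out as a PHP file that returns an array.
 * var_export() takes care of quoting, so translators never see PHP syntax.
 */
function writeTranslationFile($path, array $translations)
{
    ksort($translations); // stable ordering keeps diffs between versions readable
    $code = "<?php\nreturn " . var_export($translations, true) . ";\n";

    return file_put_contents($path, $code) !== false;
}

/**
 * Look a phrase up, falling back to the original text when it is missing.
 */
function translate(array $translations, $msgid)
{
    return isset($translations[$msgid]) ? $translations[$msgid] : $msgid;
}

// Round trip: save what the translator typed in, then load it back
// like any other PHP file.
$path = tempnam(sys_get_temp_dir(), 'lang');
writeTranslationFile($path, array(
    'This is a test.' => '[Translated]This is a test.[/Translated]',
));
$translations = include $path;
unlink($path);

echo translate($translations, 'This is a test.'), "\n";
```

Because the generated file is ordinary PHP, it gets whatever compilation and caching the rest of your code gets, and a phrase that is missing simply falls back to the original text.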

Like Losing a Limb

Thursday, June 21st, 2007

My internet connection at home is gone. I’m migrating ADSL ISPs and it appears my old ISP has cut me off and my new ISP is not quite ready for me. “Due to demand, expect to wait three business days beyond your activation date”. Last night I tried to write some code.

Now, coding at home is always painful. At work I have dual monitors and a reasonably powerful box with plenty of RAM (plenty for the coding I do, which is very little and mostly scripting; for the real programming that my team does, it’s not really powerful enough. But since I’ve wandered up to manage them, it doesn’t matter that my box sucks).

At home, however, I’ve got a four-year-old laptop with a 15″ screen that cost £300 when I bought it. At home I’m doing the same kind of scripting, but it’s just less productive due to the lack of raw power and screen real estate.

And now I have no internet.

I can’t remember when exactly I made the transition from referencing books (“In a nutshell” or “Pocket reference” books when I needed something, or checking a “Cookbook” for a recipe, or perhaps even using the MSDN CD) to relying 100% on Google and the wider internet. But it seems I’ve done so. And losing the internet when I’m coding is like losing a limb.

It’s easy enough to deal with not having the internet to surf and in fact distract me. But not having it to bail me out when I’m stuck is terrible. Jeff Atwood just said, entirely in passing:

I can barely program these days without an active internet connection; I feel crippled when I’m not networked into the vast hive mind of programming knowledge on the internet.

And he’s right. It’s horrible. His main point was about coding with other people so that you can bounce off each other etc, which is also important. But the lack of the internet as the solution to all your dead-ends is even worse.