Archive for the 'Opinion' Category

It’s About Breadth

Thursday, January 31st, 2008

I was reading a blog entry about hot technology in Java over at Manageability.org. The second paragraph in the entry slapped me out of my non-blogging frenzy with its wrongness.

It’s been suggested that Polyglot programming be in the list. Even though I do subscribe to the notion that learning other languages are beneficial to one’s craft, it simply is not pragmatic advice. It is not practical to recommend that someone study Ruby, Groovy, Scala and who knows what other language is vying for your attention. Stick to a couple of languages and do it well. Some languages are better than others for certain tasks. However the biggest fallacy of all is that, a dynamic language is not considerably better than a static one. It’s no magic bullet.

The main thing to note is that the paragraph is not completely wrong. The author does subscribe to “the notion that learning other languages are beneficial to one’s craft”, but unfortunately caveats that by saying it’s just not pragmatic or practical, and that the reader should stick to a “couple” of languages.

I strongly disagree.

As a professional programmer, in your day job you should code in one technology set. Note I say technology set, not language. For some that may be one language. For others that may be several. For me, the last time I was a hands-on programmer as my day job, that was JavaScript, CSS, XHTML, ASP (VBScript), Visual Basic 6 and T-SQL.

You should strive for a deep and complete understanding of that technology set. This should clearly start with basic competency in the subset of those technologies that relates to the product/project you are working on. But you should deepen and broaden this understanding as fast and as well as humanly possible.

You should know how to do anything that VB6 can do, not just what falls within the context of your web development. That means learning aspects of VB6 programming that can never be used in a web context, but along the way you will learn many things you would not otherwise have learnt. Some you can use directly in a web context; others simply improve your approach to problem solving, design and development, and give you a fresh perspective on the language.

The next step from here is to take that solid grounding in your primary weapon and mature and expand it with exposure to other languages and technologies.

The development communities around each language are akin to separate nations. Sometimes diplomatic channels are open and citizens freely move between the nations. Other times there is open hostility. Each nation has its own way of life. There is always some common ground between all languages, but they often have vastly differing ways of approaching a common problem. Continual exposure to these different languages opens you up to more ways to solve the problems you are faced with. You will be able to deal with a vastly wider range of problems as a result.

And this is a critical skill to develop.

Do not let yourself become an island nation. Have that deep mastery of a key language/technology set and use it daily, but make sure you are also constantly looking around for other languages and ideas to broaden your understanding of your craft. Travel widely. Use the languages in anger, for “real” development, to really understand the different pain points. Ruby may solve one pain for you at the cost of other, deeper pains.

Without you, and people like you, doing this language tourism, building this breadth, there won’t be a new top 5 interesting technologies in [whatever language] in 2009, because the [whatever language] community will stagnate as it examines its own navel.

The Wrong Answer to the Right Question?

Saturday, September 29th, 2007

I’m often faced with the need to do a one-off crunch of data to answer questions management asks about raw data. Not the kind of thing they’ll be asking for on a regular basis. Just a need to scratch an individual itch. One-off reports on specific aspects of code metrics. Predictions of data growth in the application across different aspects of its user base.

On these occasions, I either turn to Query Analyser to mine our SQL Server databases directly, or use the data import tool in Excel and try to crunch the data swiftly in that. Sometimes those needs turn into a long-running need to manage some data. Excel is often the preferred format for that, because I can do some initial crunching and manage the data in there, and the rest of management can then take a copy and manipulate it further to get additional information as and when it occurs to them that they need it.

The problem I face is that Excel is designed for accountants and management types with no programming knowledge to manage spreadsheets of data they understand. It’s too damn user friendly. I sometimes find it very hard to find a good way to manage my data in Excel. I often throw my hands up in despair and lash up a software tool specifically to manage the data. I’m talking about a full-on database-driven web application in most cases. It’s so much faster for me to work with the data that way, and I can then use the Data Import tool in Excel to shove the data in raw form into spreadsheets if the people asking for the data want to take it away and play with it.

There has to be a better way. There has to be a more productive way for me to do this. A more developer focussed tool for doing this, that allows you to achieve with scripting/programming what you would achieve in Excel by mucking around with excessively user friendly wizards and obscure dialogue screens.

Jon Udell thinks the answer might look something like Resolver, which is a new spreadsheet application written in Python that allows you to use Python directly in cells and to have full access to .NET and IronPython through the whole application.

This just seems to be the wrong solution to me. In Excel we have a spreadsheet product that non-programmers use and love, and that is so good it has destroyed all competition. It can be extended by programmers with add-ins and macros. You can write macros in .NET code or in VBA (easy for non-programmers to learn). However, the formulae are restricted to the old-style “icky” functions. Stick =IF(condition, formula, formula) in a cell and programmers recoil from the keyboard in horror.

The right answer to the question is to have a simple option to enable direct access to the .NET runtime in cells. Then people can code formulae in any .NET enabled language they choose, including IronPython.

Do not throw the baby out with the bathwater.

Contents of Your Résumé II: Conflicts

Wednesday, September 26th, 2007

Only the other week I said that there is no correct answer to the question of what should go in your résumé. Now Steve Yegge has chimed in with his article Ten Tips for a (Slightly) Less Awful Resume.

Steve starts out, much like I did, by pointing out that what he says only relates to recruiting people for the kinds of places he works. It may not apply outside the recruitment of technical people for companies that produce their own software, and even within that niche it may not apply to everyone:

I’m just talking about software engineer resumes today, and specifically just the subset intended for applying to companies that build their own software. I have no idea how much (if at all) this stuff applies to resumes for other kinds of positions, or companies. Maybe not much. Sorry!

It seems the companies he’s worked for take a different approach to reviewing CVs to any company I’ve ever worked for. It seems they have a large CV screening panel, phone screens (possibly multiples) and multiple panel interview sessions. We do it differently.

Steve’s first point is right: in bullet form, no-one cares about you yet. But he also suggests stripping out anything with personality in it, which I disagree with strongly. I do want to know what you’re like other than as a professional software engineer. I recently came across a candidate who was a Reiki healer and psychic reader, and who offered his services for £25 an hour over MSN or email. This information was gleaned from the “personal interests” on his CV and the personal webpage address he listed. Valuable information, saving me from interviewing a total fruit-cake.

Steve’s point 2 asks for unformatted ASCII CVs. This is because where he works they pump them through automated manglers. That is not true anywhere I’ve applied for a job or worked in my life. It’s true that Pimps screw them up, something I was planning on writing about soon, but in general I’d prefer to see a nicely formatted Word doc that has had some care and attention lavished on it than a nasty plain .txt file.

Points 3, 4 and 5 have merit. Don’t set off my bullshit detector with too much weasel and wank talk. Make sure it’s spelling and grammar checked. If it’s a Word doc, and a UK install of Word has put a red squiggle under anything other than an obscure techie term, you’re a moron. But don’t rely on the spell checker; I’ve seen far too many CVs with spell-checking mistakes.

However, my biggest contention is with point 6, the certified loser. If you are a contractor, get certifications and do them in your own time. It shows your skill and gives me more confidence that you have at least a basic level of understanding of the subject and technology. A safety net. It’s also quite common for UK companies to put people through certification courses to ensure their staff have that rounded skillset, that they know the right way of doing things, and to give the employee something back to move their career on. Certification is great.

But look at just those few things: the certifications, the format of the CV, the lack of personal information. Submit a “perfect for Steve” CV to me and I’ll reject you. Submit a “perfect for me” CV to Steve and he’ll reject you. It’s impossible to get this right. The single best tip in Steve’s article?

You can apply for 18 jobs, but you should send 18 different resumes, each targeted at that job, and you shouldn’t send them all at once.

Tailor it for the job. It’s web development? Emphasise your experience in that area, and trim down the references to the Linux device driver work you’ve done. And try to get a feel for the company you’re applying to. If it’s a Big One, like IBM, Microsoft, Amazon, Google etc., there is plenty you can find out about what the recruiters are looking for. Talk to the Pimp, ask them what kind of CV the recruiting managers like to see.

Do your research first, it really helps.

Contents of Your Résumé

Saturday, September 15th, 2007

A very popular question people ask is “what should I put in my C.V.?” The problem is, there is no correct answer to that question. There are a few consistent formats to the documents I see, but no set pattern I can figure out behind what has caused people to use a given format.

Two of the guys I’ve worked with in the past, who have equally successful careers on similar tracks and are of similar age and position in their careers, have totally different approaches. Clive goes for the “everything in a good level of detail” approach. Pete’s CV is two sides, and he thinks it needs trimming down.

I think Pete’s CV comes from the approach that it seems a number of managers take, they want to get the details quickly and don’t want to wade their way through a load of self-promoting rubbish to get to them. Clive’s approach comes from the desire to give all the information so that every base is covered and people won’t miss any of your brilliant skills.

Personally, as a recruiting manager, I like to see a good level of detail on a CV. I want enough to really know whether it’s worth seeing the candidate or not. I want to get a feel for the candidate in advance. I want a chance for them to make mistakes.

But not all managers want that. They follow the Pimp Path, they want to find a few keywords, such as 10+ Years C++ experience on Windows Drivers. Or whatever. If they can’t get this information, they’ll reject the C.V.

The big problem is you don’t know what kind of manager your C.V. is going to.

I’m starting to come down in favour of a hybrid approach. The first page of the C.V. should be a totally lightweight highlights and basic facts page. With detail to follow.

Something that is also important to understand is what information you do need on the C.V., what order and level of detail to present it, and unfortunately that changes based on where you are in your career. For a graduate, clearly education history should come first, it’s the most relevant information. For an experienced engineer, the education history gradually becomes less and less important. But I still feel it’s necessary to include that information.

I think a front page should probably detail personal details, in a small contained area, current role and a brief statement of what you’re looking to achieve in your career. If there is room, a brief summary of key technical skills. Brief please, and the key ones. Not an exhaustive list of everything you’ve ever worked with.

Then, follow this with a career history, education history and some personal stuff.

People are rather conflicted about that latter part too. Some people think a bit of “personal interests” adds something to a CV, others think it’s a no-no. Which seems odd, because technical skills are not everything; you also need to know if that person is going to fit in with your team and company. Are they going to have the right attitude to work? All of that can be indicated, very roughly, by the information given there.

So, summary, and then a set of detail. Attempting to keep as many people happy as possible. Really, an example would help. But I don’t have one to hand. And this is the worst article I’ve written ever. I’ll try and come back to it when I’m more alert.

Internationalisation Take 2 - Zend vs Cheap-o Arrays

Saturday, June 23rd, 2007

In response to my emails to the Zend Framework I18N list and my previous post, Thomas, the author of the Zend_Translate framework items mailed back to the list here:

> 2) gettext is a more expensive version of using the arrays backend.

No… it’s a less expensive version. What takes time is reading the original
source. Your processor is always faster than your harddrive.
It is better to do some computations than reading a bigger file. And mo
files are much smaller than the same sized array files.

This still seems wrong to me, so I’ve done a bit more analysis. I have now got XDebug up and running in my portable environment, so I can really see the details of the costs. Now, to caveat all this, I’m running it from a USB key on a laptop that’s doing a number of other background tasks, so the performance is not isolated. Because of this I’ll be looking at percentages of time in WinCacheGrind, not actual execution times.

Now, to test this I have generated two files. One of which is a .po containing 1000 phrases which I have compiled to a .mo file. The other is a PHP array in a PHP file containing the same 1000 translations. I generated this with a script, the translations are a bit simple:

From the .po file:

msgid "String 0"
msgstr "String Translated 0"

From the .php file:

'String 0' => 'String Translated 0',

I have then written a simple PHP file which translates 50 of these items. A reasonable enough test I think. Firstly, to test the translation using the fast Zend_Translate gettext options:

require_once 'Zend/Translate.php';
$i = new Zend_Translate('gettext', '/development/language/test.en.mo', 'en');
 
function _($s)
{
    global $i;
    $s = $i->_($s);
    echo($s."<br/>\n");
}
 
_('String 1');
_('String 2');
...

I then ran this file and loaded the cachegrind output into WinCacheGrind. 87.88% of the execution time was spent in Zend_Translate_Adapter_Gettext->_loadTranslationData. Performing translations took 1.99% of the time.

Next I used my PHP array and the Zend_Translate array backend:

require_once 'Zend/Translate.php';
require_once '/development/language/test.en.php';
$i = new Zend_Translate('array', $LANGUAGE, 'en');

(The rest of the file remains the same). I then ran this and checked the output. loadTranslationData took 78.36% of the time. Performing translations took 2.86% of the time.

My third test was just to use the test.en.php file and a simple translation function:

require_once '/development/language/test.en.php';
function _($s) {
  global $LANGUAGE;
  $s = (array_key_exists($s, $LANGUAGE)) ? $LANGUAGE[$s] : $s;
  return $s;
}

The first thing to note was that the Zend_Translate items took over 20ms each. This one not using Zend at all took 2.8ms. The require_once statement took 1.83% of that time. Then it was just repetition of an un-recordably-fast translation 50 times.

So what do I draw from this? I draw from this that for simple translations, you can’t beat a very, very simple system with just an array of translations. I haven’t looked in any depth at the other services offered by Zend_Translate, but it does allow you to add multiple translations and translate in multiple languages. But, do you have a use-case for that?

If your UI needs to display in a single language, but translate that language, take the simple approach. It needs a little extension to support modular languages, but look at the phpBB3 implementation and you can’t go far wrong. That loads modular translation files (just to keep that trivial require_once cost down), each of which is merged with array_merge back into a single translation array keyed by constants.

Fast.
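
A rough sketch of that modular approach, for illustration only. The function names load_language_module and translate are made up for this example (they are not from phpBB3 or Zend_Translate), the arrays are inlined rather than loaded from files, and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helpers for illustration; not part of phpBB3 or Zend_Translate.

/**
 * Merge one module's translations into the shared table.
 * With string keys, array_merge lets later modules override earlier ones.
 * (Keys that look like integers would be renumbered, so keep keys textual.)
 */
function load_language_module(array $table, array $module): array
{
    return array_merge($table, $module);
}

/** Look up a phrase, falling back to the key itself when untranslated. */
function translate(array $table, string $key): string
{
    return array_key_exists($key, $table) ? $table[$key] : $key;
}

// In a real application each array would come from its own PHP file,
// loaded only by the code that needs it. Inlined here to stay runnable.
$core = ['String 0' => 'String Translated 0'];
$blog = ['Post entry' => 'Post entry (translated)'];

$LANGUAGE = [];
$LANGUAGE = load_language_module($LANGUAGE, $core);
$LANGUAGE = load_language_module($LANGUAGE, $blog);

echo translate($LANGUAGE, 'String 0'), "\n";
echo translate($LANGUAGE, 'Missing'), "\n";
```

Falling back to the key means an untranslated phrase still renders as something readable instead of an error.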

My cachegrind files for your reading pleasure:
Zend_Translate - Array
Zend_Translate - Gettext
Non-Zend_Translate

Internationalisation

Thursday, June 21st, 2007

The web is global.

Lots of websites do not cope with this. They do not provide a user setting for the language and deliver their content in that language.

Clearly, this is bad. If you are producing an application, like Multiblog, then you need to make it international. The UI at least (content is a more thorny issue) needs to work in the user’s preferred language. Otherwise they will experience friction trying to use the confusing foreign thing.

There are a lot of ways to achieve this. Geeklog and PHPBB3 use arrays to translate content and allow the user to pick things. Drupal uses the GNU GetText system. And there are other approaches.

Choosing the right approach and using it correctly is difficult. I’m currently experimenting with approaches for Multiblog and other projects. I’m currently looking into the very interesting Zend Framework’s Zend_Translate class. This allows a number of different approaches, including both Array and GetText.

GetText appears to be the recommended choice. There are a number of free tools that can generate your translation files, as the translation files are not human readable. It’s fast and threadsafe. The Zend Framework Manual offers some advice on how to structure your translation files. There are several suggested methods, but, there is no suggestion of how to structure your translation modules.

The question I asked was “What’s the best practice?”, and no-one seems to know, so I guess I need to figure it out for myself from basic principles.

Now, GetText was written to provide internationalisation for the GNU software. Including the core of the Linux OS. Here, the GetText file is (I assume) parsed once at start up and held in memory to translate as things go. Web applications are different. Every page view is essentially a new start up. That GetText translation source is going to be loaded hundreds and thousands of times. Not just once on boot of the web server.

So, if we want to get this right, we need to know what our best bet is. Do we want a monolithic all translations file, or do we want to modularise this file and load it as needed? Does it use the file like a database and seek things out, or does it parse the whole thing every time and process it internally?

I’ve done some simple testing. I produced a basic test catalogue with poEdit and compiled a mo file from it:

msgid ""
msgstr ""
"Project-Id-Version: Test Zend GetText\n"
"POT-Creation-Date: \n"
"PO-Revision-Date: 2007-06-21 12:20-0000\n"
"Last-Translator: THEMike \n"
"Language-Team: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Poedit-Language: English\n"
"X-Poedit-Country: UNITED KINGDOM\n"
"X-Poedit-SourceCharset: utf-8\n"
msgid "This is a test."
msgstr "[Translated]This is a test.[/Translated]"

I then wrote a simple test harness PHP file which loads a Zend_Translate using gettext and translates a single line. Before performing a translation, I var_dump the Zend_Translate instance to see what’s in it:

<?php
/* Configuration: */
define('PATH_TO_ENGINE', '/development/engine/');
define('PATH_TO_LANGUAGE', '/development/language/');

/* Put engine on the include path */
$curPHPIncludePath = ini_get('include_path');
if (defined('PATH_SEPARATOR')) {
    $separator = PATH_SEPARATOR;
} else {
    // prior to PHP 4.3.0, we have to guess the correct separator ...
    $separator = ';';
    if (strpos($curPHPIncludePath, $separator) === false) {
        $separator = ':';
    }
}
if (ini_set('include_path', PATH_TO_ENGINE . $separator . $curPHPIncludePath) === false) {
    die('Buggered');
}
require_once 'Zend/Translate.php';
$t = new Zend_Translate('gettext', PATH_TO_LANGUAGE . 'test.en.mo', 'en');

echo('<pre>');
var_dump($t);
echo("</pre><hr/>\n");
echo($t->_('This is a test.'));
?>

The result of the var_dump being:

object(Zend_Translate)#1 (1) {
  ["_adapter:private"]=>
  object(Zend_Translate_Adapter_Gettext)#2 (6) {
    ["_bigEndian:private"]=>
    bool(false)
    ["_file:private"]=>
    resource(21) of type (stream)
    ["_locale:protected"]=>
    string(2) "en"
    ["_languages:protected"]=>
    array(1) {
      ["en"]=>
      string(2) "en"
    }
    ["_options:protected"]=>
    array(1) {
      ["clear"]=>
      bool(false)
   }
    ["_translate:protected"]=>
    array(1) {
      ["en"]=>
      array(2) {
        [""]=>
        string(339) "Project-Id-Version: Test Zend GetText
POT-Creation-Date:
PO-Revision-Date: 2007-06-21 12:20-0000
Last-Translator: THEMike
Language-Team:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Poedit-Language: English
X-Poedit-Country: UNITED KINGDOM
X-Poedit-SourceCharset: utf-8
"
        ["This is a test."]=>
        string(40) "[Translated]This is a test.[/Translated]"
      }
    }
  }
}

As you can see, before I’ve even called a translate call, the entire mo translation catalogue has been loaded into memory and parsed internally to form a PHP array. Which is then used for translation.

Clearly, this argues for a very modular set of translation files. I would want a core.lang.mo file for “common” translations used throughout the application, and then a controller.lang.mo file for each controller, holding that controller’s specific phrases and loaded only by that controller.

However, note that the translation is done to a PHP array. Essentially, it seems the gettext translator is really a front-end loader of the array translator. So why not use the array translator?

The only downside I can see is that it’s harder to get non-programmers to generate valid PHP arrays when supplying your translation. Other than that, anything that the PHP engine and its extensions do to optimise compilation and processing of PHP code will kick in and should give you significantly improved performance. Put extra things on top of that, like the Zend Optimisers and so forth, and you have a compelling reason to use highly modular array-based translation.

The problem then remains getting valid PHP array files back from your translators, and frankly, that can be solved by writing a simple front end for your translators so they have a GUI to use.
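
As a minimal sketch of what the back end of such a front end might do, the built-in var_export() can turn whatever the translator typed into valid PHP source, quoting and escaping included. The function name render_language_file and the sample data are invented for this example, and the $LANGUAGE variable name simply follows the test files used in these posts:

```php
<?php
/**
 * Render a translations array as the source of a PHP file that
 * defines $LANGUAGE. var_export() takes care of quoting and escaping,
 * so translators never have to write PHP by hand.
 * (Hypothetical helper name, for illustration.)
 */
function render_language_file(array $translations): string
{
    return "<?php\n\$LANGUAGE = " . var_export($translations, true) . ";\n";
}

// Pretend this came from a translator's form submission.
// The apostrophe is deliberate: it must be escaped in the output.
$submitted = [
    'String 0' => "Translator's String 0",
];

$source = render_language_file($submitted);
echo $source;
// Writing $source to disk (e.g. with file_put_contents) is left to the caller.
```

The generated file can then be pulled in with require_once exactly like a hand-written array file.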

Like Losing a Limb

Thursday, June 21st, 2007

My internet connection at home is gone. I’m migrating ADSL ISPs and it appears my old ISP has cut me off and my new ISP is not quite ready for me. “Due to demand, expect to wait three business days beyond your activation date”. Last night I tried to write some code.

Now, coding at home is always painful. At work I have dual monitors and a reasonably powerful box with plenty of RAM (for the coding I do, which is very little and mostly scripting; for the real programming that my team do, it’s not really powerful enough. But since I’ve wandered up to manage them, it doesn’t matter that my box sucks).

At home, however, I’ve got a 4 year old laptop that cost £300 when I bought it. With a 15″ screen. At home I’m doing the same kind of scripting, but, it’s just less productive due to raw power of machine and lack of screen real estate.

And now I have no internet.

I can’t remember when exactly I made the transition from referencing books (“In a Nutshell” or “Pocket Reference” books when I needed something, or checking a “Cookbook” for a recipe, or perhaps even using the MSDN CD) to relying 100% on Google and the wider internet. But it seems I’ve done so. And losing the internet when I’m coding is like losing a limb.

It’s easy enough to deal with not having the internet to surf and in fact distract me. But not having it to bail me out when I’m stuck is terrible. Jeff Atwood just said, entirely in passing:

I can barely program these days without an active internet connection; I feel crippled when I’m not networked into the vast hive mind of programming knowledge on the internet.

And he’s right. It’s horrible. His main point was about coding with other people so that you can bounce off each other etc, which is also important. But the lack of the internet as the solution to all your dead-ends is even worse.

Ruby Isn’t It Great!

Monday, June 18th, 2007

I’ve been keeping an eye on all the Ruby and Ruby on Rails hype. Thinking, wow, how exciting, I wish I had time to have a dabble. I really do want to get an environment sorted out and have a play, so I opened this article with great excitement and started to read. The excitement soon waned.

Ruby removes unnecessary cruft: (){};

  • Parenthesis on method calls are optional; use print "hi".
  • Semicolons aren’t needed after each line (crazy, I know).
  • Use “if then else end” rather than braces.
  • Parens aren’t needed around the conditions in if-then statements.
  • Methods automatically return the last line (call return explicitly if needed)

Ruby scraps the annoying, ubiquitous punctuation that distracts from the program logic. Why put parens ((around),(everything))? Again, if you want parens, put ‘em in there. But you’ll take off the training wheels soon enough.

The line noise (er, “punctuation”) we use in C and Java is for the compiler’s benefit, not ours. Be warned: after weeks with Ruby, other languages become a bit painful to read.

I’m sorry, but none of that is rocket science. I can code in a large number of languages, many of which don’t require a semi-colon to terminate a line. Several of which use if then else end instead of braces. Several don’t need parens around if statement clauses and I’m sure at least one or two return the last statement.

One of the languages that springs to mind is BASIC, and its many flavours. And yet hard-core Java/C/C++ nuts switching to Ruby wouldn’t touch something like BASIC or VBScript, despite it having exactly the features they declare make Ruby great.

Notice the points about pointless cruft and punctuation? Then notice some of the examples that follow on:

dictionary = { :cat => "Goes meow", :dog => "Barks loud."}

That strikes me as pretty crufty. Other bits of syntax are equally illogical:

x = a || b || c || "default"

Frankly, a lot of the reasons I see given for Python or Ruby being so productive and great come down to the fact that it’s just damn fast to script. Something those of us who’ve been scripting for years have known for a long, long time. I can knock something up in any of half a dozen scripting languages really fast, and have been able to since ‘96 when I first got into script languages. But I’ve always been knocked for it by Java developers who think Java is the one true language. It seems that the Java guys and the other compiled-language drones have finally been turned on to scripting, blissfully unaware that they’ve been slagging it off for years while others of us have been doing it.

Evaluating Platform Choices

Saturday, June 16th, 2007

So I’ve reached a point where I know what the application I will write needs to do. The next logical step is to design that application. However, before I do a design, I need to have a “big picture” idea of the software architecture and I need to know what third party code libraries I can use.

As I noted, choosing a language is choosing a platform:

This was something I was planning to move on to talk about in more detail later. When you pick a language, you do need to look around at the choices of libraries available to you. PHP is lucky. It’s mature. There are a lot of libraries out there.

But you must be very careful when confronted by such choice. Take a look at the options available for templating. PEAR (a major source of libraries) has several implementations. There’s Smarty, and numerous other “just templating” libraries. The new Zend Framework also has a templating implementation.

I need to make a very careful set of choices about which libraries to build on. If you choose the wrong items and get them fundamentally embedded into your application, you have a much bigger job to replace them at a later date. Possibly a fundamental re-write.

Now, I’m wanting to make use of standard design patterns to make sure my application architecture is maintainable, controllable, extendible and scalable. So I’m going to be following standard Object Oriented software design using standard design patterns. I’m going to utilise Model View Controller (MVC, or in Microsoft Land Model View Presenter), Database Abstraction Layers, Singleton Patterns and other things.

As noted, PHP is very mature and there are a lot of choices. Some of those choices are more mature than others. Some are big and bloated, some are trim and limited. It’s essential I make the right choices.

I’m going to spend the next few articles examining database abstraction libraries and templating engines. Then move on to look at design pattern libraries to support an MVC implementation. But, first I need a good set of criteria to compare them on. I need to look at the following:

  • Performance
  • API details (will it do what I want flexibly enough, will it lend itself to integrating with other libraries and my own code?)
  • Documentation (will I be able to figure out how to use it?)
  • Is the project alive and moving or is it dead?

Most critical of all to me is performance. I don’t want the PHP engine to have to process thousands of KB of useless code that will never be called by my code, just to get some trivial feature. So I need to find a way to assess this. The first stop on this route is to get some profiling tools installed into my development environment, so that’s what I’ll do next.
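
Before a real profiler is in place, a crude first look is possible with nothing but built-in functions. The sketch below is an illustration only: measure is a hypothetical helper name, and the loop is a stand-in workload where a require_once of the library under test would go. It reports wall-clock time and how many extra files a piece of code pulled in:

```php
<?php
/**
 * Run $task and report wall-clock time plus how many extra files
 * it pulled in through include/require.
 * (Hypothetical helper; a profiler gives far more detail.)
 */
function measure(callable $task): array
{
    $filesBefore = count(get_included_files());
    $start = microtime(true);
    $task();
    return [
        'seconds' => microtime(true) - $start,
        'files_loaded' => count(get_included_files()) - $filesBefore,
    ];
}

$result = measure(function () {
    // Stand-in workload; swap in a require_once of the library being assessed.
    $sum = 0;
    for ($i = 0; $i < 1000; $i++) {
        $sum += $i;
    }
});

printf("%.6f s, %d file(s) loaded\n", $result['seconds'], $result['files_loaded']);
```

The file count is only a proxy for “useless code processed”, but it is cheap to collect and makes a bloated library stand out quickly.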

Login is a Barrier

Saturday, June 9th, 2007

As I noted in my last post, adverts put people off. As does paying for a service. Jeff Atwood points out that registering for a service also puts people off.

If your application requires users to log in, don’t underestimate the impact of the login barrier you’re presenting to users. Consider utilizing anonymous, cookie-based accounts to give users a complete experience that more closely resembles the experience that named users get. By removing the login barrier and blurring the line between anonymous users and named users, you’re likely to gain a lot more of the latter.

Just asking users to login or register scares them away. And he’s right. When I’m looking at new sites online, I don’t want to sign up to find out if I want to sign up. I want to poke around before I create yet another registration, yet another version of my information. Jeff’s post was very timely, I need to take this into account when building Multiblog.

The barrier to entry needs to be so low anyone can get over it. Or no-one will.

However, given its nature, this is going to be difficult with Multiblog. I guess it’s possible to create accounts tied to an anonymous cookie and allow them to create blog accounts on that account. But it’s hardly secure. What I need to do is allow anonymous users to wander round everything, toy with configuring accounts, even draft and hit post on entries. Then I’m afraid I’m going to have to stop them and make them sign up. But this has to be painless.

How do I do this? There are numerous options I could take. One idea is to prompt for an OpenID. The idea behind OpenID is that it’s an open system for authentication: everyone should support OpenID, and then you can allow people to post behind an identity on any site without signing up.

I think that’s great for many applications, such as posting comments on blogs (like this one), but it doesn’t work for services like Multiblog. Even the OpenID site says so:

This is not a trust system. Trust requires identity first.

There has even been spam originating from OpenID sources. So we need something else.

I’m going to have to go with an infrequently used option. I’m going to prompt for an email address. That’s it. This isn’t a site with profiles, so there is no need for usernames or anything else. When you first start using Multiblog I just need a globally unique way of referencing people. And that’s their email address. Couple that with a random password and then you have a full sign-up and authentication system. On entry of the email I’ll send an activation password. On logging in with that password, I’ll set everything up and have the user back at the “are you sure you want to post this entry?” stage. Job done.
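
A minimal sketch of that flow, under some loud assumptions: the function names are invented for this example, an in-memory array stands in for the accounts table, sending the email is left out, and random_bytes() needs PHP 7 or later. It is not Multiblog’s actual code:

```php
<?php
/** Generate a random activation password (hex, so it survives email). */
function make_activation_password(int $bytes = 8): string
{
    return bin2hex(random_bytes($bytes));
}

/**
 * Create a pending account keyed by email. Only a hash is stored.
 * $accounts stands in for a database table (assumption for the sketch).
 * Lower-casing the address is a simplification, not a standards requirement.
 */
function register(array &$accounts, string $email): string
{
    $email = strtolower(trim($email));
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('Invalid email address');
    }
    if (isset($accounts[$email])) {
        throw new RuntimeException('Account already exists');
    }
    $password = make_activation_password();
    $accounts[$email] = [
        'hash' => password_hash($password, PASSWORD_DEFAULT),
        'active' => false,
    ];
    return $password; // the caller emails this to the user
}

/** The first successful login activates the account. */
function login(array &$accounts, string $email, string $password): bool
{
    $email = strtolower(trim($email));
    if (!isset($accounts[$email])
        || !password_verify($password, $accounts[$email]['hash'])) {
        return false;
    }
    $accounts[$email]['active'] = true;
    return true;
}

$accounts = [];
$password = register($accounts, 'user@example.com');
var_dump(login($accounts, 'user@example.com', $password));
```

After the activating login, the application would restore the drafted entry and return the user to the confirmation step.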

As frictionless as it is, it’s a threat. This is predicated on at least one free post to all users. Which is an exploitable hole. People could buy a domain and have an infinite number of email addresses. So, I’ll have to make blog accounts (the accounts Multiblog posts to) be unique, monitor sign-ups (not expecting high volume) and take appropriate other anti-spam/abuse techniques.

But basically, I think it’s a good approach. It will allow users to sign up in seconds and to explore 95% of the application without an account. I hope Jeff likes it more than a vanilla textarea control.
