Archive for the 'Development' Category

Software Design Approach

Saturday, April 14th, 2007

So I’ve decided on a platform for development. PHP/Apache/MySQL. Next we need to think a little bit about how we’re going to build this thing.

Right now, downstairs, I have a builder working on a new extension to my living room and a carpenter fitting a new kitchen. I don’t want to trivialise the problems and issues they face, but, the building industry is thousands of years old. Materials and techniques do change. But not that much. Lighter stronger materials. But, a brick is a brick. And it takes a certain amount of time to build a wall out of them. And the calculations required to ensure it meets requirements are well known and understood.

There is always the uncertainty of weather, and surprises like (random attempt to appear to know something about building…) dry rot to deal with. But generally, there aren’t that many surprises and new challenges in building. You do have some aspects of interpreting the customer’s desires, but blueprints and 3D modelling of what’s being planned in the kitchen sort all that stuff out with a pretty exact level of detail.

Software specification, design and development on the other hand is another thing entirely.

Software is not a science, it’s not even really engineering. Software has been likened to a collaborative sport. I quite like this article I read recently where one parallel drawn is that software development is like rock climbing. This rang very true for me, as I’m a software developer and ex-rock climber.

Techniques and methods are changing all the time. We still haven’t managed to reach the point where we have such a set of tools, techniques and materials that we can get a bunch of people to go through an apprenticeship in purely practical things and follow a basic recipe to turn out a repeatable, on-time delivery of a finished product, like we can with building (troublesome builders who run over, turn up late etc. aside).

Software development (commercial, enterprise grade) takes highly skilled and educated people to deliver the mess we currently deliver. We’re far behind the building industry, which is delivered by (sorry, but it is) the less bright, less able, less educated people in society. Because they can! Because those of us with degrees in software engineering are crap at it and are too busy being crap at writing software. But crap in that certain special way.

Anyway, opinionated rant aside, things are getting better. We’re starting to establish some common, reusable techniques to get software delivered better. There are a whole bunch of generic approaches to development, each one with a myriad of different interpretations.

I’m going to talk about Object Oriented software development, because, in my opinion that is the best approach.

The basic concepts of Object Oriented software design and development are covered well in a number of places. My general preference as the font of all knowledge is usually Wikipedia, and as it’s a software topic, they give Object Orientation a pretty good coverage.

Basically though, in object oriented software, everything should be an “object”. And by that we mean the literal sense of the word, in the way that a car is an object. That object should wrap up all the attributes of the thing it represents, such as its colour, and all the things you can do to it, such as start the engine.

In procedural programming, you would have a whole bunch of unrelated variables, such as $carColour, and routines, such as function startCarEngine(), that lived somewhere in that spaghetti of code you call a system. Object Orientation keeps things clean and tidy. You have your Car class, and you have a variable $myCar that represents a specific car. You can find out what colour that car is ($myCar->Colour) and you can start it ($myCar->StartEngine()) without worrying about the mess of variables.
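As a rough sketch of that idea (written in Python purely to keep it short; the project itself targets PHP, and the Car class, its colour attribute and its start_engine method are illustrative names rather than code from Multiblog):

```python
class Car:
    """Bundles a car's attributes with the things you can do to it."""

    def __init__(self, colour):
        self.colour = colour          # attribute: what the car is like
        self.engine_running = False   # state changed by the method below

    def start_engine(self):
        """Behaviour lives on the object, next to the data it changes."""
        self.engine_running = True
        return self.engine_running


my_car = Car("red")
print(my_car.colour)        # the equivalent of $myCar->Colour
my_car.start_engine()       # the equivalent of $myCar->StartEngine()
```

Everything about one specific car travels around in one variable, rather than in loose variables and functions scattered through the system.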

But that’s just the start. What if you have different types of car? Old fashioned cars needed a crank handle to start them. But they still had wheels, tyres and so forth. In the procedural world, we’d have effectively duplicated code for old fashioned car starting and new modern car starting. The difference to the user is turning the key, or cranking the handle. But internally, much of the implementation may be the same (pump fuel, fire spark plugs etc).

That’s where object orientation gets useful. Classes can be built on top of other classes. This is called inheritance. Basically, you define a basic car class that defines all the features common to all cars. Then from that you inherit all that code, and extend it or replace it to provide an implementation of a type of car, such as a Crank Handle Car or a Modern Car. From that you can inherit down and down until you have a class that defines a Ford Mondeo 1.8 Diesel, because it has details that are different to the Ford Mondeo 1.8i Petrol.

Indeed, your base car class could have inherited common properties from the Vehicle class. In some implementations of object orientation you can inherit multiple things. So it may have inherited “Internal Combustion”, “Wheeled Transport” and several other base classes.
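To make the car example concrete, here is a minimal sketch of that hierarchy. It is in Python rather than PHP for brevity, and every class and method name in it (Vehicle, Car, CrankHandleCar, ModernCar, _run_engine) is made up for illustration. The shared internals are written once in the base class; only the part the driver sees differs.

```python
class Vehicle:
    """Hypothetical root class: things common to every vehicle."""

    def __init__(self, wheels):
        self.wheels = wheels


class Car(Vehicle):
    """Everything common to all cars, written once."""

    def __init__(self):
        super().__init__(wheels=4)
        self.log = []

    def _run_engine(self):
        # Shared internals: identical however the driver starts the car.
        self.log.append("pump fuel")
        self.log.append("fire spark plugs")

    def start_engine(self):
        raise NotImplementedError("each type of car starts differently")


class CrankHandleCar(Car):
    def start_engine(self):
        self.log.append("crank handle")   # the only old-fashioned part
        self._run_engine()
        return self.log


class ModernCar(Car):
    def start_engine(self):
        self.log.append("turn key")       # the only modern part
        self._run_engine()
        return self.log


for car in (CrankHandleCar(), ModernCar()):
    print(type(car).__name__, car.start_engine())
```

A bug in _run_engine would be fixed in one place and both kinds of car would pick up the fix.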

You code once, and use or extend that code in many other places.

This is important, because it means you fix a root bug once and fix it everywhere.

OO is important for Multiblog, because Multiblog needs to post to many different web services. But, each web service will have certain things in common. At the very least, I’ll need the ability to make an HTTP POST operation to a URL. If I write my blog service implementations as classes, I can derive common functionality from a BlogService base class.
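A possible shape for that base class, sketched in Python for brevity rather than in the PHP the project will use. This is an assumption about how it might look, not the Multiblog design: the method names, the ExampleJournalService subclass and its form field names are all invented for illustration.

```python
from urllib import parse, request


class BlogService:
    """Hypothetical base class: what every blog service has in common."""

    def __init__(self, url):
        self.url = url

    def build_fields(self, title, body):
        """Each service names its form fields differently."""
        raise NotImplementedError

    def http_post(self, fields):
        """Shared by every subclass: send the fields as an HTTP POST."""
        data = parse.urlencode(fields).encode("utf-8")
        # Supplying data makes urllib issue a POST rather than a GET.
        with request.urlopen(request.Request(self.url, data=data)) as resp:
            return resp.status

    def publish(self, title, body):
        return self.http_post(self.build_fields(title, body))


class ExampleJournalService(BlogService):
    """Made-up service; the field names are invented for illustration."""

    def build_fields(self, title, body):
        return {"subject": title, "event": body}
```

Supporting another service would then mean writing one small subclass, while the HTTP POST code stays in a single place.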

Furthermore, with Object Orientation there is a rich collection of Design Patterns. These are as close as software engineering gets to fixed, repeatable recipes for development. They give us ways to do things.

The typical object oriented patterns make it really simple to lay out and engineer your software in such a way that it will be easy to change, expand and manage. We’ll see that in action later on (I hope!)


Choosing a Platform

Saturday, April 7th, 2007

The first step for me in starting to design and plan for the development of the new Multiblog web product is to decide on a platform. To be honest, this is a pretty trivial decision for me. But let’s discuss why I’ve made the choices.

Firstly, what do I mean by platform? Multiblog the first application runs on the “Windows” platform. That’s its target platform, but I’m not talking about that kind of platform. I’m talking about a web application, where by default the client platform is any browser on any operating system. Developers of an application had better have a good reason for narrowing that down, or they are losing customers.

When we talk about the platform for a web application we’re really talking about the server side. What kind of operating system? What Web Server on that operating system? What database server? What language to develop in?

Several things will impact on this decision: what the application needs to do, how it needs to do it, where it needs to do it, when it needs to do it, and budget. There are languages with particular strengths in the what and how market. For example, for a basic data in, data out application, Ruby on Rails would be hard to beat with its instant building of simple forms for managing data. For an application targeted at the enterprise market, Ruby’s an unlikely choice due to its lack of maturity and commercial support. Java, .NET, SQL Server and Oracle occupy that space.

There are a lot of factors. But I’m going to go with Apache as the host web server, PHP as the development language and MySQL as the database. Why?

Simple. This is a startup niche product. Budget is critical. I can find PHP, Apache and MySQL hosting pretty much anywhere for hardly any money. This is a platform that can scale. I know how to set it up and manage it. If this does take off, I can easily migrate up and up with that base platform.

So the initial cost is zero. Everything is free, and I can develop on my laptop at home. Or any computer anywhere, since my Apache, MySQL and PHP environment can be mounted on a USB key, whacked into any Windows box and worked on with Notepad. I’ll be able to find a good cheap host to start with and swiftly migrate and scale up as and when needed.

PHP is a mature web development language, and I’m very familiar with it. So, barriers to getting things done are pretty low. I also want to pick a language that is going to be of most interest as an example. There are lots of hobbyists working in PHP. Lots of professionals too. If this is to be an example of a professional approach to web development, the barrier to entry for the audience should be low. I’d still like to follow up with an attempt in Python and one in Ruby for reference and comparison, but that will have to come much, much later.

Obvious and easy choice made…

However, which version of PHP should I target? PHP4 is ubiquitous. PHP5 provides some very nice features.

Well, for reasons I hope will become clear down the line, as I explain them, I want to write an Object Oriented system. So, given that PHP4’s OO options are limited and frustrating I’m jumping in on the much less supported (in terms of available, cheap, good hosts) PHP5.

Why Apache? Why not IIS? Frankly, PHP is mostly used with Apache. It works best there. It works in other places, but not as easily and readily. I like IIS as a platform. I work with it in my day job. But for a PHP project, it’s not a good choice.


Project Kick-Off

Saturday, March 31st, 2007

My initial plan with this site was to create an application multiple times, in phases, in a number of languages to demonstrate to non-professional developers a typical “professional” approach to producing some software. Then, by developing the same application in a number of different languages, I would show everyone how the languages differ, where they have their strengths over the other options etc.

However, it’s become painfully clear that that’s a pretty big project. Between work and two small kids, plus keeping my wife happy, I don’t have the time to really do that. However, I still want to do something with this site.

Some time back, I wrote an application called Multiblog. I thought it was a pretty nice piece of software. It provided bloggers who have multiple blogs for one reason or another with a single application to cross post an entry over all their blogs. I wrote it because nothing else did it, and I needed it. I had my personal site powered by Geeklog, a Modblog with a lot of Modblog only friends and a LiveJournal with a lot of LiveJournal only friends. I wanted to update them all.

I wrote the application in Delphi, as that was my preferred tool for GUI development on Windows. The software is shareware: it allows you to use it for 30 days before needing registration. Registration only costs $5.

The problem with this was that even with the best of efforts, software registration keys are broken in a tiny amount of time by the vast cracking industry. Plus, it was a Windows only application. Plus, I no longer have access to the Delphi IDE. Basically, this means that even if I had the tools to maintain Multiblog still, it would be a “charitable” effort, creating software that was stolen far, far more than purchased. Eating my time for no return.

So, solution to two problems:

Write a web-based Multiblog. Available within a browser from anywhere, with all the features (and more) of Multiblog the application. This can be a pay service, and I’ll always have the tools to update it, because I write web software using lightweight programmers’ text editors.

I can then also document/blog/describe the process and development on this site. Two birds, one stone.

So now, I just need to find the time. I intend to write one article a week. Which will force me to keep pace with the software in order to have things to write about. First stage is a few critical decisions on platform.


Passwords - Choosing Good Ones and Keeping Them Safe

Saturday, January 27th, 2007

One thing is a constant in web development: user accounts. Every site with any dynamic aspect needs at least one login, that of the administrator. Choosing good passwords and keeping them safe is a vital skill for any web master. Keeping your users’ passwords safe is a vital skill for any web master. We’ll split this into two parts: Choosing Passwords and Storing Account Passwords.

Choosing Passwords

Many, many users will set their password to password given half a chance. There are various standards that relate to passwords and how systems should enforce them to address this issue. Rules like “Must contain at least one upper-case character”, “Must contain at least one digit”, “Password must be changed every 90 days”, “Password must not be the same as the previous three passwords” etc. Personally, I’d like to consider this common knowledge and well-trodden ground, but it seems users just don’t get this message and still use insecure passwords all the time.

A web master must be above this, and must secure their administrator accounts with strong passwords. So it was with some dismay that I stumbled into Microsoft’s little online tool to test the strength of your password. The tool goes from Weak to Medium to Strong to Best. There are three distinct rules. Meeting the requirements of each rule moves you one point up the scale. The rules are:

  1. Use at least one upper-case character.
  2. Use at least one non-alphabetic character (i.e. not a-z or A-Z).
  3. Have a length of at least 14 characters.

This means that the password “Password1” is considered Strong whilst “qsd43ghp” is only Medium. And “Password1aaaaa” is the best of the best. Which worried me a lot, as recently I read an article by Bruce Schneier in which he explains how brute force password crackers rip through passwords, and things like Password1 are right at the top of their cracking attack pattern. Password1 is even the most popular password on MySpace (ref).
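The scoring just described is simple enough to reconstruct. The sketch below (in Python, and certainly not Microsoft’s actual code) assumes one point per rule met, and reads rule 2 as “anything that is not a letter”, which is what the three example results imply:

```python
LEVELS = ["Weak", "Medium", "Strong", "Best"]


def strength(password):
    """Score a password by counting how many of the three rules it meets."""
    rules_met = 0
    if any(ch.isupper() for ch in password):
        rules_met += 1                      # rule 1: an upper-case character
    if any(not ch.isalpha() for ch in password):
        rules_met += 1                      # rule 2: a digit or symbol
    if len(password) >= 14:
        rules_met += 1                      # rule 3: 14 or more characters
    return LEVELS[rules_met]


for candidate in ["Password1", "qsd43ghp", "Password1aaaaa"]:
    print(candidate, strength(candidate))
```

Nothing in it looks at what the password actually says, which is exactly the problem: a dictionary word with a capital and a digit scores higher than eight random characters.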

Make sure you read that first article, it’s important. I know a lot of geeks, myself included, who use patterns to secure their passwords that tools such as the Password Recovery Tool Kit fully understand and prioritise in their attacks. We’re not as safe as we thought.

Bruce advocates two types of password as harder to beat, one will make passwords you can never remember. However, his 2nd type:

a password made up of the first letters of a sentence [is not going to be guessed], especially if you throw numbers and symbols in the mix.

This gives us an “in” to coming up with truly secure passwords that we can remember. We pick a sentence that we can remember and that isn’t obvious to people who know us (i.e. spend some time picking a good one), and we work from there. I’m going to use “Mike is the most awesome person in all the world”. First thing we do is just use the initials, giving mitmapiatw, which is almost pronounceable. Next, we want to add capital letters. Choosing a rule at random, I’m going to capitalise the small insignificant words: mITmapIATw. We now have a Microsoft Approved Medium Strength password. So we need to add numbers and symbols to make it Strong. I’m just going to go ahead and use an arbitrary dialect of Leet to change letters into numbers and characters. I’m not just going to replace, I’m going to follow Leetable characters with the Leet representation. This makes my password longer: mI!Tma4pI!A4Tw. You could change your dialect of Leet for repeated characters, then I wouldn’t have two 4’s and two ! characters. But I won’t bother. That gives me a Strong password. It also hits the magic 14, making this a Best strength password.
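For anyone who wants to play with the recipe, here it is as a small Python sketch. The list of “small insignificant words” and the two-entry Leet table are assumptions chosen to reproduce the example above; pick your own, and obviously don’t use a published sentence.

```python
SMALL_WORDS = {"is", "the", "in", "all"}   # assumed "insignificant" words
LEET = {"i": "!", "a": "4"}                # assumed Leet dialect


def sentence_to_password(sentence):
    # Step 1 and 2: take each word's initial, capitalising the small words.
    initials = []
    for word in sentence.split():
        letter = word[0].lower()
        if word.lower() in SMALL_WORDS:
            letter = letter.upper()
        initials.append(letter)

    # Step 3: follow (not replace) each Leetable letter with its Leet form.
    out = []
    for letter in initials:
        out.append(letter)
        if letter.lower() in LEET:
            out.append(LEET[letter.lower()])
    return "".join(out)


print(sentence_to_password("Mike is the most awesome person in all the world"))
# prints mI!Tma4pI!A4Tw
```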

Not only do Microsoft think it’s as good as it can possibly get, according to Bruce, the password recovery tool kit is really, really going to struggle on it. It’s not based on words. It won’t be in the dictionary attacks. It won’t be in the Leet variations on dictionary attacks. It isn’t based on prefix or suffix extensions. At the end of the day, we have a damn complicated series of characters that we have a nice easy key to remember. I just have to remember how arrogant I am, and my password is there!

Of course, now you all know the algorithm I use to make up passwords. All you need to do is spend some time trying to figure out what kind of sentences I might use and build a special dictionary for me. (Trust me, I’m not that arrogant I would actually use that sentence!) But the size of that dictionary would be insane. How many characters did I go for? Did I really hit 14? Or did I keep it small? Or bigger? What was the source of the sentence? I could have an in-joke only a few people know. A favourite line from one of the three hundred DVDs on my shelves. Producing an appropriate sentence based dictionary to attack my password is not a realistic thing to do.

Especially since I’m thinking hard about attack vectors so would try and find a way to come up with a sentence that no-one would guess. A line from a movie I hate? Who knows.

That leaves Social Engineering to get my password, or clues to it. A bar of chocolate might be all you need.

Storing Account Passwords

So, you have users. They set their account password. Chances are it’s weak. Some passwords might even be strong passwords. Chances are, they use the same password in loads of places. Perhaps including their on-line banking. You must not be the weak link in the chain. You must not be the service that gets hacked or accidentally leaks data and gives away thousands of your users’ passwords. Which can happen even with the biggest players.

This means storing passwords in the database in a safe way. The safest way to do so, and the standard web approach that all sites should take, is to store a cryptographic hash of the data. A finger print. From this finger print, you cannot retrieve the original data. However, you can compute the finger print of an entered password and compare it to the stored finger print.

This means you don’t have the password to leak. Except during login. It also means that you have a fixed data size for storing the finger print. Cryptographic Hashes always produce a fixed amount of data.

The de facto standard for this in web applications is the MD5 checksum. Implementations abound, and many languages, including PHP, have one built in. This generates 32 characters of hex. However, recently weaknesses in this have concerned some developers. Personally, I think the weaknesses are irrelevant to passwords, as they focus on the ability to produce a large document that matches a checksum. There is much less room for collision in the password space.

These days, the seriously security conscious go for SHA-256. This generates 64 characters of hex.
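The store-and-compare scheme looks roughly like this, sketched in Python with its standard hashlib module rather than PHP. The function names are made up for illustration. It mirrors the scheme exactly as described here; current practice goes further, adding a per-user salt and a deliberately slow algorithm, which this sketch does not attempt.

```python
import hashlib
import hmac


def fingerprint(password, algorithm="sha256"):
    """Return the hex digest that gets stored instead of the password."""
    return hashlib.new(algorithm, password.encode("utf-8")).hexdigest()


def check_login(entered_password, stored_fingerprint, algorithm="sha256"):
    """Hash what the user typed and compare it to the stored finger print."""
    candidate = fingerprint(entered_password, algorithm)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(candidate, stored_fingerprint)


stored = fingerprint("mI!Tma4pI!A4Tw")
print(len(fingerprint("x", "md5")), len(stored))   # 32 and 64 hex characters
print(check_login("mI!Tma4pI!A4Tw", stored))
```

The database column can be a fixed width for whichever algorithm is chosen, and the plain password only ever exists in memory while a login is being checked.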

For comparison, and ease of re-setting my password in development environments, I produced a Google Gadget to generate MD5, SHA-1 and SHA-256 hashes. You can find it here, and add it to your Google Homepage by clicking Add to Google.

A Further Note on Microsoft’s Password Strength Checker

I’d just like to make it clear that I understand that Microsoft’s on-line password strength checking tool is a simple little thing just making sure that the password matches some basic rules. I’m aware of the limitations that that tool has as a result. It can’t really check for obvious things like, your password shouldn’t be your first child’s birthday, or your wedding anniversary (especially if you’re a man as you’re bound to forget it!). But, this is not made clear to the kind of user this tool is targeted at.

EdenToby01072000 would be a “Best” strength password, but so easy to guess if you knew my anniversary and children’s names. Any decent planned assault on my accounts would use those items as dictionary inputs and prioritise them. If you are going to produce a tool that is designed to educate basic end users about password strength, that tool needs to be as good as the passwords need to be.

And Microsoft’s tool isn’t.


Wordpress vs Geeklog

Sunday, January 7th, 2007

So, today I’ve finally finished installing and configuring a Wordpress installation for running InAnger.com. This may seem a fairly odd decision to certain people who know certain things. The main reason it’s odd is that I am a member of the core development team for the Geeklog system. Wordpress is a PHP application using a MySQL database for running a Blog. So is Geeklog. So then, if I am a Geeklog developer, why have I chosen a competing system to run my new site? Interesting question, and one I shall discuss at length.

Firstly, I consider it important to investigate the innovation going on in competing products. I take a similar approach with software languages and tools. Something else is out there and has some buzz, so I play with it to find out why it has that buzz and what that new thing can do for me. With software languages and tools this is likely to be a switch to develop with a new language, even if it’s just for certain tasks that that language is particularly handy for. With a piece of software like a Media Player it’s a possible full switch to that product if it’s better.

With a CMS however, it’s more likely to be that I experiment with it and frankly steal the neat features and implement them in Geeklog. So, I’ve often seen posts “I tried geeklog, but now I’ve switched to wordpress and like it more”. Time to find out why and see if I can address those issues.

That’s not all though. I’m a terrible one for tinkering. I can’t find out that some piece of software I use is customisable or configurable without becoming dissatisfied with what it can do and trying to customise it. As a Geeklog core developer, if I was using Geeklog, I’d get constantly side-tracked customising things and re-writing things. With Wordpress being a stable application with an active development community which I am not a member of, I’m more likely to be able to keep my hands off and concentrate on using the application.

So, Wordpress vs Geeklog, what’s the difference?

Wordpress

Wordpress is a highly popular Open Source blogging tool. It’s focussed on “aesthetics, web standards and usability”. It has a simple elegant usable interface.

Geeklog

Geeklog is also an Open Source blogging tool. It’s focussed on security. Out of the box it provides a full portal implementation as well as the core blogging features.

Installation

Installation of both is reasonably simple. In both cases you download a tarball archive of the latest version. In both cases you extract this and create an empty MySQL database to install into (in Geeklog’s case, you may use a Microsoft SQL Server database too). In both cases you set up database connection information in a configuration file. In the case of Geeklog all your site’s basic configuration is also in the configuration file and may be edited at the same time, plus, there is the additional need to configure file paths in two files (the configuration file and lib-common.php, the core functionality file). In Wordpress’ case you have to copy the config file to a new name.

Then you navigate to the install script location for either system and execute the install process. Voila, two complete blog package installs. You’re then left to configure up your system to meet your needs.

Publishing Blog Entries

The main thing you’re going to be concerned with day to day with your blog engine is writing, editing, posting and maintaining entries.

Both Geeklog and Wordpress have similar options for your posts. Advanced HTML editors (though I feel the use of FCKEditor gives Geeklog the edge here), control over the way the link is generated (post slug in Wordpress and Story ID in Geeklog, they form part of the URL for Search Engine Friendly URLs), control over user feedback permitted (comments etc), the ability to place an article into a category (and here Wordpress wins with multiple category support), image upload etc.

So where are the differences? Well, Wordpress allows Custom Fields on stories. I have yet to fully explore these, but they allow plugins to provide rich functionality easily to stories. Geeklog has Autotags which replace this functionality.

Geeklog has a strong security model, you can have many user groups and can set permissions for stories and topics in a very controlled fashion. Wordpress lacks any of this subtle security, it’s focussed on public blogging.

Geeklog also has the ability to auto-archive stories after a certain amount of time.

Really at the story level, the only differentiation seems to be that Geeklog has an excellent security system and also allows users to contribute stories (via moderation if necessary) to the site.

Both have strong APIs for plugins to extend things, however, I think extending the posting engine in Wordpress is a lot easier than in Geeklog.

So what about at the display end of stories? Both support comments, trackbacks and pingbacks with anti-spam measures and control over who can comment etc. Both have ways the display of a story can be tailored to suit the webmaster’s needs. It’s pretty even, with minor differences.

That leaves the user experience. Geeklog is (out of the box) a richer and more complex application than Wordpress. It does more. And thus, it’s a bit harder to learn your way around at first. That doesn’t mean Wordpress is a saint. I think I found my way round it reasonably quickly, from scratch, but it could have been a clearer process. That said, Wordpress’ documentation, which is linked at the foot of every admin screen, is rich and detailed and vastly superior to the limited (by comparison) documentation for Geeklog.

Other Features

Wordpress is a blogging engine. That’s about it. It gives you static pages and links. But out of the box, that’s pretty much your lot. Geeklog is more of a portal, it comes with a calendar (for events, not to be confused with a monthly post view calendar which both systems have), polls system, links directory and static pages. Geeklog has a modular system for blocks on both the left and right hand side of the page, which allows the administrator to configure up static text, import RSS feeds, or use various built in functions to produce various types of dynamic boxes. To do this in Wordpress, you must edit the theme to include the HTML and/or function calls.

Theming/Skinning

Both Geeklog and Wordpress come with perfectly nice default themes. But you don’t want your site to look like Just Another Geeklog/Wordpress Site. You want it to look individual and have an identity of its own. Both systems therefore offer a theming system to change the look and feel of your site.

Wordpress has a small number of PHP files that make up the base theme. These contain PHP function calls to output data and even loop structures to iterate over posts. This is quite confusing for non-programmers to get to grips with, but there is a lot of documentation, many community developed themes and lots of community support for this work. Plus, there aren’t that many files to edit.

Geeklog has many THTML files that make up the base theme. These contain HTML and placeholders for dynamic data (in curly braces, for example {story_title}). It takes a fair bit of work to get your head around how these files are used to actually construct the layout of your site, and the documentation isn’t fantastic on these. But, I feel the content of each template is less confusing.
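To show why a placeholder template is easy on the eye, here is a toy version of the idea in Python. It is not Geeklog’s template engine, and the render function is invented for illustration; it simply swaps each {name} for a value and leaves unknown placeholders alone.

```python
import re


def render(template, values):
    """Replace {name} placeholders with values; keep unknown ones as-is."""
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"\{(\w+)\}", replace, template)


print(render("<h1>{story_title}</h1>", {"story_title": "Hello"}))
# prints <h1>Hello</h1>
```

The template itself stays plain HTML with a few named holes in it, with no function calls or loops for a non-programmer to trip over.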

Extending

Both systems support plugins to provide the functionality you want. Some of Geeklog’s out of the box functionality is from core plugins (calendar, links, polls) and can be uninstalled if you do not like it.

The Geeklog API consists of a lot of functions you can define that get called at appropriate points. The Wordpress API consists of a lot of places where you can tell Wordpress that your plugin wants a function call inserted into the chain. Both APIs seem to be pretty mature and well thought out and provide a really great way of extending the application.
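The two styles can be contrasted in a few lines. This Python sketch is a generic illustration only: the names HookRegistry and call_by_convention are invented and neither matches the real Geeklog or Wordpress API.

```python
class HookRegistry:
    """Registration style: plugins ask to be added to a named chain."""

    def __init__(self):
        self._chains = {}

    def register(self, hook_name, func):
        self._chains.setdefault(hook_name, []).append(func)

    def run(self, hook_name, value):
        # Pass the value through every registered function in order.
        for func in self._chains.get(hook_name, []):
            value = func(value)
        return value


def call_by_convention(plugin, point, value):
    """Convention style: call plugin.<point>() only if the plugin defines it."""
    handler = getattr(plugin, point, None)
    return handler(value) if callable(handler) else value


class ShoutPlugin:
    def on_story_display(self, text):
        return text.upper()


hooks = HookRegistry()
hooks.register("story_display", str.strip)
print(hooks.run("story_display", "  hello  "))
print(call_by_convention(ShoutPlugin(), "on_story_display", "hello"))
```

In the first style the plugin tells the application what to call; in the second the application goes looking for functions with agreed names.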

If anything, I’d suggest that Geeklog’s API is more mature and powerful: implementations include various image galleries (including integrations of popular Open Source gallery applications like Gallery), forums, file repositories and even further anti-spam measures. Wordpress however seems to have many more plugins available, more focussed on adding features to the story engine.

Summary

So, to just recap. Geeklog is a bigger system. It does a lot more. But, for people who just want to set up a simple blog, that might be too much, too daunting, too confusing. They’re targeted at different audiences. Geeklog is for running sites that are more than just a blog. Wordpress is for sites that are just a blog. The reason people are switching is that all that power and all those features get in the way of having a plain, simple blog that everyone can read.
