Archive for the 'article' Category

Getting customer feedback

Lack of feedback is one of the most difficult things about caring for a small child. You know they are unhappy because they are crying. But you don’t know if that unhappiness is due to: hunger, thirst, too hot, too cold, ear ache, stomach ache, wind, tiredness, boredom, teething or something else. They can’t tell you, so you can only guess. Creating software without feedback is tough for the same reasons. You know how well or badly you are doing by the number of sales, but without detailed feedback from your customers and prospective customers, it is difficult to know how you could do better.

The importance of feedback is amply illustrated by many of the stories of successful companies in the excellent book “Founders at work” by Jessica Livingston. For example, PayPal started out trying to sell a crypto library for the PalmPilot. They went through at least 5 changes of direction until they realised that what the market really wanted was a way to make payments via the web.

So good feedback is essential to creating successful software. But how do you get the feedback?

Face-to-face meetings

Meeting your customers face-to-face can give you some detailed feedback. But it is time-consuming and doesn’t scale when you have hundreds or thousands of customers. You can meet a lot of customers at an exhibition, but it is hardly an ideal venue for any sort of in-depth interaction. Also, they may be too polite to tell you what they really think to your face.

Technical support

Technical support emails and phone calls are a gold-mine of information on how you can improve your product. If one customer has a particular problem, then they might be having a bad day. But if two or more customers have the same problem, then it is time to start thinking about how you can engineer out the problem. This will both improve the utility of your product and reduce your support burden.

In order to take advantage of this feedback the people taking the support calls need to be as close to the developers as possible. Ideally they should be the same people. Even if you have separate support and development staff you should seriously think about rotating developers through support to give them some appreciation of the issues real users have with their creation. Outsourcing your support to another company/country threatens to completely sever this feedback.

Monitoring forums and blogs

Your customers are probably polite when they think you are listening. To find out what they really think it can be useful to monitor blogs and relevant forums. Regularly monitoring more than one or two forums is very time-consuming, but you can use Google alerts to receive an alert email whenever a certain phrase (e.g. your product name) appears on a new web page. This feedback can be valuable, but it is likely to be too patchy to rely on.

Usability testing

A usability test is where you watch a user using your software for the first time. You instruct them to perform various typical tasks and watch to see any issues that occur. They will usually be asked to say out loud what they are thinking to help give you more insight. There really isn’t much more to it than that. If you are being fancy you can video it for further analysis.

Usability tests can be incredibly useful, but it isn’t always easy to find willing ‘virgins’ with a similar background to your prospective users. Also the feedback from usability tests is likely to be mainly related to usability issues; it is unlikely to tell you if your product is missing important features or whether your price is right.

Uninstall surveys

It is relatively easy to pop up a feedback form in a browser when a user uninstalls your software. I tried this, but got very few responses. If they aren’t interested enough in your software to buy it, they probably aren’t interested enough to take the time to tell you why. Those that I did get were usually along the lines of “make it free”[1].

Post purchase surveys

I email all my customers approximately 7 days after their purchase to ask whether there is anything they would like me to add/improve/fix in the next version of the software. The key points about this email are:

  • I give them enough time to use the software before I email them.
  • I increase the likelihood of getting an answer by keeping it short.
  • I make the question as open as possible. This results in much more useful information than, say, asking them to rate the software on a one to ten scale.
  • I deliberately frame the question in such a way that the customer can make negative comments without feeling rude.
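As a rough sketch of the timing side of this, the snippet below picks out customers whose purchase was 7 days ago, assuming it is run once a day. The record shape (email, purchase date) and the function name are assumptions made for the example, not a description of any particular system.

```python
from datetime import date, timedelta

# "Approximately 7 days" from the text; adjust to taste.
FOLLOW_UP_DELAY = timedelta(days=7)

def due_for_follow_up(purchases, today):
    """Return the emails of customers who bought exactly FOLLOW_UP_DELAY ago.

    `purchases` is a list of (email, purchase_date) pairs. This shape is
    hypothetical and only serves the example.
    """
    return [email for email, bought in purchases
            if today - bought == FOLLOW_UP_DELAY]

# Made-up sample data.
purchases = [
    ("alice@example.com", date(2008, 5, 1)),
    ("bob@example.com", date(2008, 5, 6)),
]
print(due_for_follow_up(purchases, today=date(2008, 5, 8)))
```

The email itself stays short and open-ended, as described in the points above.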

The responses fall into five categories[2]:

  1. No response (approx 80%). They didn’t respond when given the opportunity, so I guess they must be reasonably happy.
  2. Your software is great (approx 10%). This really brightens up my day. I email them back to ask for permission to use their comment as a testimonial. Most people are only too happy to oblige.
  3. Your software is pretty good but it doesn’t do X (approx 10%). Many times my software actually does do X - I tell them how and they go from being a satisfied customer to a very happy customer. Also it gives me a pointer that I need to make it clearer how to do X in the next version. If my software doesn’t do X, then I have some useful feedback for a new feature.
  4. Your software sucks, I want my money back (rare). Thankfully I get very few of these, but you can’t please all of the people all of the time. Sometimes it is possible to address their problem and turn them from passionately negative to passionately positive. If not, I refund them after I get some detailed feedback about why it didn’t work for them[3].
  5. Stop spamming me (very rare). From memory this has happened once.

I consider them all positive outcomes, except for the last one. Even if I have to make a refund, I get some useful feedback. Anyway, if you didn’t find my software useful, I don’t really want your money.

Being pro-active like this does increase the number of support emails in the short-term. But it also gives you the feedback you need to improve your usability, which reduces the number of support emails in the longer term. I think the increased customer satisfaction is well worth the additional effort. Happy customers are the best possible form of marketing. Post-purchase emails are such a great way to get feedback, I don’t know why more people don’t use them. Try it.

If you make it clear that you are interested in what your customers have to say they will take more time to talk to you. If you act on this feedback it will improve your product (some of the best features in my software have come from customer suggestions). A better product means more customers. More customers means more feedback. It is a virtuous cycle.

All you have to do is ask.

[1] Only if you pay my mortgage. Hippy.

[2] The percentages are guesstimates. I haven’t counted them.

[3] My refund policy specifies that the customer has to say what they didn’t like about the software before I will issue a refund.

Selling your software in retail stores (all that glitters is not gold)

Developers often ask in forums how they can get their software into retail. I think a more relevant question is - would you want to? Seeing your software for sale on the shelves of your local store must be a great ego boost. But the realities of selling your software through retail are very different to selling online. In the early days of Perfect Table Plan I talked to some department stores and a publisher about selling through retail. I was quite shocked by how low the margins were, especially compared with the huge margin for online sales. I didn’t think I was going to make enough money to even cover a decent level of support. So I walked away at an early stage of negotiations.

The more I have found out about retail since, the worse it sounds. Running a chain of shops is an expensive business and they are going to want to take a very large slice of your cake. The various middlemen are also going to take big slices. Because they can. By the time they have all had their slices there won’t be much left of your original cake. That may be OK if the cake (sales volume) is large enough. But it is certainly not something to enter into lightly. Obviously some companies make very good money selling through retail, but I think these are mostly large companies with large budgets and high volume products. Retail is a lot less attractive for small independents and microISVs such as myself.

But software retail isn’t an area I claim to be knowledgeable about. I just know enough to know that it isn’t for me, at least not for the foreseeable future (never say never). So when I spotted a great post on the ASP forums about selling through retail, I asked the author, Al Harberg, if I could republish it here. I thought it was too useful to be hidden away on a private forum. He graciously agreed. If you decide to pursue retail I hope it will help you to go into it with your eyes open. Over to Al.

In the 24 years that I’ve been writing press releases and sending them to the editors, more than 90 percent of my customers have been offering software applications on a try-before-you-buy basis. In addition, quite a few of them have ventured into the traditional retail distribution channel, boxed their software, and offered it for sale in stores. This is a summary of their retail store experiences.

While the numbers vary greatly, a software arrangement would have revenues split roughly:

  • Retail store - 50 percent
  • Distributor - 10 percent
  • Publisher - 30 to 35 percent
  • Developer - 5 to 10 percent
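To make the split above concrete, here is a small worked example. The $40.00 retail price is a made-up figure, and the two scenarios simply take the two ends of the quoted ranges so that the shares add up to 100 percent.

```python
# Hypothetical retail price, in cents, to avoid floating point rounding.
RETAIL_PRICE_CENTS = 4000  # $40.00, an assumption for illustration only

def split_revenue(price_cents, shares_percent):
    """Return each party's cut of one retail sale, in cents."""
    assert sum(shares_percent.values()) == 100
    return {party: price_cents * pct // 100
            for party, pct in shares_percent.items()}

# The two ends of the ranges quoted above.
developer_at_10 = {"retail store": 50, "distributor": 10,
                   "publisher": 30, "developer": 10}
developer_at_5 = {"retail store": 50, "distributor": 10,
                  "publisher": 35, "developer": 5}

for label, shares in (("developer at 10%", developer_at_10),
                      ("developer at 5%", developer_at_5)):
    cuts = split_revenue(RETAIL_PRICE_CENTS, shares)
    print(label, {party: f"${cents / 100:.2f}" for party, cents in cuts.items()})
```

On these assumed numbers the developer receives between $2.00 and $4.00 per box sold, before any of the deductions discussed below.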

Retail stores don’t buy software from developers or from publishers. They only buy from distributors.

The developer would be paid by the publisher. In the developer’s contract, the developer’s percentage would be stated as a percentage of the price that the publisher sells the software to the distributor, and not as a percentage of the retail store’s price.

The publishers take most of the risks. They pay the $30,000 (US) or so that it currently takes to get a product into the channel. This includes the price of printing and boxing the product, and the price of launching an initial marketing campaign that would convince the other parties that you’re serious about selling your app.

If your software doesn’t sell, the retail stores ship the boxes back to the distributor. The distributor will try to move the boxes to other dealers or value-added resellers (VARs). But if they can’t sell the product, the distributors ship the cartons back to the publisher.

While stores and distributors place their time at risk, they never risk many of their dollars. They don’t pay the publisher a penny until the software is sold to consumers (and, depending upon the stores’ return policies, until the product is permanently sold to consumers - you don’t make any money on software that is returned to the store, even though the box has been opened, and is not in good enough condition to sell again).

The developer gets paid two or three months after the consumer makes the retail purchase. Sometimes longer. Sometimes never. If you’re dealing with a reputable publisher, and they’re dealing with a major distributor, you’ll probably be treated fairly. But most boilerplate contracts have “after expenses” clauses that protect the other guys. You need to hire an attorney to negotiate the contract, or you’re not going to be happy with the results. And your contract should include an up-front payment that covers the publisher’s projection of several months’ income, because this up-front payment might well be the only money that you’re going to ever see from this arrangement.

Retail stores’ greatest asset is their shelf space. They won’t stock a product unless there is demand for it. You can tell them the most convincing story in the world about how your software will set a new paradigm, and be a runaway bestseller. But if the store doesn’t have customers asking for the app, they’re not going to clutter their most precious asset with an unknown program.

It’s a tough market. It’s all about sales. And if there is no demand for your software, you’re not going to get either a distributor or a store interested in stocking your application. These folks are not interested in theoretical demand. They’re interested in the number of people who come into a retail store and ask for the product.

To convince these folks that you’re serious, the software publisher has to show a potential distributor that they have a significant advertising campaign in place that will attract prospects and create demand, and that they have a press release campaign planned that will generate buzz in the computer press.

Many small software developers have found that the retail experience didn’t work for them. They’re back to selling exclusively online. Some have contracted with publishers who sell software primarily or exclusively online. Despite all of the uncertainties of selling software online, wrestling with the retail channel has even more unknowns.

Al Harberg

Al Harberg has been helping software developers write press releases and send them to the editors since 1984. You can visit his website at www.dpdirectory.com.

Functional programming - coming to a compiler near you soon?

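We can classify programming languages into a simple taxonomy:

[Figure: taxonomy of programming languages. Imperative languages divide into procedural and object oriented; declarative languages include functional languages.]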

Commercial programmers have overwhelmingly developed software using imperative languages, with a strong shift from procedural languages to object oriented languages over time. While declarative style programming has had some successes (most notably SQL), functional programming (FP) has been traditionally seen as a play-thing for academics.

FP is defined in Wikipedia as:

A programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data.

Whereas an imperative language allows you to specify a sequence of actions (‘do this, do that’), a functional language is written in terms of functions that transform data from one form to another. There is no explicit flow of control in a functional language.
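The contrast can be sketched in Python, which supports both styles. This is only an illustration of the two styles, not a claim about how any functional language is implemented; the function names are made up for the example.

```python
def sum_even_squares_imperative(numbers):
    """Imperative style: a sequence of actions that update a variable."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n * n  # 'total' is mutated on each pass
    return total

def sum_even_squares_functional(numbers):
    """Functional style: data flows through a chain of transformations."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return sum(squares)

data = [1, 2, 3, 4]
print(sum_even_squares_imperative(data), sum_even_squares_functional(data))
```

Both functions compute the same value; the second has no loop and no variable that changes after it is first bound.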

In an imperative language variables generally refer to an address in memory, the contents of which can change (i.e. it is ‘mutable’). For example the rather unmathematical looking “x=x+1” is a valid expression. In FP there are no mutable variables and no state.

In an imperative language a function can return different values for the same input, either because of stored state (e.g. global or static variables) or because it is interfacing with an external device (e.g. a file, database, network or system clock). But a pure functional language always returns the same value from a function given the same input. This ‘referential transparency’ means an FP function call has no ‘side-effects’ and consequently can’t interface with external devices. In other words it can’t actually do anything useful - it can’t even display the result of a computation to your VDU. The standard joke is that you only know a pure functional program is running because your CPU gets warmer.

The functional language Haskell works around the side-effects issue by allowing some functions to access external devices in a controlled way through ‘monads’. These ‘impure’ functions can call ‘pure’ functions, but can never be called by them. This clearly separates out the pure parts of the program (without side-effects) from the impure ones (with side-effects). This means that it is possible to get many of the advantages of FP and still perform useful tasks.
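The same separation can be imitated by convention in other languages. The Python sketch below keeps all computation in a pure function and confines output to one impure function that calls it. Python does not enforce this split the way Haskell’s type system does, and the function names and sample readings are invented for the example.

```python
import sys

def format_report(readings):
    """Pure: the result depends only on the argument, and nothing else changes."""
    average = sum(readings) / len(readings)
    return f"{len(readings)} readings, average {average:.1f}"

def main(out=sys.stdout):
    """Impure: performs output. It may call pure code, never the reverse."""
    readings = [20.5, 21.0, 19.5]  # made-up sample data
    out.write(format_report(readings) + "\n")

if __name__ == "__main__":
    main()
```

Because `format_report` never touches the outside world, it can be tested with plain assertions and no mocking.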

FP is much closer to mathematics than imperative programming. This means that some types of problems (particularly algorithmic ones) can be expressed much more elegantly and easily as functional programs. The fact that a function has no side effects also means that its structure is much easier to analyse automatically. Consequently there is greater potential for a computer to optimise a functional program than an imperative program. For example in FP:

y = f(x) + f(x);

Can always be rewritten as:

z = f(x);

y = 2 * z;

Saving a function call. This is more difficult to do in an imperative language, because you need to show that the second call to f(x) won’t return a different value to the first.
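A small Python sketch shows why the rewrite above is only safe when f is pure. Here `pure_f` and `impure_f` are made-up stand-ins: the second one reads a hidden counter, so each call returns something different.

```python
import itertools

def pure_f(x):
    """Pure: same input, same output."""
    return x * x

_ticket = itertools.count(1)  # hidden state shared between calls

def impure_f(x):
    """Impure: the result also depends on how many times it has been called."""
    return x + next(_ticket)

def original(f, x):
    return f(x) + f(x)

def rewritten(f, x):
    z = f(x)
    return 2 * z

# With a pure function the two forms agree.
print(original(pure_f, 3), rewritten(pure_f, 3))

# With an impure function they do not, so the rewrite changes behaviour.
print(original(impure_f, 10), rewritten(impure_f, 10))
```

An optimiser for an imperative language has to rule out the second case before it can apply the rewrite; in a pure functional language there is nothing to rule out.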

Functional programs are also inherently much easier to parallelise, due to the lack of side-effects. We can let the FP interpreter/compiler take care of parallelism. No need to worry about threads, locks, critical sections, mutexes and deadlocks. This could be very useful as processors get ever more cores. However imperative languages, with their flow of control and mutable variables, map more easily than functional languages onto the machine instructions of current (von Neumann architecture) computers. Consequently writing efficient FP interpreters and compilers is hard and still a work in progress.
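As an illustration of why the lack of side-effects helps, the Python sketch below applies a pure function to a list of inputs first sequentially and then through a thread pool. No locks are needed because the function shares no state. This only demonstrates that the work can be split up safely: the parallelism here is requested by hand rather than found by a compiler, and in standard CPython threads do not speed up CPU-bound code.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """Pure: no shared state is read or written."""
    return n * n

inputs = list(range(10))

# Sequential evaluation.
sequential = list(map(square, inputs))

# The same work farmed out to a pool of workers. Executor.map returns
# results in input order regardless of which worker finishes first.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, inputs))

print(sequential == parallel)
```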

Elements of FP are steadily making their way into mainstream commercial software:

  • Erlang is being used in commercial systems, including telecoms switching systems.
  • Microsoft Research has implemented F#, a .Net language that includes FP elements based on ML.
  • Work is underway to add elements of FP to version 2.0 of the D programming language.
  • Google’s MapReduce is based on ideas from FP.
  • The Mathematica programming language has support for FP.
  • The K programming language is used in financial applications.
  • The Perl 6 compiler is being written in Haskell. <insert your own sarcastic comment here>.
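The map and reduce ideas mentioned in the list above come straight from FP and can be sketched in a few lines of Python. This is a single-process toy word count using the standard library, not a description of how Google’s system works; the function names and sample documents are invented for the example.

```python
from collections import Counter
from functools import reduce

def map_phase(document):
    """Map: turn one document into word counts, independently of the others."""
    return Counter(document.lower().split())

def reduce_phase(left, right):
    """Reduce: combine two partial results into one."""
    return left + right

documents = ["the cat sat", "the dog sat", "the end"]

totals = reduce(reduce_phase, map(map_phase, documents), Counter())
print(totals["the"], totals["sat"], totals["cat"])
```

Because each call to `map_phase` is independent and `reduce_phase` does not care how the partial results are grouped, the work could in principle be spread over many machines.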

I recently attended ACCU 2008 which had a whole stream of talks on FP. All the FP talks I attended were packed out. That is quite something given that the audience is primarily hardcore C++ programmers. There seemed to be quite a consensus in these talks that:

  • FP is starting to move out of academia and into commercial use.
  • FP is more suitable than imperative style programming for some classes of problem.
  • FP is not going to replace imperative programming. The bulk of commercial development will still be done in an imperative style, but with FP mixed in where appropriate.
  • Hybrid languages that mix OO and FP will become more common.

I don’t see Haskell replacing C++ any time soon. But I can definitely see the benefits of using FP to tackle some types of problems.

Further reading:

The Functional programming reference in Wikipedia

This article is based loosely on notes I made at ACCU 2008 from attending the following talks:

  • “Caging the Effects Monster: the next decade’s big challenge”, Simon Peyton-Jones
  • “Functional Programming Matters”, Russel Winder
  • “Grafting Functional Support on Top of an Imperative Language”, Andrei Alexandrescu

Any mistakes are almost certainly mine.

Choosing a development ‘stack’ for Windows desktop applications

I have heard plenty of people saying that desktop software is dead and that all future development will be done for the web. From my perspective, as both a buyer and seller of software, I think they are wrong. In fact, of the thousands of pounds I have spent on software in the last three years, I would guess that well over 90% of it was spent on software that runs outside the browser. The capabilities of web based applications have improved a lot in recent years, but they still have a long way to go to match a custom built native application once you move beyond CRUD applications. I don’t expect to be running Visual Studio, Photoshop or VMware (amongst others) inside the browser any time soon. The only way I see web apps approaching the flexibility and performance of desktop apps is for the browser to become as complicated as an OS, negating the key reason for having a browser in the first place. To me it seems more likely that desktop apps will embed a browser and use more and more web protocols, resulting in hybrid native+web apps that offer the best of both worlds.

So, if Windows desktop apps aren’t going away any time soon, what language/libraries/tools should we use to develop them? It is clear that Microsoft would like us to use a .Net development environment, such as C#. But I question the wisdom of anyone selling downloadable off-the-shelf software based on .Net [1]. The penetration of .Net is less than impressive, especially for the more recent versions. From stats published by SteG on a recent BOS post (only IE users counted):

No .Net: 28.12%
>= .Net 1.0: 71.88%
>= .Net 1.1: 69.29%
>= .Net 2.0: 46.07%
>= .Net 3.0: 18.66%
>= .Net 3.5: 0.99%
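The figures above are cumulative (‘at least this version’). Assuming the categories are nested, so that a machine counted under one version is also counted under every lower one, they can be turned into the share of machines whose highest installed version is each release by simple differencing:

```python
# Cumulative shares from the table above, in hundredths of a percent
# (integers avoid floating point rounding).
at_least = {"1.0": 7188, "1.1": 6929, "2.0": 4607, "3.0": 1866, "3.5": 99}
no_dotnet = 2812

versions = list(at_least)
highest = {}
for version, next_version in zip(versions, versions[1:] + [None]):
    # Share with at least this version, minus the share with something newer.
    newer = at_least[next_version] if next_version else 0
    highest[version] = at_least[version] - newer

for version, share in highest.items():
    print(f"highest version is .Net {version}: {share / 100:.2f}%")
print(f"no .Net: {no_dotnet / 100:.2f}%")
```

On that assumption the derived shares and the ‘No .Net’ figure add up to exactly 100%, and more than half of the machines sampled had nothing newer than .Net 1.1.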

Consequently deploying your app may require a framework update. The new .Net 3.5 framework comes with a 2.7 MB installer, but this is only a stub that downloads the frameworks required. The full set of frameworks weighs in at an eye-watering 197 MB. To find out how much the stub really downloads Giorgio installed .Net 3.5 onto a Windows 2003 VM with only .Net 1.0 & 1.1. The result: 67 MB. That is still a large download for most people, especially if your .Net 3.5 software is only a small utility. It is out of the question if you don’t have broadband. Microsoft no doubt justify this by saying that the majority of PCs will have .Net 3.5 pre-installed by the year X. Unfortunately by the year X Microsoft will probably be pushing .Net 5.5 and I dread to think how big that will be.

I have heard a lot of people touting the productivity benefits of C# and .Net, but the huge framework downloads can only be a major hurdle for customers, especially for B2C apps. You also have issues protecting your byte code from prying eyes, and you can pretty much forget cross-platform development. So I think I will stick to writing native apps in C++ for Windows for the foreseeable future.

There is no clear leader amongst the development ‘stacks’ (languages+libraries+tools) for native Win32 development at present. Those that spring to mind include:

  • Delphi - Lots of devoted fans, but will CodeGear even be here tomorrow?
  • VB6 - Abandoned and unloved by Microsoft.
  • Java - You have to have a Java Run Time installed, and questions still remain about the native look and feel of Java GUIs.
  • C++/MFC - Ugly ugly ugly. There is also the worry that it will be ‘deprecated’ by Microsoft.
  • C++/Qt - My personal favourite, but expensive and C++ is hardly an easy-to-use language. The future of Qt is also less certain after the Nokia acquisition.

Plus some others I know even less about, including: RealBasic and C++/WxWidgets. They all have their down sides. It is a tough choice. Perhaps that is why some Windows developers are defecting to Mac, where there is really only one game in town (Objective-C/Cocoa).

I don’t even claim that the opinions I express here are accurate or up-to-date. How could they be? If I kept up-to-date on all the leading Win32 development stacks I wouldn’t have any time left to write software. Of the stacks listed I have only used C++/MFC and C++/Qt in anger and my MFC experience (shudder) was quite a few years ago.

Given that one person can’t realistically hope to evaluate all the alternatives in any depth, we have to rely on our particular requirements (do we need to support cross platform?), hearsay, prejudice and which language we are most familiar with to narrow it down to a realistic number to evaluate. Two perhaps. And once we have chosen a stack and become familiar with it we are going to be loath to start anew with another stack. Certainly it would take a lot for me to move away from C++/Qt, in which I have a huge amount of time invested, to a completely new stack.

Which Windows development stack are you using? Why? Have I maligned it unfairly above?

[1] Bespoke software is a different story. If you have limited deployment of the software and can dictate the end-user environment then the big download is much less of an issue.

Your harddrive *will* fail - it’s just a question of when

There are a few certainties in life: death, taxes and harddisk failure. I have no fewer than 6 failed harddisks sitting here on my desk patiently awaiting their appointment with Mr Lump Hammer. 2 Seagates, 3 Maxtors and 1 Western Digital. This equates to roughly one disk failure per year. Perhaps this is not surprising given that I have about 9 working harddisks at the moment spread across various machines. Given the incredible tolerances to which harddisks are manufactured, perhaps it is a miracle harddisks work at all.

As an analogy, a magnetic head slider flying over a disk surface with a flying height of 25 nm with a relative speed of 20 meters/second is equivalent to an aircraft flying at a physical spacing of 0.2 µm at 900 kilometers/hour. This is what a disk drive experiences during its operation. -Magnetic Storage Systems Beyond 2000, George C. Hadjipanayis from Wikipedia

We all know we need to back-up our data. But it is a chore that often gets forgotten at the most critical periods. Here are my hints for preparing yourself for that inevitable ‘click of death’.

  • Buy an external USB/Firewire harddrive. 500GB drives are ridiculously cheap these days. Personally I don’t like back-up tapes due to experiences of them stretching and corrupting data.
  • Back-up images of the entire OS, not just the data. You can use Acronis TrueImage on Windows and SuperDuper on MacOSX. This can save you days restoring your entire development environment and applications from scratch.
  • Back-up individual files as well as entire OS images. You don’t want to have to restore a whole image to retrieve one critical file. Windows Vista and Mac OS X Leopard both have back-up applications built into the OS.
  • Use a separate machine to your development machine as a source code server.
  • Use a RAID-1 (mirrored) disk on your main development machine[1]. It is worth noting that this actually doubles the likelihood of harddisk failure, but makes the likelihood of a catastrophic failure much lower. Keep an identical 3rd drive on hand to swap in when a drive fails.
  • Back-ups aren’t much use if they get incinerated along with your office in a fire, so store copies off-site. For example you can:
  • Make sure any off-site copies are securely encrypted, for example using Axcrypt.
  • Automate your back-ups as far as possible. Computers are much better at the dull repetitive stuff.
  • Test restoring data once in a while. There is not much point backing up data only to find you can’t restore it when needed.
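The RAID-1 point in the list above can be checked with some simple arithmetic. The sketch below assumes each disk fails independently with the same yearly probability p; the value 0.1 is purely illustrative, and real failures are not perfectly independent.

```python
def raid1_yearly_risks(p):
    """Return (chance at least one of two disks fails, chance both fail).

    p is the assumed yearly failure probability of a single disk, with
    failures treated as independent. Both are simplifying assumptions.
    """
    at_least_one = 1 - (1 - p) ** 2  # roughly 2p when p is small
    both = p ** 2
    return at_least_one, both

p = 0.1  # illustrative figure only
at_least_one, both = raid1_yearly_risks(p)
print(f"single disk: {p:.2f}")
print(f"mirror, at least one disk fails: {at_least_one:.2f}")
print(f"mirror, both disks fail: {both:.2f}")
```

So on these assumptions a mirror nearly doubles the chance of having to deal with a failed disk, while the chance of losing both copies is much smaller. In practice the second figure is lower still if the failed disk is swapped out promptly, which is the reason for keeping a spare on hand.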

There are lots of applications for backing up individual files. So many in fact, that no-one has any hope of evaluating them all (marketing tip: don’t write another back-up application - really). I also worry that data stored in their various proprietary formats might not be accessible in future due to the vendor going out of business. I find the venerable DOS xcopy adequate for my needs. I run it in a scheduled Windows batch file to automatically synch file changes on to my usb harddrive (i:) every night. Here it is in all its glory:

XCOPY c:\data i:\data /d /i /s /v /f /y /g /EXCLUDE:exclude.txt

The exclude.txt file is used to exclude subversion folders and intermediate compiler files:

\.svn\
.obj
.ilk
.ncb
.pdb
.bak
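For anyone not on Windows, the core of what this xcopy command does (copy files that are newer than the back-up copy, skipping anything that matches the exclude list) can be approximated in a few lines of Python. This is a rough sketch rather than a drop-in replacement: the function name is made up, the exclude matching is a plain substring test on the source path, and the verify and display switches have no equivalent here.

```python
import os
import shutil

def sync_newer(src_root, dst_root, exclude_substrings):
    """Copy files that are missing from, or newer than, the back-up copy.

    Paths containing any of exclude_substrings are skipped.
    Returns the list of destination paths written.
    """
    copied = []
    for folder, _subdirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(folder, name)
            if any(pattern in src for pattern in exclude_substrings):
                continue
            dst = os.path.join(dst_root, os.path.relpath(src, src_root))
            if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
                continue  # the back-up copy is already up to date
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(dst)
    return copied

# Example call (paths are placeholders):
# sync_newer(r"c:\data", r"i:\data", ["\\.svn\\", ".obj", ".ilk", ".ncb", ".pdb", ".bak"])
```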

Which of the above do I do? Pretty much all of them actually. At least I try; I haven’t yet automated the offsite backup. This may seem rather excessive, but it paid dividends last month when gremlins went on the rampage here in the Oryx Digital office. I had 2 harddrive failures in 2 weeks. The power supply+harddisk+network card on my old XP development machine failed. Then, while I was in the process of moving everything to my new Vista development machine, one of the RAID-1 disks on the new machine failed.

Things didn’t go quite according to plan though. The new RAID-1 box wouldn’t boot from either harddisk. I have no idea why.


Also the last couple of weekly Acronis image back-ups had failed and I hadn’t done anything about it. I had recent back-ups of all the important data, but I faced a day or more reinstalling all the apps I had installed since the last successful image. It took several hours on the phone to Dell technical support and much crawling around on the floor before I could get the new RAID-1 box to boot off one harddisk. I was then able to rebuild RAID-1 using the spare harddisk I had on standby for such an eventuality. Nothing was lost, apart from my sense of humour.

Dell offered to replace the defective harddisk under warranty, but I declined on the grounds that there is far too much valuable information on this disk (source code, digital certificate keys, customer details etc) for me to entrust it to any third party. Especially given that Dell reserve the right to refurbish the harddisk and send it to someone else. What if they forgot to wipe it? My experiences with courier companies also haven’t given me great confidence that the disk would reach Dell. And I didn’t want to receive a refurbished disk as a replacement. It just isn’t worth relying on a refurb given how cheap new harddisks are. So the harddisk has joined the back of the growing queue to see Mr Lump Hammer.

The availability of cheap harddisks and cheap bandwidth means that it has never been easier to backup your systems. No more fiddling with mag tapes. Of course it is possible that your harddisk will work perfectly until it becomes obsolete, but I think it would be very unwise to assume that this will be the case. Don’t say I didn’t warn you…

Further reading:

What’s your backup strategy? (the prolific and always worth reading Jeff Atwood beats me to the punch)

[1] RAID-1 is built in to some Intel motherboards and is available as a relatively inexpensive extra from Dell. You may have to ask for it though - it wasn’t listed as a standard configuration option when I purchased my Dell Dimension 9200.

[2] Since I wrote this article I installed the latest version of JungleDisk on my Vista box. On the 3 occasions I have tried to use it, it hung Vista to the point where I had to cut the power in order to reboot. I have now uninstalled it.

 

Seeing your software through your customers’ eyes

We all like to think that our software is easy to use. But is it really? How do you know? Have you ever watched anyone use it? When I asked this question to a room full of developers last year I was surprised at how many hadn’t.

Other people don’t see the world the way you do. Their weltanschauung (view on the world) is influenced by their culture, education, expectations, age, gender and many other factors. Below is a copy of a card I received for my birthday a few weeks ago which I think illustrates the gulf between how developers and their customers see the world rather well.

[Image: birthday card]

If your customers are also developers the difference in backgrounds may not be so large. But the difference in how they see your software and how you see it is still huge. You have been working on your software for months or years. You know everything worth knowing about it down to the last checkbox and command line argument. But your potential customer is probably going to download it and play with it for just a few minutes, or a few hours if you are lucky, before they decide if it is the right tool for the job. If they aren’t convinced, your competitors are only a few clicks away. To maximise your chances of making a sale you need to see your software afresh through your customer’s eyes. You can get some useful feedback from support emails, but the best way to improve the ease of use of your software is to watch other people using it. This is usually known as usability testing.

The basic idea of usability testing is that you take someone with a similar background to your target audience, who hasn’t seen your software before and ask them to perform a typical series of tasks. Ideally they should try to speak out loud what they are thinking to give you more insight into their thought processes. You then watch what they do. Critically, you do not assist them, no matter how irresistible the urge. The results can be quite surprising and highly revealing. Usability testing can be very fancy with one way mirrors, video cameras etc, but that really isn’t necessary to get most of the benefits. There is a good description of how to carry out usability tests in Krug’s excellent book Don’t make me think: a common sense guide to web usability. Most of his advice is equally applicable to testing desktop applications.

The main problems with usability testing are logistical. You need to find the right test subjects and arrange the time and location for testing. You also need to decide how you are going to induce them to give up an hour of their time. Worst of all, once you have used someone they are ‘tainted’ and can’t be used again (except perhaps to test changes in the new versions). It’s a hassle. Or at least it was. Much of this hassle is now taken care of for you by a new web-based service, www.usertesting.com.

The idea behind usertesting.com is very simple. You buy a number of tests for your website and specify your website url, the tasks you want carried out and the demographics (e.g. preferred age, gender and expertise of testers). Testers are then selected for you and carry out the testing. Once tests have been completed a Flash audio+video recording of the session and a brief written report are uploaded for you. Finally you rate the testers on a 5-star scale. Presumably testers who score well will get more work in future. Ideally you should re-run your usability testing after any changes to verify that they are an improvement. I don’t know if usertesting.com allows for the fact that you probably won’t want the same tester a second time for the same project.

I paid $57 for 3 tests on perfecttableplan.com. I was happy with the tests, which pointed out a number of areas I can improve on. There was a problem which meant one of the tests still hadn’t been completed 4 days later. I emailed support and they sorted this out in a timely fashion. It is a new service and they are still ironing out a few glitches. Given the low costs and the 30 day money back guarantee I think it is definitely worth a try. It won’t take many extra conversions to repay your investment. usertesting.com is probably more useful to those of us selling to the wider consumer market. If you are selling to specialised niches (e.g. developers, actuaries, llama breeders) they might have difficulty finding suitable testers.

Unfortunately usertesting.com is currently only available for website usability testing. When I emailed them to suggest they extend the service to desktop apps they told me that this might be a possibility if there was sufficient interest. I will be first in line if such a service becomes available. Until then I am left with the hassle of organising my own usability tests. It occurs to me that I could do this remotely using a service such as copilot.com (now free at weekends) plus Skype. This might be a good workaround for the fact that my office isn’t really big enough for two people (especially if they don’t know me very well!). It would also allow me to do testing with customers outside the UK, e.g. professional wedding planners in the USA. If I do try this I will report back on how I get on.

Google Adwords: improving your ads

google adwords

One of the keys to success in Google Adwords (and other pay per click services) is to write good ad copy. This isn’t easy as the ads have a very restrictive format, reminiscent of a haiku:

  • 25 character title
  • 2×35 character description lines
  • 35 character display url
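The length limits above are easy to check mechanically before submitting an ad. The following is only a minimal sketch: the field names (`title`, `line1`, `line2`, `display_url`) and the sample ad text are made up for illustration, and it checks lengths only, not Google's other editorial rules.

```python
# Character limits taken from the list above.
# The field names are hypothetical, chosen for this sketch only.
LIMITS = {"title": 25, "line1": 35, "line2": 35, "display_url": 35}


def over_length(ad):
    """Return {field: excess_characters} for each field exceeding its limit."""
    excess = {}
    for field, limit in LIMITS.items():
        length = len(ad.get(field, ""))
        if length > limit:
            excess[field] = length - limit
    return excess


# Hypothetical ad copy, not a real ad.
sample_ad = {
    "title": "Wedding seating plans made easy",
    "line1": "Drag and drop table planner.",
    "line2": "Try it now!",
    "display_url": "www.myapp.com/freetrial",
}

print(over_length(sample_ad))  # the title is 31 characters, so 6 too many
```

An empty result means every field fits. Anything else tells you how many characters to trim from each field.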

What’s more, there are all sorts of rules about punctuation, capitalisation, trademarks etc. You will soon find out about these when you write ads. Most transgressions are flagged immediately by Google algorithms, others are picked up within a few days by Google staff (what a fun job that must be).

Google determines the order in which ads appear in their results using a secret algorithm based on how much you bid, how frequently people click your ads and possibly other factors, such as how long people spend on your site after clicking. Nobody really knows apart from Google, and they aren’t saying. The higher your click frequency, generally the higher your ad will appear. The higher your ad appears in the results, generally the more clicks you will get. So writing relevant ads is very important. This means that each adgroup should have a tightly clustered group of keywords and the ads should be closely targeted to those keywords.

There is no point paying for clicks from people who aren’t interested in your product, so you need to clearly describe what you are offering in the few words available. For example you might want to have a title “anti-virus software” instead of “anti-virus products” to ensure you aren’t paying for useless clicks from people looking for anti-viral drugs (setting “drugs” as a negative keyword would also help here).

I have separate campaigns for separate geographic areas. Each campaign contains the same keywords in the same adgroups, but with potentially different bid prices and ads. This allows me to customise the bid prices and ads for the different geographic areas. For example I can quote the £ price in a UK ad and the $ price in a US ad. Having separate campaigns for separate geographic areas is a hassle, but it is manageable, especially using Google Adwords editor.

Writing landing pages specific to each adgroup can also help to increase your conversion rate. It is worth noting that the ad destination url doesn’t have to match the display url. For example you could have a destination url of “http://www.myapp.com/html/landingpage1.html?ad_id=123” and a display url of “www.myapp.com/freetrial”.

Obviously what makes for good ad copy varies hugely with your market. Here are some things to try:

  • a call to action (e.g. “try it now!”)
  • adding/removing the price
  • different capitalisation and punctuation
  • keyword insertion (much beloved of eBay)
  • changing the destination url

But, as always, the only way to find out what really works is testing. Google have made this pretty easy with support for conversion tracking and detailed reporting. I run at least 2 ads in each adgroup and usually more. Over time I continually kill-off under-performing ads and try new ones. Often the new ads will be created by slight variations on successful ads (e.g. changing punctuation or a word) or splicing two successful ads together (e.g. the title from one and the body from another). This evolutionary approach (familiar to anyone that has written a genetic algorithm) gradually increases the ‘fitness’ of the ads. But you need to decide how to measure this fitness. Often it is obvious that one ad is performing better than another. But sometimes it can be harder to make a judgment. If you have an ad with a 5% click-through rate (CTR) and 0.5% conversion rate is this better than an ad with a 1% click-through rate and a 2% conversion rate? One might think so ( 5*0.5 > 1*2 ) but this is not necessarily the case. I think the key measure of how good an ad is comes from how much it earns you for each impression your keywords get.

I measure the fitness by a simple metric ‘profit per thousand impressions’ (PPKI) where, for a given time period:

PPKI = ( ( V * N ) - C ) / ( I / 1000 )

V = value of a conversion (e.g. product sale price)

N = number of conversions (e.g. product sales) from the ad

C = total cost of clicks for the ad

I = impressions for the ad

Say your product sells for $30. Over a given period you have 2 ads in the same adgroup that each get 40k impressions and clicks cost  an average of $0.10 per click.

  • ad1 has a CTR of 5%, a conversion rate of 0.5% and gets 10 conversions, which gives PPKI=$2.5 per thousand impressions
  • ad2 has a CTR of 1%, a conversion ratio of 2% and gets 8 conversions, which gives PPKI=$5 per thousand impressions

So ad2 made, on average, twice the profit per impression despite the lower number of conversions. Given this data I would replace ad1 with a new ad. Either a completely new ad or a variant of ad2.
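The worked example above can be reproduced in a few lines. This is a minimal sketch of the PPKI formula as defined earlier. The function and parameter names are my own, and the click and conversion counts are derived from the stated impressions, CTR and conversion rate.

```python
def ppki(value, conversions, click_cost, impressions):
    """Profit per thousand impressions: ((V * N) - C) / (I / 1000)."""
    return (value * conversions - click_cost) / (impressions / 1000)


def ad_summary(impressions, ctr, conversion_rate, cost_per_click, value):
    """Derive conversions and PPKI from impressions, CTR and conversion rate."""
    clicks = impressions * ctr
    conversions = clicks * conversion_rate
    click_cost = clicks * cost_per_click
    return conversions, ppki(value, conversions, click_cost, impressions)


# The two ads from the example: $30 product, 40k impressions, $0.10 per click.
ad1 = ad_summary(40_000, ctr=0.05, conversion_rate=0.005, cost_per_click=0.10, value=30)
ad2 = ad_summary(40_000, ctr=0.01, conversion_rate=0.02, cost_per_click=0.10, value=30)

print("ad1: %.0f conversions, PPKI $%.2f" % ad1)
print("ad2: %.0f conversions, PPKI $%.2f" % ad2)
```

ad1 spends $200 on 2,000 clicks to earn $300, while ad2 spends $40 on 400 clicks to earn $240. That is why ad2 comes out ahead despite making fewer sales.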

PPKI has the advantage of being quantitative and simple to calculate. You can just export your Google Adwords ‘Ad Performance’ reports to Excel and add a PPKI column. Some points to bear in mind:

  • Selling software isn’t free. You may want to subtract the cost of support, CD printing & postage, ecommerce fees, VAT etc from the sale price to give a more accurate figure for the conversion value.
  • PPKI doesn’t take account of the mysterious subtleties of Google’s ‘quality score’. For example an ad with low CTR and high conversion rate might conceivably have a good PPKI but a poor quality score. This could result in further decreases in CTR over time (as the average position of the ad drops) and rises in minimum bid prices for keywords.
  • PPKI is a simple metric I have dreamt up and I have no idea if anyone else uses it. But I believe it is a better metric than cost per conversion, or any of the other standard Google metrics.

To ensure that all your ads get shown evenly select ‘Rotate: Show ads more evenly’ in your Adwords campaign settings. If you leave it at the default ‘Optimize: Show better-performing ads more often’ Google will choose which ads show most often. Given a choice between showing the ads that make you most money and the ads which make Google most money, which do you think Google will choose?

Text ads aren’t the only type of ads. Google also offer other types of ads, including image and video ads. I experimented with image ads a few years ago, but they got very few impressions and didn’t seem worth the effort at the time. I may experiment with video ads in the future.

The effectiveness of ads can vary greatly. Back in mid-December I challenged some Business Of Software forum regulars to ‘pimp my adwords’ with a friendly competition to see who could write the best new ads for my own Google Adwords marketing. The intention was to inject some fresh ‘genes’ into my ad population while providing the participants with some useful feedback on what worked best. Although it is early days, the results have already proved interesting (click the image for an enlargement):

adwords ad results

The graph above shows the CTR v conversion ratio of 2 adgroups, each running in both USA and UK campaigns. Each blue point is an ad. The ads, keywords and bid prices for each ad group are very similar in each country (any prices in the ads reflect the local currency for the campaign). Points to note:

  • There were enough clicks for the CTR to be statistically significant, but not for the conversion rate (yet).
  • The CTRs vary considerably within the same campaign+adgroup, often by a factor of more than 3.
  • Adgroup 1 performs much better in the USA than in the UK. The opposite is true for adgroup 2.
  • Adgroup 1 for the USA shows an inverse correlation between CTR and conversion rate. I often find this is the case - more specific ads mean lower CTR but higher conversion rates and higher profits.
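One common way to judge whether a difference in CTR or conversion rate between two ads is statistically significant is a two-proportion z-test. The sketch below is an assumption on my part about how such a check could be done, not a description of how the figures above were assessed. The counts in it are invented for illustration, and the normal approximation it relies on is rough when counts are as small as typical conversion numbers.

```python
import math


def two_proportion_z(successes_a, trials_a, successes_b, trials_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value) using the pooled-proportion normal approximation.
    """
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical counts: clicks out of impressions, then sales out of clicks.
z_ctr, p_ctr = two_proportion_z(120, 4000, 60, 4000)
z_conv, p_conv = two_proportion_z(3, 120, 2, 60)

print("CTR difference:        z=%.2f p=%.4f" % (z_ctr, p_ctr))
print("conversion difference: z=%.2f p=%.4f" % (z_conv, p_conv))
```

With these made-up numbers the CTR difference is clear-cut, but the conversion difference could easily be chance. This is the usual situation early on, because there are far fewer sales than clicks.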

‘Pimp my adwords’ will continue for a few more months before I declare a winner. I will be reporting back on the results in more detail and announcing the winner in a future post. Stay tuned.

    Optimising your application

    When I first released PerfectTablePlan I considered 50-200 guests as a typical event size, with 500+ guests a large event. But my customers have been using the software for ever larger events, with some as large as 3000 guests. While the software could cope with this number of guests, it wasn’t very responsive. In particular the genetic algorithm I use to optimise seating arrangements (which seats people together or apart, depending on their preferences) required running for at least an hour for the largest plans. This is hardly surprising when you consider that seating assignment is a combinatorial problem in the same NP-hard class as the notorious travelling salesman problem. The number of seating combinations for 1000 guests in 1000 seats is 1000!, which is a number with 2,568 digits. Even the number of seating combinations for just 60 guests is more than the number of atoms in the known universe. But customers really don’t care about how mathematically intractable a problem is. They just want it solved. Now. Or at least by the time they get back from their coffee. So I made a serious effort to optimise the performance in the latest release, particularly for the automatic seat assignment. Here are the results:

    ptp308_vs_ptp_310.png

    Total time taken to automatically assign seats in 128 sample table plans varying in size from 0 to 1500 guests

    The chart shows that the new version automatically assigns seats more than 5 times faster over a wide range of table plans. The median improvement in speed is 66%, but the largest plans were solved over ten times faster. How did I do it? Mostly by straightening out a few kinks.
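As an aside, the size of the search space mentioned earlier is easy to check, because Python integers have arbitrary precision. This is a small sketch only. The figure of 10^80 for the number of atoms in the known universe is a commonly quoted rough estimate that I am assuming here, not a number from this article.

```python
import math

# Number of ways to seat 1000 guests in 1000 seats.
digits_in_1000_factorial = len(str(math.factorial(1000)))
print(digits_in_1000_factorial)

# Assumed rough estimate of the number of atoms in the known universe.
ATOMS_IN_UNIVERSE_ESTIMATE = 10 ** 80

# Seating combinations for just 60 guests compared with that estimate.
print(math.factorial(60) > ATOMS_IN_UNIVERSE_ESTIMATE)
```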

    Some years ago I purchased my first dishwasher. I was really excited about being freed from the unspeakable tyranny of having to wash dishes by hand (bear with me). I installed it myself - how hard could it be? It took 10 hours to do a wash cycle. Convinced that the dishwasher was faulty I called the manufacturer. They sent out an engineer who quickly spotted that I had kinked the water inlet pipe as I had pushed the dishwasher into place. It was taking at least 9 hours to get enough water to start the cycle. Oops. As soon as the kink was straightened it worked perfectly, completing a cycle in less than an hour. Speeding up software is rather similar - you just need to straighten out the kinks. The trick is knowing where the kinks are. Experience has taught me that it is pretty much impossible to guess where the performance bottlenecks are in any non-trivial piece of software. You have to measure it using a profiler.

    Unfortunately Visual Studio 2005 Standard doesn’t seem to include profiling tools. You have to pay for one of the more expensive versions of Visual Studio to get a profiler. This seems rather mean. But then again I was given a copy of VS2005 Standard for free by some nice Microsofties - after I had spent 10 minutes berating them on the awfulness of their “works with vista” program (shudder). So I used an evaluation version of LTProf. LTProf samples your running application a number of times per second, works out which line and function is being executed and uses this to build up a picture of where the program is spending most time.

    After a bit of digging through the results I was able to identify a few kinks. Embarrassingly one of them was that the automatic seat assignment was reading a value from the Windows registry in a tight inner loop. Reading from the registry is very slow compared to reading from memory. Because the registry access was buried a few levels deep in function calls it wasn’t obvious that this was occurring. It was trivial to fix once identified. Another problem was that some intermediate values were being continually recalculated, even though none of the input values had changed. Again this was fairly trivial to fix. I also found that one part of the seat assignment genetic algorithm took time proportional to the square of the number of guests ( O(n^2) ). After quite a bit of work I was able to reduce this to a time linearly proportional to the number of guests ( O(n) ). This led to big speed improvements for larger table plans. I didn’t attempt any further optimisation as I felt I was getting into diminishing returns. I also straightened out some kinks in reading and writing files, redrawing seating charts and exporting data. The end result is that the new version of PerfectTablePlan is now much more usable for plans with 1000+ guests.
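The first kind of kink, a slow lookup hidden inside a tight loop, can be illustrated with a toy example. This is not PerfectTablePlan code. `read_setting` is a hypothetical stand-in for a slow read such as a registry access, and the scoring functions are invented purely to show the lookup being hoisted out of the loop.

```python
calls = {"n": 0}  # counts how often the slow lookup is performed


def read_setting(name):
    """Hypothetical stand-in for a slow lookup (e.g. a registry read)."""
    calls["n"] += 1
    return {"apart_penalty": 3}[name]


def score_slow(pairs):
    """Looks the setting up again on every iteration of the loop."""
    total = 0
    for a, b in pairs:
        total += read_setting("apart_penalty") * abs(a - b)
    return total


def score_fast(pairs):
    """Reads the setting once, then uses the in-memory value."""
    penalty = read_setting("apart_penalty")
    total = 0
    for a, b in pairs:
        total += penalty * abs(a - b)
    return total


pairs = [(1, 4), (2, 2), (5, 0)]

calls["n"] = 0
slow_result, slow_calls = score_slow(pairs), calls["n"]

calls["n"] = 0
fast_result, fast_calls = score_fast(pairs), calls["n"]

print(slow_result == fast_result, slow_calls, fast_calls)
```

Both versions give the same answer. The difference is only in how many times the slow lookup runs, which is the kind of thing a profiler makes visible and code inspection often misses.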

    I was favourably impressed with LTProf and will probably buy a copy next time I need to do some optimisation. At $49.95 it is very cheap compared to many other profilers (Intel VTune is $699). LTProf was relatively simple to use and interpret, but it did have quirks. In particular, it showed some impossible call trees (showing X called by Y, where this wasn’t possible). This may have been an artefact of the sampling approach taken. I will probably also have a look at the free MacOSX Shark profiler at some point.

    I also tried tweaking compiler settings to see how much difference this made. Results are shown below. You can see that there is a marked difference with and without compiler optimisation, and a noticeable difference between the -O1 and -O2 optimisations (the smaller the bar, the better, obviously):

    vs2005_optimisation_speed.png

    Effect of VS2005 compiler optimisation on automatic seating assignment run time

    Obviously the results might be quite different for your own application, depending on the types of calculations you are doing. My genetic algorithm requires large amounts of integer arithmetic and list traversal and manipulation.

    The difference in executable sizes due to optimisation is small:

    vs2005_optimisation_size.png

    I tried two other optimisation flags in addition to -O2:

    • /OPT:NOWIN98 - section alignment does not have to be optimal for Windows 98.
    • /GL - turns on global optimisation (e.g. across source files, instead of just within source files).

    Neither made much noticeable difference:

    vs2005_additional_opt.png

    However it should be noted that most of the genetic algorithm is compiled in a single file already, so perhaps /GL couldn’t be expected to add much. I compared VC++6 and VS2005 versions of the same program and found that VS2005 was significantly faster[1]:

    vc6_vs_vs2005_optimisation_speed1.png

    I also compared GCC compiler optimisation for the MacOSX version. Compared with VS2005 GCC has a more noticeable difference between optimised and unoptimised, but a smaller difference between the different optimisations:

    gcc_optimisation_speed.png

    Surprisingly -O3 was slower than -O2. Again the effect of optimisation on executable size is small.

    gcc_optimisation_size2.png

    I also tested the relative speeds of my 3 main development machines[2]:

    relative-machine-speed.png

    It is interesting to note that the XP box runs the seat assignment at near 100% CPU utilisation, but the Vista box never goes above 50% CPU utilisation. This is because the Vista box is a dual core, but my seat assignment is currently only single threaded. I will probably add multi-threading in a future version to improve the CPU utilisation on multi-core machines.

    In conclusion:

    • Don’t assume, measure. Use a profiler to find out where your application is spending all its time. It almost certainly won’t be where you expected.
    • Get the algorithm right. This can make orders of magnitude difference to the runtime.
    • Compiler optimisation is worth doing, perhaps giving a 2-4 times speed improvement over an application built without compiler optimisation. It probably isn’t worth spending too much time tweaking compiler settings though.
    • Don’t let a software engineer fit your dishwasher.

    Further reading:

    “Programming pearls” by Jon Bentley, a classic book on programming and algorithms

    “Everything is fast for small n” by Jeff Atwood on the Coding Horror blog

    [1] Not strictly like-for-like as the VC++6 version used dynamic Qt libraries, while the VS2005 version used static Qt libraries.

    [2] I am assuming that VS2005 and GCC produce comparably fast executables when both set to -O2.

    Beware upgradeware

    fungi.jpg

    Some years back my wife bought a PC and got a ‘free’ inkjet printer with it. It was a really lousy printer, but hey, it was free. When it ran out of ink we tried to get a new inkjet cartridge, but the cheapest set of cartridges we could find was £80. That was 4 times the price of other comparable cartridges at the time. Some further research showed that you could buy the printer for £20 - with cartridges! Their ugly sales tactics didn’t work. We threw it in the dustbin and bought an Epson inkjet, which gave years of sterling service using third party sets of cartridges costing less than £10.

    When I started my company I had a thousand decisions to make. One of them was which software to use to create and maintain my new product website. It just so happened that my new ISP (1and1.co.uk) was offering a bundle of ‘free software worth £x’ when you signed up (I forget the amount). It included a web design package (NetObjects Fusion 8) and an FTP package (WISE-FTP). Hoorah, free (as in beer) software and 2 fewer decisions to make. I was weak. Instead of spending time checking out reviews and evaluating competitors, I just installed and started using them. It didn’t occur to me that they might be using the same sales tactics as the manufacturer of the lousy printer. In this imperfect world, if something appears too good to be true, it usually is. And so it was in this case. I grew to hate both these pieces of software.

    WISE-FTP was just flaky. It kept crashing and displaying German error messages, despite the fact that I had installed the English version. No problem, I just uninstalled and installed FileZilla which is free (as in beer and speech), stable and does everything I need and more.

    NetObjects Fusion was flaky and hard to use. By saving after every edit I could minimise the effects of the regular crashes and I assumed that I would learn how to work around other problems in time. But I never did. By the time I decided that the problems were more due to the shortcomings of NetObjects Fusion as a software package, rather than my (many) shortcomings as a web designer, it was a little late. I had already created an entire website, which was now stored in NetObjects Fusion’s proprietary database. Some of the bugs in NetObjects Fusion are so major that one wonders how much testing the developers did. My ‘favourite’ is the one where clicking a row in a table causes the editor to scroll to the top of the table. This is infuriating when you are editing a large table (my HTML skills haven’t yet reached the 21st century).

    In despair I eventually paid good money to upgrade to NetObjects Fusion 10. Surely it would be more stable and less buggy after two major version releases? Bzzzzt, wrong. The table scrolling bug is still there and it crashed 3 times this morning in 10 minutes. Also, every time I start it up the screen flashes and I get the ominous Vista warning message “The color scheme has been changed to Windows Vista Basic. A running program isn’t compatible with certain visual elements of Windows”. Even just trying to buy the software upgrade off their website was a confusing nightmare. The trouble is that it is always easier in the short-term to put up with NetObjects Fusion’s many shortcomings than to create the whole site anew in another package.

    For want of a better term I call this sort of software ‘upgradeware’ - commercial software that is given away free in the hope that you will buy upgrades. This is quite distinct from the ‘try before you buy’ model, where the free version is crippled or time-limited, or freeware, for which there is no charge ever. Upgradeware is the software equivalent of giving away a printer in the hope that you will buy overpriced cartridges. Only it is less risky, as the cost of giving away the software is effectively zero. It seems to be a favoured approach for selling inferior products and it is particularly successful when there is some sort of lock-in. It certainly worked for NetObjects in my case.

    Norton Anti-virus are the masters of upgradeware. Norton Anti-virus frequently comes pre-installed on new PCs with a free 1-year subscription. The path of least resistance is to pay for upgrades when your free subscription runs out. By doing these deals with PC vendors, Symantec sell vast amounts of subscriptions, despite the fact that Norton Anti-virus has been shown in test after test to be more bloated and less effective than many of its competitors. And if you think Norton Anti-virus doesn’t have any lock-in, just try uninstalling it and installing something else. It is almost impossible to get rid of fully. Last time I tried I ended up in a situation where it said I couldn’t uninstall it, because it wasn’t installed, and I couldn’t re-install, because it was still installed.

    I feel slightly better now that I have had a rant about some of my least favourite software. But there is also a more general point - ‘free’ commercial software can end up being very expensive. Time is money and I hate to think how much time I have wasted struggling with upgradeware. So be very wary of upgradeware, especially if there is any sort of lock-in. When I purchased a new Vista PC, the first thing I did was to reinstall Vista to get rid of all the upgradeware that Dell had installed (Dell wouldn’t supply it to me without it). You could also draw the alternative conclusion that upgradeware might be a good approach for making money from lousy software. But hang your head in shame if you are even thinking about it. It would be better for everyone if you just created a product that was good enough for customers to pay for it up-front.

    Ps/ If you fancy the job of converting www.perfecttableplan.com to beautiful sparkly clean XHTML/CSS and your rates are reasonable - feel free to contact me with a quote.

    The other side of the interface

    all_seeing_eyes.jpg

    While researching my talk on usability for ESWC2007 I came across this article I wrote some years ago. It has quite a lot of material I would have liked to have included, but there is only so much you can fit into a 60 minute talk. I am putting it here as a supplement to the talk and as a tribute to the late lamented EXE magazine which first published the article in June 1998. EXE went under in 2000 and I can’t find anyone to ask permission to republish it. I think they wouldn’t have minded. It is quite a long article and may be truncated by feed readers. Click through to the site to read the whole article.

    It has been said that if users were meant to understand computers they would have been given brains. But, in fairness to users, the problem is often that interfaces are not designed to take account of their strengths and weaknesses. I have struggled with my fair share of dire user interfaces, and I’m supposed to be an expert user.

    An interface is, by definition, a boundary between two systems. On one side of a user interface is the computer hardware and software. On the other side is the user with (hopefully) a brain and associated sensory systems. To design a good interface it is necessary to have some understanding of both of these systems. Programmers are familiar with the computer side (it is their job after all) but what about the other side? The brain is a remarkable organ, but to own one is not necessarily to understand how it works. Cognitive psychologists have managed to uncover a fair amount about thought processes, memory and perception. As computer models have played quite a large role in understanding the brain, it seems only fair to take something back. With apologies to psychologists everywhere, I will try to summarise some of the most important theory in the hope that this will lead to a better understanding of what makes a good user interface. Also, I think it is interesting to look at the remarkable design of a computer produced by millions of years of evolution, and possibly the most sophisticated structure in the universe (or at least in our little cosmic neighbourhood).

    The human brain is approximately 1.3kg in weight and contains approximately 10,000,000,000 neurons. Processing is basically digital, with ‘firing’ neurons triggering other neurons to fire. A single neuron is rather unimpressive compared with a modern CPU. It can only fire a sluggish maximum of 1000 times a second, and impulses travel down it a painfully slow maximum of 100 meters per second. However, the brain’s architecture is staggeringly parallel, with every neuron having a potential 25,000 interconnections with neighbouring neurons. That’s up to 2.5 x 10^14 interconnections. This parallel construction means that it has massive amounts of store, fantastic pattern recognition abilities and a high degree of fault tolerance. But the poor performance of the individual neurons means that the brain performs badly at tasks that cannot be easily parallelised, for example arithmetic. Also the brain carries out its processing and storage using a complex combination of electrical, chemical, hormonal and structural processes. Consequently the results of processing are probabilistic, rather than deterministic and the ability to store information reliably and unchanged for long periods is not quite what one might hope for.

    Perhaps unsurprisingly, the brain has a similar multi-level storage approach to a modern computer. Where a computer has cache, RAM and hard-disk memory (in increasing order of capacity and decreasing order of access speed) the brain has sensory memory, short-term memory and long-term memory. Sensory memory has a large capacity, but a very short retention period. Short-term memory has a very small capacity but can store and retrieve quickly. Long-term memory has a much larger capacity, but storage and retrieval is more difficult. New information from sensory memory and knowledge from long-term memory are integrated with information in short-term memory to produce solutions.

    memory_model.gif

    A simple model of memory and problem solving[1].

    Sensory memory acts like a huge register, retaining large amounts of sensory data very briefly so that it can be processed into a meaningful form, e.g. to recognise a face, which is transferred to short-term memory. The sensory data is then quickly replaced with new incoming data.

    Short-term memory acts like a very small queue with a limited retention period. It can hold only 7±2 items of information, with new items added into short-term memory displacing older ones once this limit has been reached. Items disappear after approximately 30 seconds if not rehearsed. The items of information in short-term memory act as ‘pointers’ to arbitrarily large and complex pieces of information stored in long-term memory. For example the seventh of January is one chunk for me (it’s my birthday), 2 chunks to you (one for each familiar word) and 14 chunks for a non-English speaker familiar with our alphabet (one for each character). The number 7±2 may seem rather arbitrary, but experimentation shows it is remarkably consistent across a wide range of individuals and cultures. Short-term memory acts as a workspace for problem solving. The more items that are held in short-term memory the longer it takes to process them.
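For programmers, the ‘very small queue’ analogy maps directly onto a bounded queue. The sketch below is only an illustration of that analogy, not a model of real memory: the `remember` helper and the capacity of 7 (the middle of the 7±2 range) are assumptions chosen for the example, and it ignores the 30 second decay and the effect of rehearsal.

```python
from collections import deque


def remember(items, capacity=7):
    """Feed items into a bounded queue; the oldest are displaced when full."""
    short_term = deque(maxlen=capacity)
    for item in items:
        short_term.append(item)
    return list(short_term)


# Ten items presented one after another to a 7-slot workspace.
retained = remember(range(1, 11))
print(retained)  # [4, 5, 6, 7, 8, 9, 10] - items 1 to 3 have been displaced
```

The point for interface design is the same as for the queue: whatever the user was asked to keep in mind first is the first thing to be lost.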

    It is important not to overload short-term memory. The limited size of short-term memory is a critical bottleneck in problem solving and one of the main constraints to consider for any user interface (designed for human users at least). Don’t force the user to try to hold lots of items in short-term memory. If they have to think about more than 7±2 items then new items will displace old ones. Also the more items that are in short-term memory the slower their response time will be. Having lots of ‘open’ tasks puts a big burden on short-term memory, so tasks should be grouped into well-defined ‘transactions’. Complex tasks can almost always be broken down into simpler sub-tasks.

    Long-term memory acts like a huge network database. It has a complex structure and massive capacity, but storing and retrieving information is slow and not always reliable. Items of information are apparently interconnected and accessed by some form of pointer. Some psychologists believe that long-term memory may be permanent, and only the ability to retrieve it may be lost (a bad case of ‘dangling pointers’ perhaps?). Dreaming may be a side-effect of routine re-structuring of long-term memory (garbage collection?) while we are asleep. Transferring information to long-term memory seems to be a process of encoding the memory and creating pointers to access it. The more often an item of information is accessed the easier it becomes to access in future. Each item of information may be accessible by many different routes. Consequently the context in which information is presented can be an important factor in remembering. The more context cues that are available the easier it is to retrieve an item from long-term memory. For example, experiments show that students perform better in exams held in the classroom where they learnt the information than elsewhere. So if an item was presented in a particular font, colour and size, it will be easier to remember its meaning if the same font, colour and size are used.

    There is some evidence that image and verbal memories are stored in different parts of the brain. We can generally remember the faces of people we have met better than their names. Experiments show that it is easier to remember an image than a concrete word, for example it is easier to remember ‘car’ when shown an image of a car than when shown the word ‘car’. It is also easier to remember a concrete word than an abstract word, for example it is easier to remember the word ‘car’ than the word ‘transport’. This implies that the iconic representation of commands on toolbars has value beyond just looking nice. Also keywords used in a command line interface should where possible be concrete, rather than abstract.

    The different types of memory are stored using different physical mechanisms, probably electrical, chemical and structural. As proof of this you can train an animal to run a maze, cool it down to the point where all brain activity ceases and then warm it up again. It will have forgotten how to run the maze, but remember things it learnt days before (I don’t recommend you try this with users). Also some diseases have been observed to affect short-term memory without affecting long-term memory. Transferring information from short-term to long-term memory and retrieving it again is not very reliable. It is better to allow the user to select from alternatives rather than force them to commit items to long-term memory and then retrieve them. At work, the interface of our old accountancy package had many shortcomings. Projects had to be identified by 5 digit numerical codes, even though alphabetic codes would have been easier to remember. Users also had to enter project numbers from memory; no facility for selecting from available projects was provided. It wouldn’t have taken much effort to produce a better interface design, just a little thought. For example the Microsoft Word print dialog cues the user as to the permitted format for specifying pages to be printed.

    example.gif

    A useful aid to memory.
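    The accountancy package could have offered recognition instead of recall with very little code. The sketch below is one possible approach, not the package’s actual design: the project codes, project names and the function `suggest_projects` are all invented for illustration.

```python
# Hypothetical project list; a real package would read this from its database.
PROJECTS = {
    "10423": "Harbour survey",
    "10517": "Payroll migration",
    "20391": "Harbour dredging",
}


def suggest_projects(typed, projects):
    """Return sorted (code, name) pairs whose code or name contains the typed text.

    An empty string matches everything, so the user can always browse
    the full list instead of recalling a code from memory.
    """
    needle = typed.strip().lower()
    return sorted(
        (code, name)
        for code, name in projects.items()
        if needle in code or needle in name.lower()
    )


for code, name in suggest_projects("harb", PROJECTS):
    print(code, name)
```

    The user only has to recognise the right project in a short list, which is far more reliable than retrieving a 5 digit number from long-term memory.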

    The brain gets its input from the outside world through the senses. Of the senses vision is the most important, with some 70% of all sensory receptors in the eyes. The importance of vision is also reflected in the design of modern computers. Other than the odd beep the computer communicates with the user almost entirely through the VDU. Consequently I will confine the scope of the discussion on the senses to vision alone.

    The eye is an impressive sensing device by any standards. Tests show that it is possible for a human eye to detect a candle flame at a range of 30 miles on a dark, still night. This corresponds to detecting a signal as low as a few photons entering the eye. Incoming light is focused onto the retina at the back of the eye, which contains the light receptors. The retina is actually an extension of the brain. Observation of growing embryos shows that the tissue that forms the retina extends from the brain; it is not formed from the tissue that turns into the rest of the eye. The retina contains some 5 million ‘cone’ receptors and 100 million ‘rod’ receptors. The cones are sensitive to colour, while the rods are sensitive to brightness. Some cones are sensitive to red, some to green and some to blue, depending on the pigment they contain. The cones are much more liberally supplied with nerve cells and are able to discern detail, but they don’t function in low light levels. The cones are densest in the centre of the retina, and virtually absent at the outer edge. The fovea centralis, a spot 1 millimetre across at the centre of the retina, contains some 200,000 cones and no rods. The rods only detect light at the blue end of the spectrum, but they are extremely sensitive and can detect a single photon of light. The uneven distribution of rods and cones is easy to test. Look slightly away from this page and try to read it ‘out of the corner of your eye’ – it’s not possible. Only the fovea has sufficient acuity to discern this level of detail. You may also notice that it is easiest to see poorly illuminated objects out of the corner of your eye. A very dim star, visible out of the corner of your eye, disappears when looked straight at.

    Because the fovea is so small we are only able to distinguish detail over a range of approximately 2 degrees. This equates to about 2.5cm at the normal distance from user to VDU. To build up a detailed picture of what is on the screen we have to scan it. It therefore makes sense to have single items on the interface not bigger than 2.5cm, so they can be recognised without having to scan them. Games and simulators that perform real-time rendering are wasting a lot of processing power by rendering the whole picture at the same level of detail. What they should ideally be doing is performing very detailed rendering at the point where the user’s fovea is pointing and progressively less detailed rendering further away from this. This would allow a much more efficient use of available processing power. It is possible to detect where the user is looking by bouncing an infrared beam off their retina. If this technology becomes widely available it could be used to perform differential rendering, with the result appearing much more detailed without any increase in processing power.
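    The 2.5cm figure follows from simple trigonometry. The sketch below assumes a viewing distance of 70cm (my assumption, the figure is not stated above) and also shows one hypothetical way a renderer might reduce detail away from the point of gaze. The 1/eccentricity falloff in `detail_level` is an invented placeholder, not a measured property of the eye.

```python
import math


def size_for_visual_angle(distance_cm, angle_deg):
    """Width on the screen that subtends angle_deg at the given distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def eccentricity_deg(offset_cm, distance_cm):
    """Angle between the point of gaze and a point offset_cm away on screen."""
    return math.degrees(math.atan2(offset_cm, distance_cm))


def detail_level(offset_cm, distance_cm, foveal_radius_deg=1.0):
    """Hypothetical falloff: full detail (1.0) inside the fovea, less outside."""
    ecc = eccentricity_deg(offset_cm, distance_cm)
    if ecc <= foveal_radius_deg:
        return 1.0
    return foveal_radius_deg / ecc


# Assumed viewing distance of 70cm: roughly 2.4cm is covered by 2 degrees.
print(f"{size_for_visual_angle(70, 2):.2f} cm")
print(f"{detail_level(10, 70):.2f}")
```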

    The receptors in the retina, in common with other sense receptors, are only sensitive to change. Using special optical equipment it is possible to project a ‘stabilised’ image onto the retina that does not change, regardless of eye movements. A stabilised image fades to a formless grey and is no longer discernible after only 2-3 seconds. It turns out that the constant movement of the eye, originally thought to be an imperfection of the optical system, is essential for sensing unchanging images. Perversely, light has to pass through 9 layers of nerve cells and blood vessels in the retina before it reaches the light receptors (I guess evolution isn’t perfect). Because the network of nerves and blood vessels is unchanging, we don’t normally perceive it[2]. The practical consequence is that any form of movement, animation, change in intensity or flashing on a user interface is extremely noticeable. Flashing should be used sparingly as it can be distracting and fatiguing to users. Quickly changing text is also difficult to read; this is why, in our digital age, car speedometers remain as analogue dials rather than numerical LEDs. It may be better to put a flashing symbol next to steady text; this draws attention to the text without reducing its legibility. Smith and Mosier[3] recommend a flash rate of 2-5 Hz, with a minimum ‘on’ time of at least 50 percent. Large flashing areas of colour are believed to aggravate epilepsy (particularly at certain frequencies) and should not be used.
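    The flash-rate guideline is easy to enforce in code. The sketch below turns a rate and an ‘on’ fraction into timer durations and rejects values outside the recommended range. The function name `flash_timing` and its error messages are my own; only the 2-5 Hz and 50 percent limits come from the guideline.

```python
def flash_timing(rate_hz, on_fraction=0.5):
    """Return (on_ms, off_ms) for one flash cycle.

    Raises ValueError if the rate is outside 2-5 Hz or the symbol would be
    visible for less than half of each cycle.
    """
    if not 2 <= rate_hz <= 5:
        raise ValueError("flash rate should be between 2 and 5 Hz")
    if not 0.5 <= on_fraction < 1:
        raise ValueError("'on' time should be at least 50% of the cycle")
    period_ms = 1000.0 / rate_hz
    on_ms = period_ms * on_fraction
    return on_ms, period_ms - on_ms


on_ms, off_ms = flash_timing(4, on_fraction=0.75)
print(f"on for {on_ms} ms, off for {off_ms} ms")
```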

    While sensation happens in the eye, perception happens in the brain. The receptors in the retina convert the light to electrical signals which they pass to the brain through the optic nerve, a bundle of approximately 1,000,000 neurons. The information is processed in the visual cortex, the surface of the brain at the back of the head. Our perception is incredibly sophisticated, as artificial intelligence researchers have found to their cost. Experiments on the cortex show that it has evolved with built-in ‘feature detectors’. A feature detector is a neuron that fires for a very particular stimulus. For example, one neuron in the visual cortex may fire if there is a horizontal line at the top-left of the visual field. Next to it will be a neuron that fires for a slightly different orientation, length or position. Additional processing is then carried out to integrate all the information from the different feature detectors.

    As you are reading this page your eye is making rapid movements, with your brain recognising the shape of 2-3 words at a time before moving on to the next group of words (the maximum number of words recognised at a time presumably being limited by the size of the fovea). This is apparently being done by information from different feature detectors being integrated very quickly. For example the word ‘FIX’ can be broken down into six straight lines at different positions in the visual field. We are able to recognise this word in about a third of a second, even though the size and font may vary. Shape recognition is therefore incredibly efficient and seems to be one of the best developed features of our visual system. Tests show that objects can be recognised just as well from line drawings as from colour photographs. A cup is recognisable as a cup because of its shape, not because of its colour, orientation etc. Textual representations are not always the best way to convey information. A map, chart, diagram or other form of image will often convey the same information quicker.

    icons-in-explorer.gif

    The use of icons in Windows Explorer makes it easier to browse document types than would be possible by reading the file extensions.

    Tests show that our ability to pick out simple features such as length, orientation, curvature and brightness is carried out at a very low level, in parallel. Consequently we can pick out items based on these features in a constant time, regardless of the number of other items in the image. Careful use of these abilities allows a great deal of information to be filtered very rapidly by the user.

    shapes1.gif

    The anomalous shape is detected as quickly in b) as in a), even though there are three times as many targets.

    But the brain is not so good at integrating (‘conjoining’) different types of feature, for example shape and brightness. It is easy to pick out a black icon or a circular icon, but picking out a black circular icon is more difficult and time consuming.

    shapes2.gif

    Time taken to pick out the black circle increases as the number of targets increases.

    It follows from this that you should try to distinguish features of the interface by shape or brightness or orientation, but not a combination of these factors.

    optical.gif

    a) the horizontal and vertical lines are the same length. b) the vertical lines are the same length.

    The visual cortex carries out a great deal of processing that we are unaware of, not least of which is turning the image of the world the right way up. Even though we can understand the nature of illusions, our visual system is still fooled. This is because it is not just sensing the world, but trying to interpret it, making use of all sorts of cues and in-built knowlege, and this is happening at a lower level than we can consciously control. You may not have even noticed that there was a deliberate spelling mistake in the last sentence because your perceptual system made a sensible guess.

    Although the image projected onto our retina is two dimensional we have very well developed depth perception; our ancestors wouldn’t have been able to swing through the trees without it. Partly this is because having two eyes allows stereoscopic vision, but also because our brain processes lots of other visual cues that produce a sensation of depth, even where it doesn’t exist (for example in a photograph). The main cues are:

    • More distant objects are smaller
    • More distant objects appear closer to the ‘vanishing point’ created by converging parallels
    • More distant objects move across the visual field more slowly
    • Superposition, if A overlaps B then A must be closer
    • Shadows and highlights
    • Chromostereopsis, long wavelength colours (e.g. red) appear closer than shorter wavelength colours (e.g. blue) because shorter wavelength light is refracted more strongly by the lens of the eye (but this is rather weak compared to the other effects)

    depth-cues.gif

    Use of depth cues makes one shape appear closer than the other.
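    The first and third cues in the list (distant objects look smaller and move more slowly) both fall out of the same perspective division. The sketch below uses a simple pinhole-camera model. The function name and the `focal_length` parameter are assumptions for illustration, and real 3D toolkits do this with projection matrices.

```python
def projected_size(real_size, distance, focal_length=1.0):
    """Apparent size of an object under a simple pinhole projection.

    The same scaling applies to on-screen speed: an object twice as far
    away looks half as big and crosses the view half as fast.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return real_size * focal_length / distance


near = projected_size(2.0, 4.0)  # object 4 units away
far = projected_size(2.0, 8.0)   # the same object 8 units away
print(near, far)
```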

    Using these cues can give a very effective illusion of depth, without specialised equipment such as stereoscopic goggles. This built-in depth perception is currently taken advantage of only in a very limited way in most GUI environments, for example the use of highlights and shadows to imply a three dimensional element for controls. Many applications would benefit from a three dimensional representation. For example the structure of a complex web site could be better presented in three dimensions than two. The availability of VRML and other technologies is likely to make three dimensional interfaces increasingly common.

    buttons.gif

    An illusion of depth.

    Interestingly it is purely a matter of convention and practice that makes us imagine the light source as at the top-left and see the top button as sticking out and the bottom button as sticking in[4]. You can also see them the other way around if you try.

    Layout is an important feature of an interface. Western users will tend to scan a screen as if they were reading a page, starting from the top-left. Scanning can be made easier by aligning controls in rows. Complex displays can be made easier to scan by adding additional cues, for example a timesheet could have a thicker line denoting the end of each week.

    Both layout and similarity can be used to group items on an interface.

    grouping.gif

    In a) the shapes are perceived as 3 rows, while in b) they are perceived as 3 columns, due to proximity. In c) the shapes are perceived as 3 columns, due to similarity. d) gives a mixed message.

    A colour is perceived according to how strongly it activates the red, green and blue cone receptors in our eyes. From this we perceive its intensity (how bright it is), its hue (the dominant wavelength) and saturation (how wide a range of wavelengths make it up). Within the 400-700 nanometre visible range we can distinguish wavelengths 2 nanometres apart. Combined with differing levels of intensity and saturation the estimated number of colours we can discriminate is 7,000,000. According to the US National Bureau of Standards there are some 7,500 colours with names. But colour should be used sparingly in interfaces. I once worked on an application where a very extrovert student with dubious taste (as evidenced by his choice of ties) had designed the user interface. Each major type of window had a different lurid background colour. This was presumably to make it easy to tell them apart, but the overall effect was highly distracting.
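    Programmers usually meet colours as RGB values, but they can be converted to something closer to the hue, saturation and intensity described above. The sketch below uses the HSV model from Python’s standard `colorsys` module. HSV is only a rough stand-in for perception (its ‘value’ is not the same as perceived brightness), and the helper `describe_colour` is my own.

```python
import colorsys


def describe_colour(red, green, blue):
    """Convert 0-255 RGB values to hue (degrees), saturation and value."""
    hue, saturation, value = colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)
    return {
        "hue_degrees": round(hue * 360),
        "saturation": round(saturation, 2),
        "value": round(value, 2),
    }


# Number of 2 nanometre steps across the 400-700 nanometre visible range.
distinguishable_wavelengths = (700 - 400) // 2

print(describe_colour(255, 255, 0))      # saturated yellow
print(describe_colour(255, 128, 128))    # a washed-out red
print(distinguishable_wavelengths)
```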

    Colour perception, like everything else to do with perception, is complex. Experiments show that how we perceive a colour depends on the other colours surrounding it. If you look through a pinhole at a sheet of green or red paper it doesn’t appear to have a very strong colour. But if you put the sheets next to each other and look at them both through the pinhole the colours appear much stronger. So if you want to make a colour highly visible, put it next to a complementary colour, for example yellow is perceived by red and green cone cells, so to make it more visible put it next to an area of saturated blue.
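    On an RGB display a crude complement can be found by inverting each channel, which reproduces the yellow and blue pairing above. This is the RGB complement only; it is an approximation and not a model of how the cone cells respond. The function name is my own.

```python
def rgb_complement(rgb):
    """Return the RGB complement of a colour given as 0-255 channel values."""
    return tuple(255 - channel for channel in rgb)


yellow = (255, 255, 0)
print(rgb_complement(yellow))  # saturated blue
```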

    Colour can be used with text and symbols to add information without making them less legible, as long as a careful choice of colours is used. Some combinations of colours work better than others. Saturated blue appears dimmer to the human eye than other saturated colours and is more difficult to focus on. Blue symbols and text are therefore probably best avoided. However, for the same reasons, blue can make a background that is easy on the eye. Saturated yellow appears brighter than all the other colours for the same intensity.

    colours1.gif

    Ill-advised colour combinations.

    colours2.gif

    Better colour combinations.
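    The relative dimness of blue and brightness of yellow can be put into numbers. The sketch below uses the relative luminance and contrast ratio formulas from the W3C’s Web Content Accessibility Guidelines, which I have brought in for illustration and which are not the basis of the figures above. In these formulas blue contributes only about 7% of the luminance of white.

```python
def _linearise(channel):
    """Convert one 0-255 sRGB channel to a linear light value in 0..1."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance: 0.0 for black, 1.0 for white."""
    red, green, blue = (_linearise(c) for c in rgb)
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(rgb_a, rgb_b):
    """WCAG contrast ratio between two colours, from 1 (none) to 21."""
    lum_a, lum_b = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


BLACK, BLUE, YELLOW = (0, 0, 0), (0, 0, 255), (255, 255, 0)
print(f"blue on black:   {contrast_ratio(BLUE, BLACK):.1f}")
print(f"yellow on black: {contrast_ratio(YELLOW, BLACK):.1f}")
```

    Saturated blue text on a black background scores very poorly, while yellow on black scores almost as well as white on black.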

    Designers should remember that a significant proportion of the population has deficient colour vision (some 6% of males and 0.4% of females, the difference being due to the way the defective gene is inherited). This is caused by problems with pigmentation in one or more of the red, green and blue cone cells in the eye. While there is a range of different types of colour deficiency the most common is the inability to distinguish between red and green. This raises some questions about the design of traffic lights (some colour-deficient drivers have to rely on the position, rather than the colour, of the lights). Some individuals may not be able to distinguish one or more primary colours from grey, so it is unwise to put a primary colour on a dark background. Allowing users to customise colours goes some way to alleviating this problem.

    Other forms of vision defect are also common, as evidenced by the number of people wearing glasses. Something that is easily visible on the programmer’s 17 inch screen may be almost impossible to read on a user’s LCD laptop screen. This problem is further compounded by the fact that eyesight deteriorates with age and programmers tend to be younger on average than users. There also seems to be a tendency to use ever smaller fonts even though screen sizes are increasing. Perhaps this is based on the assumption that large fonts make things look childish and unsophisticated, so small fonts must look professional. Ideally the user should be able to customise screen resolution and font sizes.

    Meaning can sometimes be conveyed with colour, for example a temperature scale may be graded from blue (cold) to red (hot) as this has obvious physical parallels. But the meaning of colour can be very culturally dependent. For example, red is often used to imply danger in the west, but this does not necessarily carry over into other cultures. The relative commonness of defective colour vision and the limited ability of users to attach meaning to colour means that it should be used as an additional cue, and should not be relied on as the primary means of conveying information. Furthermore colour won’t be visible on a monochrome display (now relatively rare) or a monochrome printer (still very common).
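    A blue-to-red temperature scale that keeps colour as a secondary cue might look like the sketch below. The reading is always returned as text, with the colour alongside it. The function names, the linear blend and the 0-100 default range are assumptions made for this example.

```python
def temperature_colour(value, coldest, hottest):
    """Blend linearly from blue (coldest) to red (hottest); returns 0-255 RGB."""
    if hottest <= coldest:
        raise ValueError("hottest must be greater than coldest")
    # Clamp out-of-range readings to the ends of the scale.
    position = min(max((value - coldest) / (hottest - coldest), 0.0), 1.0)
    red = round(255 * position)
    return (red, 0, 255 - red)


def describe_reading(value, coldest=0.0, hottest=100.0):
    """Return the text label first, with the colour as an additional cue."""
    label = f"{value:.1f} °C"
    return label, temperature_colour(value, coldest, hottest)


print(describe_reading(37.0))
```

    Because the label carries the information on its own, nothing is lost for a colour-deficient user or on a monochrome printout.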

    Humans are good at recognising patterns, making creative decisions and filtering huge amounts of information. Humans are not so good at arithmetic, juggling lots of things at once and committing them to long-term memory. Computers are the opposite. A good interface design should reflect the respective strengths and weaknesses of human and computer. Just as a well crafted graphical user interface will minimise the amount of machine resources required to run it, it should also minimise the amount of brain resources required to use it, leaving as much brain capacity as possible for the user to solve their actual problem.

    [1] After “Psychology”, 2nd Ed, C.Wade and C.Tavris.

    [2] However it can be seen under certain conditions. Close one eye and look through a pinhole in a piece of card at a well illuminated sheet of white paper. If you waggle the card from side to side you start to see the network of blood vessels.

    [3] “Guidelines for Designing User Interface Software” by Smith and Mosier (1986). Several hundred pages of guidelines for user interface design. They betray their 80’s US Air Force sponsored origins in places, but are still excellent. For the dedicated.

    [4] I have since found out that this may not be true. Our brains appear to be hardwired to assume that the lighting comes from above. For more details see: “Mind Hacks” T.Stafford & M.Webb (2005).
