Monthly Archives: July 2008

Early registration for ESWC2008 closing

David Boventer has just reminded me that early bird registration for ESWC 2008 ends July 31st. I probably won’t make it this year due to other commitments. But I have been the last two years and highly recommend it to any microISVs that can make it to Berlin for 8/9 November. Hurry up – you haven’t got much time to get the early bird rate!

Early home of computing falling into disrepair

[Image: Bletchley Park, from Wikipedia]

Bletchley Park is a location of huge significance in the history of both the UK and the IT industry worldwide. It was at Bletchley that British codebreakers[1], including early computer science genius Alan Turing, broke the ‘unbreakable’ Nazi Enigma code during WWII. As part of this work they designed and built Colossus, arguably the first programmable electronic computer.

The breaking of the Enigma code had a huge impact on the war. Many historians believe it shortened the war significantly and saved many lives (on the winning side, at least). But the codebreakers’ huge achievements were kept secret for many years after the war, receiving no public recognition. Turing himself committed suicide after shameful treatment at the hands of the British government.

Now, to compound the neglect, Bletchley has been left to fall into an increasing state of disrepair due to a lack of funding. The site was only just saved from property developers in 1991. Things have now got sufficiently bad that 97 prominent IT experts and computer scientists have written a letter to The Times this month condemning its state of disrepair. The Bletchley Park Trust are doing the best they can, but receive no public funding.

“We are just about surviving. Money—or lack of it—is our big problem here. I think we have two to three more years of survival, but we need this time to find a solution to this.” (Simon Greenish, Director of the Bletchley Park Trust)

It would be a tragedy if such a historic site was not saved. So what can you do? If you are a UK citizen you can:

and anyone can:

I hope the government will wake up to the fact that we are losing a site of national and international historical importance. Let’s hope they don’t leave it too late to act.

** Interesting trivia **

The ability to solve The Daily Telegraph crossword in under 12 minutes was used as a recruitment test for codebreakers. The newspaper was asked to organise a crossword competition, after which each of the successful participants was contacted and asked if they would be prepared to undertake “a particular type of work as a contribution to the war effort”. (from Wikipedia)

[1] Building on important earlier work by Polish codebreakers.

A mathematical digression

I need some help with a mathematical problem. A geometry problem to be specific.

Congratulations on reading this far. One of the features of PerfectTablePlan is the drawing of tables and seats on scale floor plans. The user can optionally specify how many seats there are and let the software calculate a sensible table size so that all the seats touch the table and their neighbouring seats. This saves the user time and produces tidy looking floor plans. Calculating the table size is trivial for square or rectangular tables. It is a bit more complicated for circular tables. But, after a bit of head scratching, I managed to work it out. Placing the seats around the circle is then trivial.
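
For reference, the circular case works out as follows (my reconstruction, not necessarily the exact calculation used in PerfectTablePlan). If N seats of diameter D all touch a circular table of diameter T and also touch both their neighbours, the seat centres lie on a circle of diameter T+D and adjacent centres are exactly D apart. The distance between adjacent centres is the chord (T+D)*sin(PI/N), so:

    (T + D) * sin( PI / N ) = D   =>   T = D * ( 1 / sin( PI / N ) - 1 )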

But my customers keep asking for oval (elliptical) tables, with that callous disregard customers have for how difficult a problem might be[1]. Here is the problem.

We have an ellipse E with axes A and B surrounded by N identical circles of diameter D. Each circle touches the ellipse and each of its 2 neighbours at one point, as shown above. Given N, D and the ratio A/B what is A? Given A, what is the angle THETA subtended by the centre of each circle to the centre of E?

I doubt there is an exact analytic solution to this problem. I have some vague ideas about how to tackle it. I can work out the approximate circumference C’ of an ellipse E’ that passes through the centres of all the seats (axes A+D and B+D) using the formula derived by Indian mathematical prodigy Ramanujan.

From this we should be able to work back to A. As N becomes large C’ will tend to N*D. For smaller N, C’ will diverge from N*D so we might have to use an iterative method[2] to calculate A, but we can use the approach above to get a starting value for A and then iterate numerically from there.
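
To make that concrete, here is a rough C++ sketch of the starting-value calculation (the function names are mine and it only produces the initial estimate of A; the refinement step and the special-casing of small N are left out):

    #include <cmath>

    const double PI = 3.14159265358979323846;

    // Ramanujan's approximation to the circumference of an ellipse
    // with SEMI-axes a and b.
    double ellipseCircumference( double a, double b )
    {
        return PI * ( 3.0 * ( a + b ) - std::sqrt( ( 3.0 * a + b ) * ( a + 3.0 * b ) ) );
    }

    // Starting estimate for the full axis A: choose A (with B = A / ratio)
    // so that the ellipse through the seat centres (axes A+D and B+D) has
    // circumference approximately N*D. A simple bisection is used here;
    // Newton-Raphson would converge faster.
    double estimateAxisA( int n, double d, double ratio )
    {
        double lo = 0.0;
        double hi = n * d;    // generous upper bound
        for ( int i = 0; i < 60; ++i )
        {
            double a = 0.5 * ( lo + hi );
            double b = a / ratio;
            double c = ellipseCircumference( ( a + d ) / 2.0, ( b + d ) / 2.0 );
            if ( c < n * d )
                lo = a;       // circumference too small: need a bigger table
            else
                hi = a;
        }
        return 0.5 * ( lo + hi );
    }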

I am less sure about how to work out the angle THETA for each circle. But if we pre-compute the angles of, say, 100 equally spaced points around E we could use these to interpolate the positions of N circles where N <= 100. It might be OK to place the first circle at THETA=0 for all values of N>0, but I’m not sure.
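
And here is a similarly rough sketch of that interpolation idea for THETA (again the names are mine; it samples the centre ellipse finely, accumulates arc length and then picks, for each of the N circles, the sample angle closest to the required equal arc-length spacing). It re-uses the PI constant from the previous sketch:

    #include <cmath>
    #include <vector>

    // Angles (in radians) of n points approximately equally spaced by arc
    // length around an ellipse with SEMI-axes a and b. The first point is
    // placed at THETA = 0.
    std::vector<double> equallySpacedAngles( double a, double b, int n )
    {
        const int samples = 10000;    // size of the pre-computed table
        std::vector<double> theta( samples + 1 );
        std::vector<double> arc( samples + 1 );
        theta[0] = 0.0;
        arc[0] = 0.0;
        double prevX = a;
        double prevY = 0.0;
        for ( int i = 1; i <= samples; ++i )
        {
            theta[i] = 2.0 * PI * i / samples;
            double x = a * std::cos( theta[i] );
            double y = b * std::sin( theta[i] );
            double dx = x - prevX;
            double dy = y - prevY;
            arc[i] = arc[i-1] + std::sqrt( dx * dx + dy * dy );
            prevX = x;
            prevY = y;
        }
        std::vector<double> result;
        int j = 0;
        for ( int k = 0; k < n; ++k )
        {
            double target = arc[samples] * k / n;    // k-th equal arc-length step
            while ( arc[j] < target )                // arc[] increases monotonically
                ++j;
            result.push_back( theta[j] );            // nearest sample looks fine to the eye
        }
        return result;
    }

For the actual seat positions the a and b passed in would be the semi-axes of the centre ellipse, i.e. (A+D)/2 and (B+D)/2.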

Several hours on Google didn’t turn up a solution. Surely I am not the only person to have tackled this problem in human history? Can anyone point me at a workable solution? Preferably with code.

Alternatively can somebody write me the code to solve the problem? Maybe there is someone out there with a mathematical background that would relish the challenge? I am prepared to pay for working code that I can use in PerfectTablePlan (a few hundred dollars, negotiable).

  1. To simplify things we can assume a fixed value of A/B, say 1.5.
  2. It needs to work for N from 1 to 99.
  3. The solution doesn’t need to be exact, but it has to look OK to the human eye. No overlaps and no big gaps.
  4. Low values of N might need to be set up as special cases. E.g. it isn’t possible to get all the circles to touch if N <= 6 (and possibly higher values of N, depending on A/B).
  5. The solution must be returned in a reasonable time, ideally under 0.001 seconds and definitely less than 0.01 seconds. It can store pre-computed values, e.g. in an array. But it mustn’t require excessive memory.
  6. The code needs to be in a form I can easily convert to C++. C, Java, BASIC or Python should be fine. Haskell not so much.
  7. Ideally it should come with a simple GUI that allows me to set the value of N and D and see the result visually.

If you want to be paid I need to be able to buy all rights to the code and it mustn’t be released into the public domain (i.e. don’t post the code on this blog). In the unlikely event I get more than one set of working code, I will pay for the best solution according to the above criteria. Contact me for more details.

[1] I love them all really.

[2] For example Newton-Raphson.

**** UPDATE ****

See http://successfulsoftware.net/2008/08/25/a-mathematical-digression-revisited/ .

PayPal vs GoogleCheckout revisited

I wrote back in December 2007 that 70% of my customers prefer PayPal over GoogleCheckout, given the choice. I re-checked the figures today to see if GoogleCheckout was gaining traction with my customers. It isn’t.

% of UK customers[1] choosing PayPal vs GoogleCheckout by month

I’m glad. Despite PayPal’s recent flakiness (since improved) and higher transaction fees[2], I still prefer them as a payment processor, because of Google’s confidential email option (which is a pain in the butt for support), lack of multi-currency support, chargeback fees and slow processing of many orders. It is useful to have an alternative to PayPal though.

[1] GoogleCheckout only lets me accept payment in pounds sterling, so I only offer it to UK customers.

[2] For a £19.95 transaction PayPal charges me £0.68 and GoogleCheckout charges me £0.45. But Google currently refunds transaction fees for 10x my adwords spend, meaning I don’t pay any transaction fees at all to Google in a typical month.

Consulting testimonial: Tudumo

After I did some consulting for Richard Watson of Tudumo.com he was kind enough to send me this testimonial:

Once I’d finished with the major part of Tudumo development, I got to a point where I needed to take stock of the situation. Rather than making every mistake myself, I thought it would be much better to hire Andy to take a look at my application and essentially ask him “if this was yours, what would you do now?”

It turns out that’s exactly what his approach to your business is. We had a couple of phone discussions after which he scoured my approach and website, applying his experience to my situation.

I was left with a six-page action list, which serves me in a number of ways:
1) It validates what I was doing right
2) Points me in some new directions
3) Gives me an actionable set of tasks which serve as a periodic reminder of which tasks will give me most benefit.
4) A few Andy-only tricks that I hope my competitors don’t get!

In fact, forget it – big waste of time. ;0)

If you are looking for a simple and slick TODO list application I recommend you take a look at Tudumo.

Could your business use an independent and experienced perspective?

Using defence in depth to produce high quality software

‘Defence in depth’ is a military strategy where the attacker is allowed to penetrate the defender’s lines, but is then gradually worn down by successive layers of defences. This strategy was famously used by the Soviet Army to halt the German blitzkrieg at the battle of Kursk, using a vast defensive network including trenches, minefields and gun emplacements. Defence in depth also has parallels in non-military applications. I use a defence in depth approach to detect bugs in my code. A bug has to pass through multiple layers of defences undetected before it can cause problems for my customers.

Layer 1: Compiler warnings

Compiler warnings can help to spot many potential bugs. Crank your compiler warnings up to maximum sensitivity (for example -Wall -Wextra with the GNU compiler, or /W4 with Visual C++) to get the most benefit.

Layer 2: Static analysis

Static analysis takes over where compiler warnings leave off, examining code in great detail looking for potential errors. An example static analyser is Gimpel PC-Lint for C and C++. PC-Lint performs hundreds of checks for known issues in C/C++ code. The flip side of its thoroughness is that it can be difficult to spot real issues amongst the vast numbers of warnings, and it can take some time to fine-tune the checking to a useful level.
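
As an illustration, a fragment like the following compiles without complaint on most compilers, but a static analyser should warn that index can be returned without ever having been initialised:

    // Compiles cleanly, but if no negative value is found then 'index'
    // is returned uninitialised - the sort of latent bug a static
    // analyser will flag.
    int findFirstNegative( const int* values, int count )
    {
        int index;
        for ( int i = 0; i < count; ++i )
        {
            if ( values[i] < 0 )
            {
                index = i;
                break;
            }
        }
        return index;
    }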

Layer 3: Code review

A fresh set of eyes looking at your code will often spot problems that you didn’t see. There are various ways to go about this, including formal Fagan inspections, Extreme Programming style pair programming and informal reviews. There is quite a lot of documented evidence to suggest that this is one of the most effective ways to find bugs. It is also an excellent way to mentor less experienced programmers. But it is time consuming and can be hard on the ego of the person being reviewed. Also it isn’t really an option for solo developers.

Layer 4: Self-checking

Of the vast space of states that a program can occupy, usually only a minority will be valid. For example, it makes no sense to set a zero or negative radius for a circle. We can check for invalid states in C/C++ with an assert() macro:

#include <cassert>

class Circle
{
    public:
        void setRadius( double radius );
    private:
        double m_radius;
};

void Circle::setRadius( double radius )
{
    assert( radius > 0.0 );
    m_radius = radius;
}

The program will now halt with a warning message if the radius is set inappropriately. This can be very helpful for finding bugs during testing. Assertions can also be useful for checking pre-conditions and post-conditions:

    void List::remove( Item* i )
    {
        assert( contains( i ) );
        ...
        assert( !contains( i ) );
    }

Or detecting when an unexpected branch is executed:

    switch ( shape )
    {
        case Shape::Square:
            ...
        break;

        case Shape::Rectangle:
            ...
        break;

        case Shape::Circle:
            ...
        break;

        case Shape::Ellipse:
            ...
        break;

        default:
            assert( false ); // shouldn't get here
        break;
    }

Assertions are not compiled into release versions of the software (the assert() macro expands to nothing when NDEBUG is defined, as it typically is for a release build), which means they don’t incur any overhead in production code. But this also means:

  • Assertions are not a substitute for proper error handling. They should only be used to check for states that should never occur, regardless of the program input.
  • Calls to an assert() must not change the program state, or the debug and release versions will behave differently.

Different languages have different approaches, for example pre- and post-conditions are built into the Eiffel language.

Layer 5: Dynamic analysis

Dynamic checking usually involves automatically instrumenting the code in some way so that its runtime behaviour can be checked for potential problems such as: array bound violations, reading memory that hasn’t been written to and memory leaks. An example dynamic analyser is the excellent and free Valgrind for Linux. There are a few dynamic analysers for Windows, but they tend to be expensive. The only one I have tried in the last few years was Purify and it was flaky (do IBM/Rational actually use their own tools?).
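
For example, the following (deliberately broken) program compiles cleanly and may even appear to run correctly, but a dynamic analyser such as Valgrind will report both the out-of-bounds read and the leak:

    int main()
    {
        int* buffer = new int[10];
        int last = buffer[10];      // reads one element past the end of the array
        delete[] buffer;

        int* leaked = new int[100]; // never deleted: memory leak
        leaked[0] = last;

        return 0;
    }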

Layer 6: Unit testing

Unit testing requires the creation of a test harness to execute various tests on a small unit of code (typically a class or function) and flag any errors. Ideally the unit tests should then be executed every time you make a change to the code. You can write your own test harnesses from scratch, but it probably makes more sense to use one of the existing frameworks, such as: NUnit (.NET), JUnit (Java), QTestLib (Qt) etc.
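
As a minimal illustration, a unit test is just code that exercises a unit and reports failures. This hand-rolled example tests the Circle class from layer 4 (and assumes it has been given a radius() accessor, which the earlier snippet doesn’t show); a framework adds fixtures, better reporting and test discovery on top of this basic idea:

    #include <iostream>

    // Returns true if the test passed.
    bool testSetRadius()
    {
        Circle c;
        c.setRadius( 2.5 );
        return c.radius() == 2.5;   // radius() accessor assumed to exist
    }

    int main()
    {
        int failures = 0;
        if ( !testSetRadius() )
        {
            std::cout << "testSetRadius FAILED" << std::endl;
            ++failures;
        }
        return failures;    // non-zero exit code if anything failed
    }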

According to the Test Driven Development approach you should write your unit tests before you write the code. This makes a lot of sense, but requires discipline.

Layer 7: Integration testing

Integration testing involves testing that different modules of the system work correctly together, particularly the interfaces between your code and hardware or third party libraries.

Layer 8: System testing

System testing is testing the system in its entirety, as delivered to the end-user. System testing can be done manually or automatically, using a test scripting tool.

Unit, integration and system testing should ideally be done using a coverage tool such as Coverage Validator to check that the testing is sufficiently thorough.

Layer 9: Regression testing

Regression testing involves running a series of tests and comparing the results to the same input data run on the previous release of the system. Any differences may be the result of bugs introduced since the last release. Regression testing works particularly well on systems that take a single input file and produce a single output file – the output file can just be diff’ed against the previous output.

Layer 10: Third party testing

Different users have different patterns of usage. You might prefer drag and drop, someone else might use right-click a lot and yet another person might prefer keyboard accelerators. So it would be unwise to release a system that has only ever been tested by the developer. Furthermore, the developer inevitably makes all sorts of assumptions about how the software will be used. Some of those assumptions will almost certainly be wrong.

There are a number of companies that can be paid by the day to do third party testing. I have used softwareexaminer.com in the past with some success.

Layer 11: Beta testing

End-user systems can vary in processor speed, memory, screen resolution, video card, font size, language choice, operating system version/update level and installed software. So it is necessary to test your software on a representative range of supported hardware + operating system + installed software. Typically this is done by recruiting users who are keen to try out new features, for example through a newsletter. Unfortunately it isn’t always easy to get good feedback from beta testers.

Layer 12: Crash reporting

If each of the above 11 layers of defence catches 50% of the bugs missed by the previous layer, we would expect only 1 bug in 2,048 to make it into production code undetected. Assuming your coding isn’t spectacularly sloppy in the first place, you should end up with very few bugs in your production code. But, inevitably, some will still slip through. You can catch the ones that crash your software with built-in crash reporting. This is less than ideal for the person whose software crashed. But it allows you to get detailed feedback on crashes and consequently get fixes out much faster.

I rolled my own crash reporting for Windows and MacOSX. On Windows the magic function call is SetUnhandledExceptionFilter. You can also sign up to the Windows Winqual program to receive crash reports via Windows’ own crash reporting. But, after my deeply demoralising encounter with Winqual as part of getting the “works with Vista” logo, I would rather take dance lessons from Steve Ballmer.
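
For illustration only, the bare bones on Windows look something like this (this is not the actual PerfectTablePlan implementation; a real handler would write a minidump or stack trace and ask the user’s permission before sending anything):

    #include <windows.h>

    LONG WINAPI crashHandler( EXCEPTION_POINTERS* info )
    {
        // Write a crash report here (exception code, stack trace, log file)
        // and offer to submit it to your server.
        ( void ) info;
        return EXCEPTION_EXECUTE_HANDLER;   // let the process terminate afterwards
    }

    int main()
    {
        SetUnhandledExceptionFilter( crashHandler );
        // ... run the application as normal ...
        return 0;
    }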

Test what you ship, ship what you test

A change of a single byte in your binaries could be the difference between a solid release and a release with a showstopper bug. Consequently you should only ship the binaries you have tested. Don’t ship the release version after only having tested the debug version and don’t ship your software after a bug fix without re-doing the QA, no matter how ‘trivial’ the fix. Sometimes it is better to ship with minor (but known) bugs than to try to fix these bugs and risk introducing new (and potentially much worse) bugs.

Cross-platform development

I find that shipping my software on Windows and MacOSX from a single code base has advantages for QA.

  • different tools with different strengths are available on each platform
  • the GNU C++ compiler may warn about issues that the Visual Studio C++ compiler doesn’t (and vice versa)
  • a memory error that is intermittent and hard to track down on Windows might be much easier to find on MacOSX (and vice versa)

Conclusion

For the best results you need your layers of checks to be part of your day-to-day development, not something you do just before a release. This is best done by automating them as much as possible, e.g.:

  • setting the compiler to treat warnings as errors
  • performing static analysis and unit tests on code check-in
  • running regression tests on the latest version of the code every night

Also you should design your software in such a way that it is easy to test. E.g. building in log file output can make it much easier to perform regression tests.

Defence in depth can find a high percentage of bugs. But obviously the more bugs you start with the more bugs that will end up in your code. So it doesn’t remove the need for good coding practices. Quality can’t be ‘tested in’ to code afterwards.

I have used all 12 layers of defence above at some point in my career. Currently I am not using static analysis (I must update that PC-Lint licence), code review (I am a solo developer) and dynamic analysis (I don’t currently have a dynamic analyser for Windows or MacOSX). I could also do better on unit testing. But according to my crash reporting, the latest version of PerfectTablePlan has crashed just three times in the last 5000+ downloads (the same bug each time, somewhere deep down in the Qt print engine). Not all customers click the ‘Submit’ button to send the crash reports and crashes aren’t the only type of bug, but I think this is indicative of a good level of quality. It is probably a lot better than most of the other consumer software my customers use[1]. Assuming the crash reporting isn’t buggy, of course…

[1]Windows Explorer and Microsoft Office crash on a daily basis on my current machine.

Microsoft adCenter over-reporting conversions

I have long suspected that Microsoft adCenter is over-reporting conversions. Here is the confirmation from my adCenter reporting:

I am guessing that the purchaser visited the ‘thank you for your purchase’ page (which contains the conversion tracking script) 5 times, for whatever reason. I can’t think of any other way this situation could occur – the conversion tracking isn’t set up to take account of multiple purchases in one transaction. How difficult would it be to only count the first visit? Google can do it.

Being cynical, perhaps the over-reporting suits Microsoft? But it makes it much more difficult for me to assess the real effectiveness of keywords and ads. Another good reason to concentrate my efforts on Google AdWords instead.