Monthly Archives: December 2007

The curse of the 419 scam

I expect that anyone reading this blog has had hundreds, if not thousands, of emails like this:

NIGERIAN NATIONAL PETROLEUM CORPORATION
CORPORATE HEADQUARTERS,LAGOS
STRICTLY CONFIDENTIAL.
FROM THE DESK OF: DR WALI AHMED .
LAGOS-NIGERIA

Dear Sir/Madam ,

After due deliberation with my colleagues, I decided to forward this proposal. We want a reliable person who could assist us to transfer the sum of Thirty Million United States Dollars (US$30,000,000.00) into his/her account.

..etc

Or

My father (Late) DR EDWARD HOSANNA the former Deputy Minister of Finance under the executive civilian president of Liberia, but was assasinated by the rebels during the civil war and properties destroyed, but I narrowly escaped with some very important documents of (US$7.5M) Seven Point Five Million U.S Dollars deposited by my late father in a high financial company here in Dakar-Senegal under my name as next of kin.

..etc

(picking two at random from my inbox).

Needless to say, it is a scam. Basically, they ask for a sum of money (e.g. to bribe an official) with the promise that you will get a much larger sum once the transaction is complete. But each payment results in a request for a larger payment, until you run out of money. It is often known as the ‘Nigerian 419 scam’, as many of these emails originate from Nigeria and they are an offence under section 419 of the Nigerian criminal code. It is a variant on the Spanish Prisoner scam, which dates back at least to the 19th century. It is hard to believe anyone would fall for this scam, but they do in their thousands. Advance fee fraud (e.g. 419 scams) was estimated to cost at least £275 million in 2005 in the UK alone, with average individual loss to victims over £31,000. Greed and stupidity are indeed a dangerous combination.

So what can we do to hit back? Not a great deal. The governments of poor and corrupt countries probably don’t care that much about gullible and greedy westerners being cheated. In fact, the scam may be in their interests. We can report the scammer’s email to their ISP, in the hope that it will be shut down before they can con anyone. But they can easily get another. We can play along a bit and waste the scammer’s time (perhaps with highly entertaining results). But it wastes our time as well. However, another idea occurred to me while listening to a BBC radio report from Nigeria.

Apparently many Nigerians are incredibly superstitious, even the highly educated ones. Rumours of penis stealing witches and killer phone numbers are taken very seriously. So now I occasionally respond to 419 emails with a curse email from a little used email account. My email starts with some impressive looking pig Latin. It then tells them that reading the above has activated a curse and that they will suffer increasingly bad headaches until they renounce their wicked ways. If they are sufficiently superstitious I figure this might be enough to start a headache, which will get worse the more they worry about it. Hopefully they will either find an honest way to make a living or a psychosomatic feedback loop will cause their head to explode like a scene from the film Scanners (warning: very gory). I have no idea if it works – but I think it is worth a try. I haven’t included the text of my email as I don’t want it to appear on Google. Make up your own curse. Be inventive.

Site uptime monitoring

My PerfectTablePlan website and all the associated websites (such as http://www.weddingtableplan.com) have gone down three times this week. Sigh. The first time they went down for 5 hours, the second time for 3.5 hours and the third time for 6 hours. Perhaps they have been overdoing the festivities at my ISP, 1and1.co.uk. I am not impressed. Somebody needs a good kick up the arse.

To rub salt into the wound they even had the cheek to put up parking pages with ads in place of my sites. Some of these ads were for my own sites, which also displayed parking pages  – with more ads. So 1and1 were potentially taking money off me through adwords at the same time! I have a feeling 1and1 and I may be parting company in the not too distant future.

I am now setting up a copy of my site with a different ISP. If (when?) the site goes down again I should be able to point the DNS to the backup ISP.

At least I found out about it quickly thanks to the site monitoring service www.siteuptime.com. I use their free service, which is adequate for my needs at present. Are you monitoring your site(s)? Do you have a back-up plan?
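For anyone who wants a rough do-it-yourself check alongside a hosted service, here is a minimal Python sketch. It is not how siteuptime.com works (I have no idea how they do it). The function name `site_is_up` and the `must_contain` parameter are made up for illustration. The check looks for a phrase that only the real page contains, because a parking page with ads can still answer with a perfectly healthy HTTP status.

```python
import urllib.request


def site_is_up(url, must_contain, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the page loads and its body contains the expected phrase.

    Checking for a phrase (an assumption of this sketch) guards against
    parking pages, which load successfully but are not the real site.
    `opener` is injectable so the function can be tested without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError subclasses.
        return False
    return must_contain in body
```

Run from a scheduled job on a machine that is not hosted by the same ISP, something like `site_is_up("http://www.example.com", "a phrase from the real home page")` returning False would be the cue to send an alert or switch the DNS to the backup.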

Installing MacOSX 10.5 (Leopard) on an external harddisk

I need to support both MacOSX 10.4 (Tiger) and 10.5 (Leopard) for the latest release of PerfectTablePlan. I could have created a new partition on the current harddisk for 10.5, but apparently you can’t do that without erasing the whole disk. I really didn’t want to mess with my existing 10.4 setup, so I purchased a 320GB WD MyBook USB/Firewire external harddisk to install 10.5 on to. 320GB for £75, bargain! But I had quite a bit of trouble installing Leopard on to it. After about the tenth time looking at a “Mac OS X could not be installed on your computer. The installer cannot prepare the volume for installation.” message I finally got it working. In case anyone else gets stuck, here are some hints:

  • When you set up the new harddisk partitions using Disk Utility make sure you choose Apple Partition Map using the Options button (it may be set to Master Boot Record if the disk is shipped set-up for Windows).
  • Disconnect the harddisk USB cable. Just use the Firewire cable.

I hope this saves someone else a few hours. Thanks to Jeff B for a hint that got me moving in the right direction.

Optimising your application

When I first released PerfectTablePlan I considered 50-200 guests as a typical event size, with 500+ guests a large event. But my customers have been using the software for ever larger events, with some as large as 3000 guests. While the software could cope with this number of guests, it wasn’t very responsive. In particular the genetic algorithm I use to optimise seating arrangements (which seats people together or apart, depending on their preferences) required running for at least an hour for the largest plans. This is hardly surprising when you consider that seating assignment is a combinatorial problem in the same NP-hard class as the notorious travelling salesman problem. The number of seating combinations for 1000 guests in 1000 seats is 1000!, which is a number with 2,568 digits. Even the number of seating combinations for just 60 guests is more than the number of atoms in the known universe. But customers really don’t care about how mathematically intractable a problem is. They just want it solved. Now. Or at least by the time they get back from their coffee. So I made a serious effort to optimise the performance in the latest release, particularly for the automatic seat assignment. Here are the results:

ptp308_vs_ptp_310.png

Total time taken to automatically assign seats in 128 sample table plans varying in size from 0 to 1500 guests

The chart shows that the new version automatically assigns seats more than 5 times faster over a wide range of table plans. The median improvement in speed is 66%, but the largest plans were solved over ten times faster. How did I do it? Mostly by straightening out a few kinks.
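As a brief aside, the combinatorial figures quoted earlier (the number of digits in 1000! and the size of 60! relative to the number of atoms in the universe) are easy to check. The Python sketch below assumes the commonly quoted rough estimate of 10^80 atoms in the observable universe. That figure is an assumption of the sketch, not a precise measurement.

```python
import math


def digits_in_factorial(n):
    """Number of decimal digits in n! (the number of ways to seat n guests in n seats)."""
    return len(str(math.factorial(n)))


# Rough, commonly quoted estimate of atoms in the observable universe (assumption).
ATOMS_IN_UNIVERSE = 10 ** 80

print(digits_in_factorial(1000))                 # 2568
print(math.factorial(60) > ATOMS_IN_UNIVERSE)    # True
```

No amount of kink-straightening makes it possible to try every combination, which is why a heuristic such as a genetic algorithm is used in the first place.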

Some years ago I purchased my first dishwasher. I was really excited about being freed from the unspeakable tyranny of having to wash dishes by hand (bear with me). I installed it myself – how hard could it be? It took 10 hours to do a wash cycle. Convinced that the dishwasher was faulty I called the manufacturer. They sent out an engineer who quickly spotted that I had kinked the water inlet pipe as I had pushed the dishwasher into place. It was taking at least 9 hours to get enough water to start the cycle. Oops. As soon as the kink was straightened it worked perfectly, completing a cycle in less than an hour. Speeding up software is rather similar – you just need to straighten out the kinks. The trick is knowing where the kinks are. Experience has taught me that it is pretty much impossible to guess where the performance bottlenecks are in any non-trivial piece of software. You have to measure it using a profiler.

Unfortunately Visual Studio 2005 Standard doesn’t seem to include profiling tools. You have to pay for one of the more expensive versions of Visual Studio to get a profiler. This seems rather mean. But then again I was given a copy of VS2005 Standard for free by some nice Microsofties – after I had spent 10 minutes berating them on the awfulness of their “works with vista” program (shudder). So I used an evaluation version of LTProf. LTProf samples your running application a number of times per second, works out which line and function is being executed and uses this to build up a picture of where the program is spending most time.
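To show the “measure, don’t guess” idea in miniature, here is a small Python sketch. It uses Python’s built-in cProfile, which instruments every function call rather than sampling the running program as LTProf does, so it illustrates the principle rather than LTProf itself. The functions `read_setting` and `score_plan` are hypothetical stand-ins, not PerfectTablePlan code.

```python
import cProfile
import io
import pstats
import time


def read_setting():
    """Hypothetical stand-in for a slow lookup (e.g. a registry or disk read)."""
    time.sleep(0.001)
    return 3


def score_plan(guests):
    """Toy scoring loop with the slow lookup hidden one level down."""
    total = 0
    for guest in guests:
        total += guest * read_setting()
    return total


def profile_report(func, *args):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


print(profile_report(score_plan, range(50)))
```

In the report nearly all of the time in `score_plan` shows up under `read_setting`, which is the sort of thing that is hard to spot by reading the code but obvious once measured.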

After a bit of digging through the results I was able to identify a few kinks. Embarrassingly one of them was that the automatic seat assignment was reading a value from the Windows registry in a tight inner loop. Reading from the registry is very slow compared to reading from memory. Because the registry access was buried a few levels deep in function calls it wasn’t obvious that this was occurring. It was trivial to fix once identified. Another problem was that some intermediate values were being continually recalculated, even though none of the input values had changed. Again this was fairly trivial to fix. I also found that one part of the seat assignment genetic algorithm took time proportional to the square of the number of guests (O(n^2)). After quite a bit of work I was able to reduce this to a time linearly proportional to the number of guests (O(n)). This led to big speed improvements for larger table plans. I didn’t attempt any further optimisation as I felt I was getting into diminishing returns. I also straightened out some kinks in reading and writing files, redrawing seating charts and exporting data. The end result is that the new version of PerfectTablePlan is now much more usable for plans with 1000+ guests.
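The general shape of two of these fixes can be sketched in a few lines of Python. This is an illustration only, not the actual PerfectTablePlan code (which isn’t shown here): the function names are invented and the pair-counting problem is a made-up example of an O(n^2) calculation that has an O(n) equivalent.

```python
from collections import Counter


def score_plan_slow(guests, read_setting):
    """Calls the (slow) setting lookup once per guest, inside the loop."""
    return sum(guest * read_setting() for guest in guests)


def score_plan_fast(guests, read_setting):
    """Reads the setting once, before the loop, and reuses the value."""
    weight = read_setting()
    return sum(guest * weight for guest in guests)


def same_table_pairs_quadratic(table_of_guest):
    """Count pairs of guests at the same table by comparing every pair: O(n^2)."""
    n = len(table_of_guest)
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            if table_of_guest[i] == table_of_guest[j]:
                pairs += 1
    return pairs


def same_table_pairs_linear(table_of_guest):
    """Same count in O(n): tally guests per table, then c*(c-1)/2 pairs per table."""
    counts = Counter(table_of_guest)
    return sum(c * (c - 1) // 2 for c in counts.values())


tables = [0, 0, 1, 1, 1, 2]          # table number for each of six guests
print(same_table_pairs_quadratic(tables), same_table_pairs_linear(tables))  # 4 4
```

For six guests the two pair counts are indistinguishable in speed. For 3000 guests the quadratic version does roughly 4.5 million comparisons per call, the linear one about 3000 steps, and a genetic algorithm makes that call a great many times.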

I was favourably impressed with LTProf and will probably buy a copy next time I need to do some optimisation. At $49.95 it is very cheap compared to many other profilers (Intel VTune is $699). LTProf was relatively simple to use and interpret, but it did have quirks. In particular, it showed some impossible call trees (showing X called by Y, where this wasn’t possible). This may have been an artefact of the sampling approach taken. I will probably also have a look at the free MacOSX Shark profiler at some point.

I also tried tweaking compiler settings to see how much difference this made. Results are shown below. You can see that there is a marked difference with and without compiler optimisation, and a noticeable difference between the -O1 and -O2 optimisations (the smaller the bar, the better, obviously):

vs2005_optimisation_speed.png

Effect of VS2005 compiler optimisation on automatic seating assignment run time

Obviously the results might be quite different for your own application, depending on the types of calculations you are doing. My genetic algorithm requires large amounts of integer arithmetic and list traversal and manipulation.

The difference in executable sizes due to optimisation is small:

vs2005_optimisation_size.png

I tried two other optimisation flags in addition to -O2.

  • /OPT:NOWIN98 – section alignment does not have to be optimal for Windows 98.
  • /GL – turns on global optimisation (e.g. across source files, instead of just within source files).

Neither made much noticeable difference:

vs2005_additional_opt.png

However it should be noted that most of the genetic algorithm is compiled in a single file already, so perhaps /GL couldn’t be expected to add much. I compared VC++6 and VS2005 versions of the same program and found that VS2005 was significantly faster[1]:

vc6_vs_vs2005_optimisation_speed1.png

I also compared GCC compiler optimisation for the MacOSX version. Compared with VS2005 GCC has a more noticeable difference between optimised and unoptimised, but a smaller difference between the different optimisations:

gcc_optimisation_speed.png

Surprisingly -O3 was slower than -O2. Again the effect of optimisation on executable size is small.

gcc_optimisation_size2.png

I also tested the relative speeds of my 3 main development machines[2]:

relative-machine-speed.png

It is interesting to note that the XP box runs the seat assignment at near 100% CPU utilisation, but the Vista box never goes above 50% CPU utilisation. This is because the Vista box is a dual core, but my seat assignment is currently only single threaded. I will probably add multi-threading in a future version to improve the CPU utilisation on multi-core machines.

In conclusion:

  • Don’t assume, measure. Use a profiler to find out where your application is spending all its time. It almost certainly won’t be where you expected.
  • Get the algorithm right. This can make orders of magnitude difference to the runtime.
  • Compiler optimisation is worth doing, perhaps giving a 2-4 times speed improvement over an application built without compiler optimisation. It probably isn’t worth spending too much time tweaking compiler settings though.
  • Don’t let a software engineer fit your dishwasher.

Further reading:

“Programming Pearls” by Jon Bentley, a classic book on programming and algorithms

“Everything is fast for small n” by Jeff Atwood on the Coding Horror blog

[1] Not strictly like-for-like as the VC++6 version used dynamic Qt libraries, while the VS2005 version used static Qt libraries.

[2] I am assuming that VS2005 and GCC produce comparably fast executables when both set to -O2.

Brand recognition: PayPal beats Google

I offer both PayPal and GoogleCheckout as payment options on my pounds sterling payment page (GoogleCheckout only allows me to price in pounds sterling, unfortunately). As GoogleCheckout is effectively free to me at present[1] I put the GoogleCheckout button on the left in the hope of getting more payments through Google. But 70.5% of purchasers clicked on the PayPal button.

I have since then become a bit disgruntled with GoogleCheckout for their slow processing times, chargeback fees, lack of multi-currency support and use of anonymised email addresses[2]. So I swapped the button order in the hope of increasing the number of purchasers using PayPal. 69.3% of purchasers now click on the PayPal button.

paypal-vs-googlecheckout.gif

From this I conclude that GoogleCheckout still has a long way to go to beat PayPal in brand recognition, that positioning on the left may not be more prominent (although the 1.2 percentage point difference may be statistical noise) and that button order is less important than I thought. Or perhaps the PayPal icon is just more compelling. I wonder if GoogleCheckout have tested their icon against the PayPal icon?
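Whether a shift from 70.5% to 69.3% is noise depends on how many purchases each figure is based on, which isn’t stated above. A rough way to check is a two-proportion z-test, sketched below in Python. The sample size of 500 purchases per period is a placeholder I have made up to show the arithmetic, not real sales data.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two observed proportions.

    Uses the pooled proportion for the standard error, the usual form
    when testing whether the two underlying rates are the same.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Hypothetical sample sizes: 500 purchases before and 500 after the button swap.
z = two_proportion_z(0.705, 500, 0.693, 500)
print(round(z, 2))  # about 0.41
```

A |z| below about 1.96 means the difference is not significant at the usual 5% level. With these placeholder numbers the result is nowhere near that, and it would take tens of thousands of purchases in each period before a gap this small started to look real.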

[1] Google currently process £10 of payments free for each £1 I spend on Adwords.

[2] The user can opt to have their email anonymised at time of purchase. The vendor then receives an email address like Miss-abc123xyz@checkout.l.google.com. Google forwards email from this address to the purchaser, until they choose not to receive further emails. In theory this protects the purchaser from vendor spam, but in reality it makes support more difficult. For example, the purchaser can’t retrieve their key from your online key retrieval system unless they remember to use the anonymised address (they never do).

The software awards scam (update)

This is an update on my ‘The software awards scam’ post in August. Below is an updated list of the download site awards I ‘won’ for software that didn’t even run. 23 in total, and that is only the ones I am aware of.

awardmestars awards

The article got a surprising amount of interest, including front page mentions on reddit, digg, slashdot and wordpress and a mention in the Guardian newspaper (they were too mean to give the URL of the article). There were also some entertaining reviews posted on download sites. The page has so far had over 150,000 hits, 263 comments and has a Google page rank of 6. I hope this exposure will make a small contribution to ending this sordid little practice.

It has been quite instructive to be on the receiving end of the news, albeit in a small way. Much of the commentary was inaccurate. One ‘journalist’ from ZDNET Belgium/Holland even managed to get both my first name and last name wrong, which is quite a feat considering we had exchanged several emails. I don’t know how many other mistakes he made, because the rest of the article was in Dutch or Flemish. I wonder if the mainstream media is much better. Definitely don’t believe everything you read.

First charge-back from GoogleCheckout

I have just had my first charge-back through GoogleCheckout. I shouldn’t moan at one charge-back in 8 months’ use as my secondary payment processor – except:

  • the credit card address was in the UK, the IP address was in the Netherlands and the email address was .ru (Russian Federation)
  • the payment failed authorisation twice, before succeeding a third time

Despite the above, Google apparently just processed the payment automatically, without referring it for further checks. How many Google PhDs does it take to write a scoring system that can figure out that this was a suspect transaction? To rub a bit more salt in the wound Google have debited a £7.00 charge-back fee on top of refunding the payment.

I guess Google must need the money.
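For what it is worth, even a very crude scoring system would have flagged this one. The Python sketch below is a toy built around the warning signs listed above. The weights, the threshold and all the names are invented for illustration, and real fraud screening is far more involved than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Payment:
    card_country: str                 # country of the card's billing address
    ip_country: str                   # country the purchaser's IP address maps to
    email_country: Optional[str]      # country implied by the email domain, None if generic (.com etc.)
    failed_authorisations: int        # failed attempts before the payment went through


REVIEW_THRESHOLD = 3  # arbitrary cut-off for this sketch


def risk_score(payment):
    """Add up simple warning signs; the weights are guesses, not tuned values."""
    score = 0
    if payment.ip_country != payment.card_country:
        score += 2
    if payment.email_country not in (None, payment.card_country):
        score += 1
    score += payment.failed_authorisations
    return score


def needs_manual_review(payment):
    return risk_score(payment) >= REVIEW_THRESHOLD


suspect = Payment(card_country="GB", ip_country="NL", email_country="RU", failed_authorisations=2)
ordinary = Payment(card_country="GB", ip_country="GB", email_country=None, failed_authorisations=0)
print(risk_score(suspect), needs_manual_review(suspect))    # 5 True
print(risk_score(ordinary), needs_manual_review(ordinary))  # 0 False
```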