The mystery of the Chinese downloads

[Image: A spider (probably not Chinese)]

It’s a good idea to regularly look through the logs of your website. You’ll often find something interesting. In March 2013 I was looking through the web logs for my seating planner software and I noticed that the number of downloads of the Windows version had gone up by a factor of 5 compared to the previous month. Everything else stayed pretty much the same:

  • The number of visits to the download page hardly changed.
  • The number of completed Windows installs hardly changed.
  • The number of downloads of my Mac installer hardly changed.

Odd. On further investigation it turned out that a number of Chinese IP addresses were downloading my Windows installer again and again. My software is not localised into Chinese and I get very few sales from China. Also there were no installs from these IP addresses (my software puts up a ‘thank you for trying’ page when it is first run). It was a substantial increase in bandwidth, but not enough to be a serious denial of service attack. Very odd.
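This kind of pattern can be spotted in a raw access log by counting installer downloads per IP address and checking whether that address ever requested anything else. The following is only a rough sketch, not the method used here (the logs in this post were examined with Web Log Storming). It assumes the server writes Apache Common or Combined Log Format, and the function name, the `installer_path` argument and the `min_downloads` threshold are all made up for illustration.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line:
# client IP, identity, user, [timestamp], "METHOD path protocol", status
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3})')


def suspicious_downloaders(lines, installer_path, min_downloads=10):
    """Return {ip: download_count} for IPs that requested the installer at
    least min_downloads times and never requested anything else.

    installer_path and min_downloads are hypothetical parameters; pick
    values that suit your own site.
    """
    downloads = Counter()  # installer requests per IP
    other = Counter()      # every other request per IP
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, method, path, _status = match.groups()
        if method == "GET" and path == installer_path:
            downloads[ip] += 1
        else:
            other[ip] += 1
    return {
        ip: count
        for ip, count in downloads.items()
        if count >= min_downloads and other[ip] == 0
    }
```

Feeding it the lines of one day's log (for example `suspicious_downloaders(open("access.log"), "/downloads/setup.exe")`, where both the file name and the path are placeholders) gives a shortlist of addresses worth a closer look. It does not tell you where an address is located; that still needs a whois or geolocation lookup.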

I am on an unlimited bandwidth hosting contract so I wasn’t paying for the extra bandwidth. But I was worried that the volume of requests would slow down my web site. So I put a .htaccess file in the downloads directory to block the worst offenders.

After a few months I got the bandwidth from China down from ~30GB per day to ~100MB per day. I have been playing this game of ‘whack a mole’ ever since. Currently I have some 1700 Chinese IP addresses blocked.

PerfectTablePlan for Windows downloads per month 2013/2014

As an example, I recently blocked one IP address, which was downloading PerfectTablePlan around 20 times per day but never visiting a page on my website.

Here are the logs from one day (via Web Log Storming), picked at random before I blocked their IP:

And here is one of those records in more detail:

Web Log Storming classifies it as a ‘spider’. whois.domaintools.com says the IP belongs to ‘China Mobile Communications Corporation’. The IP is not listed on projecthoneypot.org and I wasn’t able to find out any more from casual Googling.

To block this IP I just added this line to my .htaccess file:

Deny from

But it is a bit of a nuisance to keep having to do this.
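Some of the tedium could probably be scripted away. Below is a minimal sketch, assuming you already have a list of offending addresses as strings, that validates them, removes duplicates and prints one `Deny from` line per address, ready to paste into a .htaccess file. The `deny_lines` function is invented for this example, and the addresses in the usage note are reserved documentation addresses, not the ones from my logs.

```python
import ipaddress


def deny_lines(addresses):
    """Turn IP address strings into sorted, de-duplicated 'Deny from' lines.

    Raises ValueError if any entry is not a valid IPv4 or IPv6 address,
    so a typo cannot end up in the .htaccess file unnoticed.
    """
    unique = {ipaddress.ip_address(a.strip()) for a in addresses}
    # Sort by IP version first so IPv4 and IPv6 addresses are never compared
    # with each other directly, then numerically within each version.
    ordered = sorted(unique, key=lambda ip: (ip.version, int(ip)))
    return [f"Deny from {ip}" for ip in ordered]
```

For example, `print("\n".join(deny_lines(["203.0.113.9", "203.0.113.10"])))` prints two `Deny from` lines. Note that `Deny from` is the Apache 2.2 style of access control; as far as I know, Apache 2.4 prefers `Require not ip` inside a `<RequireAll>` block, so check which syntax your host supports.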

Other software companies are having similar issues. But I haven’t come across any compelling answers about why this is happening. Perhaps it is a way of masking some other nefarious activity? Does anyone have any idea what is going on?