Author Archives: Andy Brice

How I finally beat my son at a computer game

TL;DR: I cheated, using programming.

I play computer games with my son. But he is 14 and I am 54, so I just can’t compete on reflexes. Just yesterday he thrashed me 10-3 at the silly and fun Spelunky Deathmatch. Then he gloated about my pitiful score.

spelunky

We’ve also been writing our own games together in Python, for fun and so that I can teach him some programming. We’ve written a little jet dogfight game together. You each get a little plane that can turn left, right, accelerate or shoot. You score 2 points for shooting down your opponent and 1 point for flying over a powerup. First to 20 points wins. We are both Python novices (my day job is writing software in C++), so the program is quite hacky. Lots of globals and cut and paste. The planes are triangles and the clouds are square. But it is a fast and fun game to play.

plane-game

Predictably my son was winning most the games. Then gloating about it. However I had recently seen an article about an AI winning dogfights against a human fighter pilot. This gave me an idea. While he was asleep I modified the program so that you can press a key to toggle a cheat mode on either plane. In the cheat mode pressing the left key aims automatically at the powerup and pressing the right key aims automatically at the opponent’s plane. Suddenly I started thrashing him. He got suspicious and insisted we swap planes. Which is fine, I just toggled cheat mode on the other plane. He got even more  suspicious. I told him I had been practising. He went off to practise with an old version of the code I gave him. I then thrashed him several more times and told him I have being doing a lot of practise. ;0)

Sooner or later he will figure out what is going on. I’m not sure what the take away lesson will be. Coding is a powerful skill? Don’t trust your Father?

I offer up the code for any competitive Dad’s (or Mum’s) who feel they need a little help against their cocky offspring. See how long you get away with cheat mode. You can always toggle it off for a while when they get suspicious.

Notes about the game

You can download the game’s Python code.

The game runs from inside the Python variant of the free processing.org environment, which you can download here for Windows, Mac or Linux. You need to select Sketch>Import Library to get the Sound library. File>Open the .pyde file in processing.org and press the run button. You can adjust the szx and szy variables according to your screen size. There seems to be a bug where the sound only works for the first game after you start the IDE.

The keyboard controls are:

Player 1:

a – turn left (aim for powerup in cheat mode)

d – turn right (aim for opponent in cheat mode)

w – accelerate

s – fire

z – toggle cheat mode (off at start)

Player 2:

left cursor – turn left (aim for powerup in cheat mode)

right cursor – turn right (aim for opponent in cheat mode)

up cursor – accelerate

down cursor – fire

end – toggle cheat mode (off at start)

Currently the aim cheat aims at where the opponent’s plane is. To be a bit more sophisticated, it could aim at where it thinks the opponent’s plane will be. But the current approach turns out to be good enough, and less likely to make an opponent suspicious.

It would be interesting to write a little AI that completely controls the plane and then put it up against other people’s AIs. A future project perhaps. But processing.org isn’t an ideal environment for that.

Google Ads can charge you anything they like for a click on their partner network

I have been using Google Ads (previously Adwords) since 2005. A lot has changed in that in time. But it still remains basically an auction. You bid how much you are prepared to pay for a click when someone types a particular phrase into Google. Google then decides which ads to show based on bids and a variety of factors. It has always been a basic article of faith that (if you choose manual bidding) you are never charged more than the maximum cost per click you set. But this is no longer true. Check out this recent Google Ads report for my data transformation software, Easy Data Transform.

adwords-cpc

How can the average cost per click be 3x more than the maximum cost per click? I am using manual bidding, am not using ‘enhanced CPC’, have no positive bid adjustments and I haven’t recently changed these bids. I emailed Google technical support. This is the reply I got:

Post reviewing the search campaign : Easy Data Transform, I can see that  you have applied Bidding strategy as Manual CPC, also you have included Google search partners due to which system has charged extra amount. As in Google search partners maximum capping would not work.

For example the keywords: [keyword redacted] in this the maximum cpc is set as while as you have included Google search partners due to which system has avg cpc charged £0.45.

‘Google search partners’ are “hundreds of non-Google websites, as well as YouTube and other Google sites”. New Adwords campaigns are opted in to search partners by default.

So I emailed back:

So when did this policy change?

Is there a limit what you can change for Search partners clicks? If I bid a Max CPC of £0.10 CPC can Adwords charge me £100.00, if it feels like it?

And the reply was:

Andy, I would like to inform you that there is no recent changes made in this policy. However, I would not be able to provide the exact date of change made in this policy. If you have applied the  Max CPC of £0.10, system may charge you £100 if the search partner is enabled at your campaign level.

WHAT? They can ignore my maximum bid and charge me £100 per click (if I am using manual bidding, tracking conversions and opted in to search partners)! I don’t trust Google’s algorithms to bid for me based on prior experience. Presumably they can’t charge more than my daily budget, but they can use this whole budget in a single click, if they choose.

Running a report you can see that the average cost per click of the partner network (red) has shot up recently for my Easy Data Transform campaign:

adwords-network

Digging a bit more I found this.

smart-bidding

So it looks like they quietly introduced this new policy in October 2018. I don’t remember being told about this. It doesn’t mention this policy if you hover over max CPC:

adwords-tip

It is only if you click ‘Learn more’ and read to the bottom of that page that they tell you about the exception for network partners.

I checked with my goto Google Ads expert, Aaron of softwarepromotions.com. He knows Google Ads inside out, but he hadn’t heard of this policy either.

Google Ads have a long history of quietly introducing major changes and not telling their customers, let alone asking them to opt-in. For example with in-app ads.

This change is particularly galling given that clicks from the partner network are generally lower quality and convert significantly worse than clicks from Google itsself.

So far the costs to me of this change don’t appear to be significant. But they could be very significant to people with larger accounts. I did a quick surf and I found this on webmasterworld.com, dated March 2019:

Don’t have all the numbers yet but it looks like so far this month about 60% of our spend this month has been on these inflated clicks. And when I say inflated, on the extreme end we have a keyword set to $0.27 CPC and the average CPC has been $4.67, and almost all clicks for the keyword have been from the search partner network.

I have turned off the Google partner network on all my campaigns. You might want to do the same.

disable

Easy Data Transform v1.6.0

I have been working hard on Easy Data Transform. The last few releases have included:

  • a 64 bit version for Windows
  • JSON, XML and vCard format input
  • output of nested JSON and XML
  • a batch processing mode
  • command line arguments
  • keyboard shortcuts
  • various improvements to transforms

Plus lots of other improvements.

The installer now includes 32 and 64 bit version of Easy Data Transform for Windows and installs the one appropriate to your operating sytem. This is pretty easy to with the Inno Setup installer. You just need to use Check: Is64BitInstallMode, for example:

[Files]
Source: "..\binaries\windows\program\{#MyAppExeName}"; DestDir: "{app}"; Flags: ignoreversion; Check: not Is64BitInstallMode
Source: "..\binaries\windows\program64\{#MyAppExeName}"; DestDir: "{app}"; Flags: ignoreversion; Check: Is64BitInstallMode

But it does pretty much double the size of the installer (from 25MB to 47MB in my case).

The 32 bit version is restricted to addressing 4GB of memory. In practise, this means you may run out of memory if you get much above a million data values. The 64 bit version is able to address as much memory as your system can provide. So datasets with tens or hundreds of millions of values are now within reach.

I have kept the 32 bit version for compatibility reasons. But data on the percentage of customers still using 32 bit Windows is surprisingly hard to come by. Some figures I have seen suggest <5%. So I will probably drop 32 bit Windows support at some point. Apple, being Apple, made the transition to a 64 bit OS much more aggressively and so the Mac version of Easy Data Transform has always been 64 bit only.

I have also been doing some benchmarking and Easy Data Transform is fast. On my development PC it can perform an inner join of 2 datasets of 900 thousand rows x 14 columns in 5 seconds. The same operation on the same machine in Excel Power Query took 4 minutes 20 seconds. So Easy Data Transform is some 52 times faster than Excel Power Query. Easy Data Transform is written in C++ with reference counting and hashing for performance, but I am surprised by just how much faster it is.

The Excel Power Query user interface also seems very clunky by comparison. The process to join two CSV files is:

  • Start Excel.
  • Start Power Query.
  • Load the two data files into Excel power query.
  • Choose the key columns.
  • Choose merge.
  • Choose inner join. Wait 4 minutes 20 seconds.
  • Load the results back into Excel. Wait several more minutes.
  • Save to a .csv file.
  • Total time: ~600 seconds

Whereas in Easy Data Transform you just:

  • Start Easy Data Transform.
  • Drag the 2 files onto the center pane.
  • Click ‘join’.
  • Select the key columns. Wait 5 seconds.
  • Click ‘To file’ and save to a .csv file.
  • Total time: ~30 seconds

join-op

If you have some data to transform, clean or analyze please give Easy Data Transform a try. There is a fully functional free trial. Email me if you have any questions.

 

Analyzing COVID19 data with Easy Data Transform

I have continued to make lots of improvements to Easy Data Transform, including:

Here is a video of me using Easy Data Transform to analyze the data.europa.eu COVID19 dataset. Hopefully it gives an idea of what the software is capable of.

covid19-data-wrangling

 

 

Easy Data Transform v1.1.0

I released v1.1.0 of Easy Data Transform this week. It is a big upgrade, with some major new features.

easy-data-transform-v110

There is a new Javascript transform. This allows you to create custom transforms for anything that is too specialist to do with the other 37 built-in transforms. I’m not a fan of Javascript, with its horrible scoping and typing, and I would have preferred Python or Lua. But there is a Javascript engine built into Qt, so this was the easiest way to add scripting. Now if you want to multiply two columns of your data together in Easy Data Transform, you can just do this:

javascript-data-transform-v110

You can also access Javascript maths, date and string functions. So you can do some pretty complicated stuff. Hopefully the built-in transforms are enough to cover 95% of data transformations. But the new Javascript transform adds some serious flexibility for the remainder. The Qt Javascript engine is also pretty fast. In testing I was able to multiply values from 2 columns together across 10,000 rows in less than 0.03 seconds.

There is a new Lookup transform. This allows you to lookup values for one dataset in another dataset. For example, if you have a dataset with a column for country code and another dataset with columns for the country code and tax rate, you can look up the tax rate by country code.

Previously you could only output your data in Excel and delimited text formats (including CSV and TSV). The new release also adds output to JSON, HTML, Markdown, vCard, YAML and XML formats.

I have improved the speed of the Join transform significantly using hashing. This makes a big difference with large datasets.

To save time, Easy Data Transform guesses the likely columns you want to use as keys when you Join, Intersect, Lookup or Subtract two datasets. For example if 2 datasets both have colummns called ‘ID’ with lots of unique values that are common to both columns, it will choose those two columns as the default key columns. I have improved the heuristic used to set the default columns.

You can now add comments to input, transform and output nodes as a note to a colleague or your future self.

You can now snap your input, transform and output nodes to a grid, so you can keep your layout all lovely and neat.

I have also made some bug fixes and minor improvements.

Haven’t tried Easy Data Transform yet? Got some table or list data that you need to wrangle into a more useful form? Take the free trial for a spin.

 

Easy Data Transform v1.0.0 released

v100-screen-cap

I finally released a paid version of Easy Data Transform today, for both Windows and Mac. I am very pleased with how it has turned out. Obviously it is only v1.0.0, so there is plenty of additional features I could add, including:

  • Batch processing
  • Support for JSON, XML, SQLite input/output
  • More transforms
  • A 64 bit version for Windows
  • A Linux version

But I need to listen carefully to prospective customers to decide which additional features to prioritize in future releases. It might be something I haven’t even thought of.

But v1.0.0 already has a really useful core of features. And, if you aren’t embarrassed by v1.0, you didn’t release it early enough. That said, I haven’t cut corners on quality. It has proper documentation and has been through extended beta testing, dogfooding and several rounds of usability and third party testing.

The product has a fully-functional 7 (non-consecutive) day free trial. I think that is enough for prospective customers to decide if it does what they need. I also have a 60 day money-back guarantee.

I have decided to go with a subscription model: $99 / €90 / £75 + tax per person per year. Which covers up to 3 computers. At this price point I can afford some paid promotion and to provide a decent level of support. I am not offering a monthly subscription, as I don’t really want people who are going to pay for 1 month (to do their annual TPS reports) and then cancel.

Have you got some data you need to merge, clean, reformat or de-dupe? Give it a try. You can get a 25% discount if you buy a subscription by the 27th December 2019 using this link.

 

Eating my own dogfood

Eating your own dogfood There is a story that a president of a pet food company ate some of his own dog food, to show how good it was. I’m not sure how tasty dog food really needs to be, given that dogs are happy to lick their own backsides. But his commitment is admirable. The least we can do as software developers is to use our own software as much as possible. After all, if you don’t use it, how can you expect anyone else to?

In that spirit I have been using my new Easy Data Transform product as much as possible. The biggest project so far has been merging two databases, for a charity that I volunteer at. I created an Airtable database for the charity. But volunteer information was already in a separate CRM. I imported relevant CRM data into Airtable, but the CRM system remained in use for emailing volunteers for a couple of years while I concentrated on Airtable and other tasks. In that time the Airtable database has become a roaring success for the charity. So we eventually decided to retire the CRM system and also use Airtable as our CRM.

Consequently I had to merge the latest CRM data into Airtable. I exported the relevant data from each as a CSV and then proceeded to merge the mailing list tags from the CRM into a new column in Airtable. I also created tables of discrepancies for the charity staff to work through. For example, where the telephone numbers or emails had been added or updated in one database, but not the other.

When I had initially imported the CRM data into Airtable, I had imported the CRM ID record. So those records were easy to match between Airtable and the CRM using a simple join on the ID. However any records added subsequently to Airtable or the CRM did not have matching IDs. So I had to match those by first name + last name or email address. The data was quite ‘dirty’, as is invariably the case with real world data. A phone number may be “0123 456 789” on one system and “01 23456789” on another. A volunteer might be “Chris” in one database and “Christopher” in another. Also some contacts had multiple entries in the CRM system. So this was not a trivial problem.

dogfood.png

You can get an idea of what was involved from the screenshot above. The two pink input nodes are the 2 databases exported as CSV files, the blue nodes are various transforms (joining, filtering, removing spaces etc) and the green nodes are the outputs (e.g. lists of telephone and email differences, lists of people in one database, but not the other etc). Quite a lot of the transforms are just column renames (in future I should probably support renaming multiple columns in one transform).

I think this would have been a horrific task using Excel, SQL, Beyond Compare or any of the other tools I had to hand, amazing tools as they may be for other tasks. But Easy Data Transform performed brilliantly, even if I do say so myself. It was particularly helpful that you could see the whole process step-by-step and backtrack or branch at any point without losing previous changes.

While eating my own dogfood, I found one bug (related to carriage returns inside CSV records) and quite a few minor annoyances. These have now been fixed in the latest release. I also added  a new ‘Compare Columns’ transform, which was really useful for this sort of work. So it was a very useful experience and I really recommend ‘eating your own dogfood’ as much as you can, along with usability testing.

Have you got some data that needs cleaning, merging, de-duping or filtering? Analytics, log files, emailing lists, databases? Of course you do! Why not give Easy Data Transform a try. It is free while it is in beta. Let me know how you get on.

 

 

 

The Hacker News effect – wide but not deep

I posted a “Show HN” link to Hacker News on Saturday about my new product, Easy Data Transform. For some hours, nothing happened. No upvotes, no comments. One of the moderators emailed me suggesting I add my own comment. I did that and it started to get some upvotes and other comments. On Sunday I made it onto the front page of Hacker News and stayed there most of the day. The traffic to my new easydatatransform.com website jumped from negligible to around 300 page views per hour. Cool!

hacker-news-effect1

In total I got 4,400 unique page views and the discussion generated some 59 comments (including replies by me). Among the predictable ‘why isn’t a web app?’ and ‘why isn’t it on Linux’ questions, there was some useful feedback. But I was quite surprised how little most visitors engaged beyond that:

Visits to the home page: 3,289

Visits to the download page: 192

Installs of the free beta: 20

Signups to the mailing list: 0

I was also surprised that over 70% of the hits were from mobile devices and tablets. I guess that might partly account for the low download rate (Easy Data Transform is only available on Windows and Mac).

hackernews-os

But the session duration histogram tells it’s own story.

hackernew-analytics.png

0-10 seconds. Ouch.

 

 

 

Easy Data Transform video

I am continuing to work on my new product, Easy Data Transform. I have thrown together a 3 minute video to give an idea of what it is capable of.

Easy Data Transform video

You can download Easy Data Transform for Windows or Mac here. The beta is completely free, but time-limited (I do plan to start charging at some point). Please try it and let me know what you think.

I don’t really enjoy doing screencasts or voiceovers. If you can recommend someone who does slick screencasts and voiceovers (or who can polish my amateur attempts), please feel free to give them a plug in the comments.

Easy Data Transform

I have been furiously coding a new product. Easy Data Transform. It is a Windows and Mac tool for transforming table and list data from one form to another. Joining, splitting, reformatting, filtering, sorting etc.

easydatatransform

I have been thinking about this product idea for years. In fact I threw together a janky prototype back in 2008. It allows you to perform various operations on a pair of lists.

list-weaver

I used this prototype for jobs such as creating a list of emails of people who had bought Perfect Table Plan v5, but hadn’t upgraded to v6 yet. It worked. But it wasn’t very good. The biggest annoyance was that each operation obliterated everything that came before. Which made it very easy to lose track of where you had got to. And there was no repeatability. It was also limited to lists and it became clear that I really needed something that could also handle tabular data. I never released it.

But the idea has been running as a background process in my brain for 11 years since. And I think I have come up with a much better design in that time. Finally I had mature, stable versions of my Perfect Table Plan and Hyper Plan products out, so I decided to go for it. I am really pleased with how it has turned out so far.

If you aren’t embarrassed by v1.0 you didn’t release it early enough. And so I have cut lots of corners to get this first public version out. The documentation is only part written. I created the application icon myself  in 10 minutes. There is no licensing. The GUI is lacking polish. The website would make a designer cry. But the software seems fairly robust. My 13 year old son wasn’t able to crash it after 10 minutes of trying, despite financial incentives to do so.

I did some market research and spoke to some people who knew a bit about this market. But I deliberately didn’t look closely at any competing products, as I didn’t want to be mentally restricted by what others have done. For better or worse, I want to blaze my own trail. Copying other people’s stuff is a zero-sum game with no net benefit to society.

Most of the things that Easy Data Transform you can do, you can also do in Excel or SQL. My claim is that it is much quicker, easier and less error prone to do in Easy Data Transform. No programming or scripting required. I am hoping that people will be able to start using it within a couple of minutes of downloading it (I plan to do lots of usability testing). Will people pay for that? I hope so. I’m not aiming it at programmers. Perish the thought.

Naming is hard. I came up with some 70 names. Things like ‘Data Hero’, ‘Transform Flow’, ‘Transmogrify’ and ‘Data Rapture’. But the domains were taken, people I asked hated them or there was an existing service or product with that name. So I ended up with Easy Data Transform. It does what it says on the tin.

Why desktop? Surely no-one is writing new desktop apps in 2019? I believe a desktop solution has some real advantages in this market. The biggest ones are:

  • You don’t need to load your (potentially highly sensitive) data on to a third party server.
  • Not having to upload and download (potentially very large) data sets makes it much more responsive.

Easy Data Transform is currently free for anyone to use. You can get it from the super-minimalist easydatatransform.com website. The current 0.9.0 version expires on the 4th August 2019. You will then be able to get another free version. Once the product is mature enough, and if I am convinced there is enough demand, I will release a paid version. The free beta will probably last several months. Please try it and let me know how you get on. I am particularly interested to get feedback from anyone using it for real day-to-day tasks.

Of course the real challenge is always marketing. How to get noticed amongst many competing products. As well as helping to improve the product I am hoping that this extended beta will also help me to get some traction and better understand the market. For example, what price to charge and what trial model to use. Watch this space.