Author Archives: Andy Brice

Easy Data Transform v1.0.0 released

v100-screen-cap

I finally released a paid version of Easy Data Transform today, for both Windows and Mac. I am very pleased with how it has turned out. Obviously it is only v1.0.0, so there is plenty of additional features I could add, including:

  • Batch processing
  • Support for JSON, XML, SQLite input/output
  • More transforms
  • A 64 bit version for Windows
  • A Linux version

But I need to listen carefully to prospective customers to decide which additional features to prioritize in future releases. It might be something I haven’t even thought of.

But v1.0.0 already has a really useful core of features. And, if you aren’t embarrassed by v1.0, you didn’t release it early enough. That said, I haven’t cut corners on quality. It has proper documentation and has been through extended beta testing, dogfooding and several rounds of usability and third party testing.

The product has a fully-functional 7 (non-consecutive) day free trial. I think that is enough for prospective customers to decide if it does what they need. I also have a 60 day money-back guarantee.

I have decided to go with a subscription model: $99 / €90 / £75 + tax per person per year. Which covers up to 3 computers. At this price point I can afford some paid promotion and to provide a decent level of support. I am not offering a monthly subscription, as I don’t really want people who are going to pay for 1 month (to do their annual TPS reports) and then cancel.

Have you got some data you need to merge, clean, reformat or de-dupe? Give it a try. You can get a 25% discount if you buy a subscription by the 27th December 2019 using this link.

 

Eating my own dogfood

Eating your own dogfood There is a story that a president of a pet food company ate some of his own dog food, to show how good it was. I’m not sure how tasty dog food really needs to be, given that dogs are happy to lick their own backsides. But his commitment is admirable. The least we can do as software developers is to use our own software as much as possible. After all, if you don’t use it, how can you expect anyone else to?

In that spirit I have been using my new Easy Data Transform product as much as possible. The biggest project so far has been merging two databases, for a charity that I volunteer at. I created an Airtable database for the charity. But volunteer information was already in a separate CRM. I imported relevant CRM data into Airtable, but the CRM system remained in use for emailing volunteers for a couple of years while I concentrated on Airtable and other tasks. In that time the Airtable database has become a roaring success for the charity. So we eventually decided to retire the CRM system and also use Airtable as our CRM.

Consequently I had to merge the latest CRM data into Airtable. I exported the relevant data from each as a CSV and then proceeded to merge the mailing list tags from the CRM into a new column in Airtable. I also created tables of discrepancies for the charity staff to work through. For example, where the telephone numbers or emails had been added or updated in one database, but not the other.

When I had initially imported the CRM data into Airtable, I had imported the CRM ID record. So those records were easy to match between Airtable and the CRM using a simple join on the ID. However any records added subsequently to Airtable or the CRM did not have matching IDs. So I had to match those by first name + last name or email address. The data was quite ‘dirty’, as is invariably the case with real world data. A phone number may be “0123 456 789” on one system and “01 23456789” on another. A volunteer might be “Chris” in one database and “Christopher” in another. Also some contacts had multiple entries in the CRM system. So this was not a trivial problem.

dogfood.png

You can get an idea of what was involved from the screenshot above. The two pink input nodes are the 2 databases exported as CSV files, the blue nodes are various transforms (joining, filtering, removing spaces etc) and the green nodes are the outputs (e.g. lists of telephone and email differences, lists of people in one database, but not the other etc). Quite a lot of the transforms are just column renames (in future I should probably support renaming multiple columns in one transform).

I think this would have been a horrific task using Excel, SQL, Beyond Compare or any of the other tools I had to hand, amazing tools as they may be for other tasks. But Easy Data Transform performed brilliantly, even if I do say so myself. It was particularly helpful that you could see the whole process step-by-step and backtrack or branch at any point without losing previous changes.

While eating my own dogfood, I found one bug (related to carriage returns inside CSV records) and quite a few minor annoyances. These have now been fixed in the latest release. I also added  a new ‘Compare Columns’ transform, which was really useful for this sort of work. So it was a very useful experience and I really recommend ‘eating your own dogfood’ as much as you can, along with usability testing.

Have you got some data that needs cleaning, merging, de-duping or filtering? Analytics, log files, emailing lists, databases? Of course you do! Why not give Easy Data Transform a try. It is free while it is in beta. Let me know how you get on.

 

 

 

The Hacker News effect – wide but not deep

I posted a “Show HN” link to Hacker News on Saturday about my new product, Easy Data Transform. For some hours, nothing happened. No upvotes, no comments. One of the moderators emailed me suggesting I add my own comment. I did that and it started to get some upvotes and other comments. On Sunday I made it onto the front page of Hacker News and stayed there most of the day. The traffic to my new easydatatransform.com website jumped from negligible to around 300 page views per hour. Cool!

hacker-news-effect1

In total I got 4,400 unique page views and the discussion generated some 59 comments (including replies by me). Among the predictable ‘why isn’t a web app?’ and ‘why isn’t it on Linux’ questions, there was some useful feedback. But I was quite surprised how little most visitors engaged beyond that:

Visits to the home page: 3,289

Visits to the download page: 192

Installs of the free beta: 20

Signups to the mailing list: 0

I was also surprised that over 70% of the hits were from mobile devices and tablets. I guess that might partly account for the low download rate (Easy Data Transform is only available on Windows and Mac).

hackernews-os

But the session duration histogram tells it’s own story.

hackernew-analytics.png

0-10 seconds. Ouch.

 

 

 

Easy Data Transform video

I am continuing to work on my new product, Easy Data Transform. I have thrown together a 3 minute video to give an idea of what it is capable of.

Easy Data Transform video

You can download Easy Data Transform for Windows or Mac here. The beta is completely free, but time-limited (I do plan to start charging at some point). Please try it and let me know what you think.

I don’t really enjoy doing screencasts or voiceovers. If you can recommend someone who does slick screencasts and voiceovers (or who can polish my amateur attempts), please feel free to give them a plug in the comments.

Easy Data Transform

I have been furiously coding a new product. Easy Data Transform. It is a Windows and Mac tool for transforming table and list data from one form to another. Joining, splitting, reformatting, filtering, sorting etc.

easydatatransform

I have been thinking about this product idea for years. In fact I threw together a janky prototype back in 2008. It allows you to perform various operations on a pair of lists.

list-weaver

I used this prototype for jobs such as creating a list of emails of people who had bought Perfect Table Plan v5, but hadn’t upgraded to v6 yet. It worked. But it wasn’t very good. The biggest annoyance was that each operation obliterated everything that came before. Which made it very easy to lose track of where you had got to. And there was no repeatability. It was also limited to lists and it became clear that I really needed something that could also handle tabular data. I never released it.

But the idea has been running as a background process in my brain for 11 years since. And I think I have come up with a much better design in that time. Finally I had mature, stable versions of my Perfect Table Plan and Hyper Plan products out, so I decided to go for it. I am really pleased with how it has turned out so far.

If you aren’t embarrassed by v1.0 you didn’t release it early enough. And so I have cut lots of corners to get this first public version out. The documentation is only part written. I created the application icon myself  in 10 minutes. There is no licensing. The GUI is lacking polish. The website would make a designer cry. But the software seems fairly robust. My 13 year old son wasn’t able to crash it after 10 minutes of trying, despite financial incentives to do so.

I did some market research and spoke to some people who knew a bit about this market. But I deliberately didn’t look closely at any competing products, as I didn’t want to be mentally restricted by what others have done. For better or worse, I want to blaze my own trail. Copying other people’s stuff is a zero-sum game with no net benefit to society.

Most of the things that Easy Data Transform you can do, you can also do in Excel or SQL. My claim is that it is much quicker, easier and less error prone to do in Easy Data Transform. No programming or scripting required. I am hoping that people will be able to start using it within a couple of minutes of downloading it (I plan to do lots of usability testing). Will people pay for that? I hope so. I’m not aiming it at programmers. Perish the thought.

Naming is hard. I came up with some 70 names. Things like ‘Data Hero’, ‘Transform Flow’, ‘Transmogrify’ and ‘Data Rapture’. But the domains were taken, people I asked hated them or there was an existing service or product with that name. So I ended up with Easy Data Transform. It does what it says on the tin.

Why desktop? Surely no-one is writing new desktop apps in 2019? I believe a desktop solution has some real advantages in this market. The biggest ones are:

  • You don’t need to load your (potentially highly sensitive) data on to a third party server.
  • Not having to upload and download (potentially very large) data sets makes it much more responsive.

Easy Data Transform is currently free for anyone to use. You can get it from the super-minimalist easydatatransform.com website. The current 0.9.0 version expires on the 4th August 2019. You will then be able to get another free version. Once the product is mature enough, and if I am convinced there is enough demand, I will release a paid version. The free beta will probably last several months. Please try it and let me know how you get on. I am particularly interested to get feedback from anyone using it for real day-to-day tasks.

Of course the real challenge is always marketing. How to get noticed amongst many competing products. As well as helping to improve the product I am hoping that this extended beta will also help me to get some traction and better understand the market. For example, what price to charge and what trial model to use. Watch this space.

Bloviate

I wondered what it would look like if you took a body of text and then used it to generate new text, using Markov chains of different lengths. So I knocked up  quick program to try it.  ‘Bloviate’.

bloviate

Bloviate analyses your source text to find every sequence of N characters and then works out the frequency of characters that come next.

For example, if you set N=3 and your source text contains the following character sequences staring with ‘the’:

‘the ‘, ‘then’, ‘they’, ‘the ‘

Then ‘the’ should be followed 50% of the time by a space, 25% of the time by an ‘n’ and 25% of the time by a ‘y’.

Bloviate then creates output text, starting with the first N characters of the source text and filling in the rest randomly using the same sequence frequencies as the source text.

Note that a character is a character to Bloviate. It treats upper and lower case as different characters, makes no attempt to differentiate between letters, punctuation and white space and does not attempt to clean up the source text. Which also means it works on any language.

Bloviate also tells you the average number of different characters following each unique sequence of N, which I will call F here. As F approaches 1.0 the output text becomes closer and closer to the input text.

Using ‘Goldilocks and the 3 bears’ as input:

If N=1 (F=7.05) the output is garbage. Albeit garbage with the same character pair frequency as the original.

On cre She sl s ramy raked cheais Bus ore than s sherd up m. ged. bend staireomest p!”Sof ckstirigrorr a ry ps.

” f waine tind s aso Sowa t antthee aime bupis stht stooomed pie k is beche p!

At N=3 (F=1.44) it looks close to English, but jibberish:

Once up and been sight,” she this timed. Pretty so soon, she second soft. She screame up and she screame hot!” cried the Mama bed the Papa been sleeping in the Papa bear

“Someone’s bear growl.

At N=5 (F=1.14) it starts to look like proper English, but semantically weird:

Once upon a time, so she went for a walked right,” she lay down into the kitchen, Goldilocks sat in the porridge from the three chair,” growled, “Someone’s been sitting my porridge and she tasted the door, and ran down the bedroom. Goldilocks woke up and she second bowl.

And it comes out with occasional gems such as:

“Someone’s been sitting my porridge,” said the bedroom.

At N=10 (F=1.03) it starts to become reasonably coherent:

Once upon a time, there was a little tired. So, she walked into the forest. Pretty soon, she came upon a house. She knocked and, when no one answered, she walked right in.

At the table in the kitchen, there were three bowls of porridge.

At N=15 (F=1.01) it starts to get pretty close to the original text, but doesn’t follow quite the same order:

Once upon a time, there was a little girl named Goldilocks. She went for a walk in the forest. Pretty soon, she came upon a house. She knocked and, when no one answered, she walked right in.

At the table in the kitchen, there were three bowls of porridge. Goldilocks was very tired by this time, so she went upstairs to the bedroom. She lay down in the first bed, but it was too hard. Then she lay down in the third bed and it was just right. Goldilocks fell asleep.

At N=12 (F=1.07) the whole 680k characters of ‘Pride and prejudice’ produces:

It is a truth universally contradict it. Besides, there was a motive within her of goodwill which could not help saying:

“Oh, that my dear mother had more command over herself! She can have her own way.”

As she spoke she observed him looking at her earnest desire for their folly or their vice. He was fond of them.”

Obviously the source text is important. The Bohemian Rhapsody lyrics make nearly as much (or as little sense) at N=5 (F=1.08) as the original:

Is this to me, for me, to me

Mama, just a poor boy from this to me

Any way the truth

Mama, life? Is this time tomorrow

Carry on as if nothing all behind and face the truth

Mama, ooh, didn’t mean to me, baby!

Just gotta leave me and lightning, very fright out, just killed a man

Put a gun against his head

Pulled my time to die?

At N=12 (F=1.05) 160k characters of Trump election speeches produces:

Hillary brought death and disaster to Iraq, Syria and Libya, she empowered Iran, and she unleashed ISIS. Now she wants to raise your taxes very substantially. Highest taxed nation in the world is a tenant of mine in Manhattan, so many great people. These are people that have been stolen, stolen by either very stupid politicians ask me the question, how are you going to get rid of all the emails?” “Yes, ma’am, they’re gonna stay in this country blind. My contract with the American voter begins with a plan to end government that will not protect its people is a government corruption at the State Department of Justice is trying as hard as they can to protect religious liberty;

Supply your own joke.

I knocked together Bloviate in C++/Qt in a couple of hours, so it is far from commercial quality. But it is fairly robust, runs on Windows and Mac and can rewrite the whole of ‘Pride and prejudice’ in a few seconds. The core of Bloviate is just a map of the frequency of characters mapped to the character sequence they follow:

QMap< QString, QMap< QChar, int > >

You can get the Windows binaries here (~8MB, should work from Windows 7 onwards).

You can get the Mac binaries here (~11MB, should work from macOS 10.12 onwards).

Note that the Bloviate executable is tiny compared to the Qt library files. I could have tried to reduce the size of the downloads, but I didn’t.

To use Bloviate just:

  1. paste your source text in the left pane
  2. set the sequence length
  3. press the ‘Go >’ button

I included some source text files in the downloads.

You can get the source for Bloviate here (~1MB).

It should build on Qt 4 or 5 and is licensed as creative commons. If you modify it, just give me an attribution and send me a link to anything interesting you come up with.

How to notarize your software on macOS

** Please note: WordPress keeps mangling my code examples by changing double dash to single dash. I tried to fix it. But it changed them all back again! If anyone knows how to get around this, please put something in the comments. **

Apple now wants you to ‘notarize’ your software. This is a process where you upload your software to Apple’s server so it can be scanned and certified malware free. This will probably become compulsory at some point, even (especially?) if your software isn’t in the Apple app store. Apple says:

Give users even more confidence in your software by submitting it to Apple to be notarized. The service automatically scans your Developer ID-signed software and performs security checks. When it’s ready to export for distribution, a ticket is attached to your software to let Gatekeeper know it’s been notarized.

When users on macOS Mojave first open a notarized app, installer package, or disk image, they’ll see a more streamlined Gatekeeper dialog and have confidence that it is not known malware.

Note that in an upcoming release of macOS, Gatekeeper will require Developer ID signed software to be notarized by Apple.

Documentation on notarization is a bit thin on the ground, especially if you want to notarize software that wasn’t built using XCode (I build my software using QtCreator). So I am writing up my experiences here.

First you need to ensure you have macOS 10.14 and XCode 10 installed (with command line tools) and you need a current Apple developer account.

Codesign your app with ‘hardened runtime’ using –options runtime :

codesign –deep –force –verify –verbose –sign “Developer ID Application:<developer id>” –options runtime <app file>

E.g.:

codesign –deep –force –verify –verbose –sign “Developer ID Application: Acme Ltd” –options runtime myApp.app

A ‘hardened runtime’ limits the data and resourced an application can access. I’m not sure what the exact ramification of this are. But it doesn’t seem to have restrict my software from doing anything it could do previously.

You can check the signing with:

codesign –verify –verbose=4 <app file>

E.g.:

codesign –verify –verbose=4 myApp.app

Now package your app into a .dmg (e.g. using DropDMG). Then upload the .dmg to Apple’s servers:

xcrun altool -t osx -f <dmg file> –primary-bundle-id <bundle id> –notarize-app –username <username>

E.g.:

xcrun altool -t osx -f myApp.dmg –primary-bundle-id com.acme.myapp –notarize-app –username me@acme.com

You will be prompted for your Apple developer password (or you can include it on the command line).

You now have to wait a few minutes. If the upload is successful “No errors uploading ” will be shown and a unique ID will be returned. You then have to use this to request your upload be scanned:

xcrun altool –notarization-info <notarize ID> -u <username>

E.g.:

xcrun altool –notarization-info xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx -u me@acme.com

You will be prompted for your Apple developer password (or you can include it on the command line).

Hopefully you will see “Status Message: Package Approved”. If the notarization fails, you should be sent a link to an online log file describing the issue. If the notarization completes successfully you need to ‘staple’ the results to your .dmg:

xcrun stapler staple -v <dmg file>

E.g.:

xcrun stapler staple -v myApp.dmg

The stapler outputs a log including some odd phrases. Mine included: “Humanity must endure”, “Let’s see how that works out. “, “Adding 1 blobs to superblob. What about Blob?” and “Enjoy”. Weird. Hopefully it will end with “The staple and validate action worked!”.

Finally you can unpack your .dmg into a .app and verify it with:

spctl -a -v <app file>

E.g.

spctl -a -v /Applications/myApp.app

On macOS 10.14 (but not earlier OSs) it should say “source=Notarized Developer ID”. Your software should now run on 10.14 without a warning dialog. Congratulations!

It all seems rather clumsy. As you have to wait asynchronously for the unique ID to be returned from step 1 before you can complete step 2, it is not easy to fully automate in a script. This is a major pain the arse. If anyone works out a way to automate it the whole process, please let me know.

Here are some links to the various posts that I gleaned this information from:

https://cycling74.com/forums/apple-notarizing-for-mojave-10-14-and-beyond
https://www.mbsplugins.de/archive/2018-11-02/Notarize_apps_for_MacOS
https://forum.xojo.com/50655-how-to-codesign-and-notarise-your-app-for-macos-10-14-and-highe
https://forum.xojo.com/49408-10-14-hardened-runtime-and-app-notarization/11
https://stackoverflow.com/questions/53112078/how-to-upload-dmg-file-for-notarization-in-xcode
https://lapcatsoftware.com/articles/debugging-mojave.html