Winterfest 2023 is on. Loads of quality software for Mac and Windows from independent vendors, at a discount. This includes my own Easy Data Transform and Hyper Plan, which are on sale with a 25% discount.
This sounds like a question a programmer might ask after one medicinal cigarette too many. The computer science equivalent of “what is the sounds of one hand clapping?”. But it is a question I have to decide the answer to.
I am adding indexOf() and lastIndexOf() operations to the Calculate transform of my data wrangling (ETL) software (Easy Data Transform). This will allow users to find the offset of one string inside another, counting from the start or the end of the string. Easy Data Transform is written in C++ and uses the Qt QString class for strings. There are indexOf() and lastIndexOf() methods for QString, so I thought this would be an easy job to wrap that functionality. Maybe 15 minutes to program it, write a test case and document it.
Obviously it wasn’t that easy, otherwise I couldn’t be writing this blog post.
First of all, what is the index of “a” in “abc”? 0, obviously. QString( “abc” ).indexOf( “a” ) returns 0. Duh. Well only if you are a (non-Fortran) programmer. Ask a non-programmer (such as my wife) and they will say: 1, obviously. It is the first character. Duh. Excel FIND( “a”, “abc” ) returns 1.
Ok, most of my customers, aren’t programmers. I can use 1 based indexing.
But then things get more tricky.
What is the index of an empty string in “abc”? 1 maybe, using 1-based indexing or maybe empty is not a valid value to pass.
What is the index of an empty string in an empty string? Hmm. I guess the empty string does contain an empty string, but at what index? 1 maybe, using 1-based indexing, except there isn’t a first position in the string. Again, maybe empty is not a valid value to pass.
I looked at the Qt C++ QString, Javascript string and Excel FIND() function for answers. But they each give different answers and some of them aren’t even internally consistent. This is a simple comparison of the first index or last index of text v1 in text v2 in each (Excel doesn’t have an equivalent of lastIndexOf() that I am aware of):
Changing these to make the all the valid results 1-based and setting invalid results to -1, for easy comparison:
So:
Javascript disagrees with C++ QString and Excel on whether the first index of an empty string in an empty string is valid.
Javascript disagrees with C++ QString on whether the last index of an empty string in a non-empty string is the index of the last character or 1 after the last character.
C++ QString thinks the first index of an empty string in an empty string is the first character, but the last index of an empty string in an empty string is invalid.
It seems surprisingly difficult to come up with something intuitive and consistent! I think I am probably going to return an error message if either or both values are empty. This seems to me to be the only unambiguous and consistent approach.
I could return a 0 for a non-match or when one or both values are empty, but I think it is important to return different results in these 2 different cases. Also, not found and invalid feel qualitatively different to a calculated index to me, so shouldn’t be just another number. What do you think?
*** Update 14-Dec-2023 ***
I’ve been around the houses a bit more following feedback on this blog, the Easy Data Transform forum and hacker news and this what I have decided:
IndexOf() v1 in v2:
v1
v2
IndexOf(v1,v2)
1
aba
aba
1
a
a
1
a
aba
1
x
y
world
hello world
7
This is the same as Excel FIND() and differs from Javascript indexOf() (ignoring the difference in 0 or 1 based indexing) only for “”.indexOf(“”) which returns -1 in Javascript.
LastIndexOf() v1 in v2:
v1
v2
LastIndexOf(v1,v2)
1
aba
aba
4
a
a
1
a
aba
3
x
y
world
hello world
7
This differs from Javascript lastIndexOf() (ignoring difference in 0 or 1 based indexing) only for “”.indexOf(“”) which returns -1 in Javascript.
Conceptually the index is the 1-based index of the first (IndexOf) or last (LastIndexOf) position where, if the V1 is removed from the found position, it would have to be re-inserted in order to revert to V2. Thanks to layer8 on Hacker News for clarifying this.
Javascript and C++ QString return an integer and both use -1 as a placeholder value. But Easy Data Transform is returning a string (that can be interpreted as a number, depending on the transform) so we aren’t bound to using a numeric value. So I have left it blank where there is no valid result.
Now I’ve spent enough time down this rabbit hole and need to get on with something else! If you don’t like it you can always add an If with Calculate or use a Javascript transform to get the result you prefer.
*** Update 15-Dec-2023 ***
Quite a bit of debate on this topic on Hacker News.
Summerfest 2023 is on. Loads of quality software for Mac and Windows from independent vendors, at a discount. This includes my own Easy Data Transform and Hyper Plan, which are on sale with a 25% discount.
This is a guest post from fellow software developer, Simon Kravis.
It’s sometimes said that software development is only 10% of what’s required to earn money from software and I can attest to that. Since 2018 I have been developing photo captioning and related software, more as a retirement diversion than a serious source of income (after a career mostly involved in writing scientific and engineering analysis software), in the hope that sales income would at least cover running costs. My best marketing tool has been writing reviews of the class of software that I produce, and the hosting site (Hub Pages) provides some useful analytics on how often these are accessed and for how long. Below is the graph for an article on tagging.
The decline since early 2022 is hard to explain – the article is periodically updated so the steady decline is not due to diminishing ‘freshness’ – which for Google is probably a file Modified date.
Here is another review article profile (Scanning Multiple Photos) showing a similar decline:
But another (Best Photo Captioning Software) has held up, though at a low level.
I offer digital photo captioning software (Caption Pro) on Windows and Mac platforms, and an iPhone captioning app (CaptionEdit), with the Windows version dating back to 2017. I also offer part of the functionality of Caption Pro on Windows for auto-cropping scans of multiple paper photos (ImageSplit). On Windows neither Caption Pro software downloads or sales seem to correlate with review accesses, despite about 1/3 of web site accesses coming from the review. However, downloads do show some correlation with Caption Pro web site sessions, as shown in the graph below.
Sales do not correlate with downloads, which perhaps explains why most advertising for niche products is not successful – it may increase downloads but this does not appear to increase sales. The observed proportion of downloads resulting in sales for ImageSplit and Caption Pro are 6% and 9% respectively. The lack of correlation between sales and downloads may be due to the small number of sales per month, which results in random fluctuation dominating the results.
The decision to enter the Apple “Walled Garden” of software was partly at the prompting of friends rather than a commercial evaluation. Apple Developer membership (costing ~US$100 per year) is required to prevent software being blocked from installation through being from an unknown publisher. Further costs were purchasing a fairly modern Mac on which to perform development, as the App Store will only accept software developed using recent versions of the Xcode development environment, which will only run on fairly recent hardware. The App Store takes a commission of 15% on sales, which is quite reasonable when compared to the difficulty of implementing e-commerce on Windows, where a PayPal account eases the problem of low-value foreign-currency transactions, but e-commerce plug-ins may stop working after years for no discernible reason. The review process for software acceptance into the App Store is generally fast, but seemly trivial issues can require resubmission. Features which have passed one review may be rejected in a later one. The review process is generally fast, but on one occasion took 4 weeks.
Caption Pro for Mac has been available (via the App Store) only since Sep 2021.It appears within the top 6 results for a search using “Caption Photos”, which is the source for most downloads. About 3.5% of downloads result in sales. This figure is much less than the Windows version of the same app, despite Mac users’ reputation for being more willing to pay for software. The iPhone app did not appear at all initially when searching for “Caption Photos” in the App Store. After 6 months it began appearing as result number 140, after it had 360 downloads. This poor ranking performance is probably because “Caption Photos” is a very popular keyword used by many apps, including those that only caption videos. It has had very few downloads and sales, despite Apple Search Ads and Apptimizer campaigns. The number of downloads increased dramatically during the Apptimizer campaign between Jan 24 and Feb 2 (as they were purchased) but the change in ranking from these downloads did not result in any sales, perhaps because no installs were purchased. The Apple search ads campaign (which resulted in the app being shown as an ad 1 in 50 times when the search phrase “Caption Photos” was used) did not greatly affect downloads or sales. A Facebook ad campaign to show a link to the app whenever “Genealogy” or “Genealogy Software” was searched for was also unsuccessful, and very expensive, as Facebook charges by impressions rather than clicks. Additional backlinks to the web site were purchased in September 2022 from Links Management in an attempt to improve the web site Google ranking, but this did not appear to have any effect on web traffic.
Mac and Windows users contacting me with problems have had a wide range of experience level – from completely naïve to former programmers. Most have been from the US, which reflects the geographic distribution of sales. There have many downloads to non-English speaking countries but very few sales.
Some results from the Mac and iPhone Apps are shown below:
On balance, developing for Apple platforms was not a good commercial decision, as the advantages of a mostly captive audience (completely captive in the case of the iPhone) do not seem to result in higher rates of downloads or sales. Competition for iPhone apps is so intense that niche products without massive advertising budgets are unlikely to succeed. The same is likely to apply to Android phone apps, which anecdotally have a less rigorous review process. My experience is that advertising and backlink purchase for any platform are not effective in increasing sales for niche software.
Simon Kravis runs Aleka Consulting, a small software and consultancy company in Canberra, Australia specializing in information management and offering a number of software products. He has mainly developed scientific and engineering programs, starting in the era of paper tape.
I have been using altool to notarize my Mac apps for some years. However Apple, being Apple, have deprecated altool in favour of the new notarytool. altool will stop working at some point in 2023. And Apple, being Apple, have made little attempt to keep consistency between the two.
I didn’t find anything online to tell me how arguments between the two tools related. Consequently I spent a while trying to guess which arguments mapped to which. I got locked out for a while for trying to wrong combination too many times. In the end I went from this:
On the plus side the --wait option doesn’t exit until the notarization is complete, which means you can easily do you whole build, sign and notarize process in a single script. Hoorah.
Note that you still need to run the ‘stapling’ step after notarization:
Back in 2011 I created eventcountdown.com. It had a snazzy downloadable, PerfectTablePlan-branded countdown clock for Windows and web-based countdown clock with ads for PerfectTablePlan. Both free. The idea was people searching for countdown clocks for events (such as their wedding) would find the site via Google, find out about PerfectTablePlan and a certain percentage would then buy my event seating planner software.
I paid other people to create the Windows and web versions of the countdown clock. The web-based clock was updated from time to time to add pre-built countdowns for events like superbowl, the olympics, christmas, thanksgiving etc. And I fielded the occasional support emails related to the Windows countdown clock.
This is the total traffic to the site from 2011 to 2023:
The peaks are mostly due to superbowl. The site got 38k hits in a single day just before superbowl 2019! The free Windows countdown clock also drew quite a lot of traffic. In total the site got some 1.7 million page views over 12 years. Only a small percentage of these visitors clicked through to PerfectTablePlan.com, but still a useful number. Perhaps some people were also prompted to investigate PerfectTablePlan by the branding on the downloadable clock. The site might have also had some SEO benefits for PerfectTablePlan.com. Who knows.
The eventcountdown.com website is now gone (the domain redirects to PerfectTablePlan.com). It didn’t seem worth the effort to keep adding events to the web countdown clock with the traffic now so low. Also both the website and windows clock were looking dated. But I think it was a worthwhile investment of my time and money.
I have also created various other contents pages and mini-sites over the years: articles on table planning, font collections, free clipart, place card templates etc. You can see similar trajectories for some of those.
The traffic seems to reach a peak after 3-7 years and then slowly decay away. Although I have shown them with the same vertical scale here, some generated a lot more traffic than others.
I did some basic on-page SEO for these content pages. For example, looking at Adwords keyword data to choose the page title and H1. But nothing beyond that. No paid promotion or backlink building campaigns.
I tried paying people to write articles related to events. But none of these ever generated any worthwhile traffic. Google could somehow smell the insincerity.
For my data cleaning software product I have been concentrating on ‘how to’ pages and supporting videos aimed at specific topics. These are intended to both help existing customers and to attract new traffic. For example, how to clean data. I have also been posting these videos on the Easy Data Transform YouTube channel. The numbers of hits monthly on the Youtube videos are relatively low, but they are quite targeted and hopefully will be generating traffic for years to come.
So content marketing take-aways based on my experience are:
Free content can be a useful way to bring free traffic to your website.
The amount of traffic you get is quite hit and miss. Some content has generated a lot more traffic than expected, some a lot less.
The content needs to be well targeted if you want to have any chance of converting it to sales.
Google will grow bored of it eventually. You might be able to increase the longevity by updating the content. I’ve not been very diligent with this, but even neglected content pages can generate useful traffic over 10+ year lifespan.
The authenticode digital certificate I bought back in 2019 expired recently, so I had to get a new certificate (you can’t renew a certificate, as such, you just need to buy a new one). A few months before the expiry I emailed KSoftware.net, who I had bought previous digital certificates from and with whom I had always had a good experience in the past. No reply. I tried a couple more times, including the personal email of Mitchell, the founder. Nothing. Someone else told me they had had similar experiences. Their recent trustpilot ratings are a horror show. And the copyright date on their website is ‘2003 – 2021’. But they were still advertising on Google Adwords. I have no idea what is happening here. If you are reading this Mitchell, I hope you are ok.
With KSoftware out of the picture I looked elsewhere. Eventually I ended up buying a new Sectigo certificate from signmycode.com. I partly chose them because they offered a 5 year certficate and the less often I have to go through the ball ache of a getting a new certificate, the better. The experience was decidely mixed.
The good:
The prices seem reasonable, compared to other options.
Support was responsive. English didn’t appear to be their first language, but it was good enough.
I got my new certificate within a few days and have had no issues with it so far. The change in certificate seems to be set off a few customer’s anti-virus software, but that was to be expected.
The mediocre:
The online guidance and documentation on the process was mediocre, at best.
I was a bit confused about whether I had to click ‘Buy now’ or ‘Renew now’. It seems this is more marketing/SEO purposes and it doesn’t matter which you click.
I had to send them a photo of me holding a government ID. This felt pretty uncomfortable, but might be something mandated by the certificate companies.
The bad:
After I got my certificate I checked the expiry date and it was only 3 years. When I queried this I was rold that the ‘5 year certificate’ I thought I had bought is not a 5 year certificate. It is a 3 year certificate, then I have to apply for a new pre-paid 2 year certificate in 3 years time.
This is what you see when you click on ‘Buy now’:
When you see this, wouldn’t you expect to get a single 5 year certificate? If there was anything explaining that this was 2 separate certificates, I didn’t notice it. It certainly didn’t mention it on their home page. This feels deceptive to me.
Who knows if this company will still be there in 3 years time? I emailed them and told them I wanted to keep the new 3 year certificate and for them to refund the 2 year certificate. They said they would only refund the entire order and then I would have to start the whole process all over again. They also claimed:
“renewal validation is much more easy then buying a new certificate as most of the validation part is getting carry forward.”
You need your hosting company, payment processor and other critical suppliers to be rock solid. Think twice about going with a supplier just because they are cheap. Changing suppliers can be a pain, so ask around before trying a supplier.
Use version control for everything important
It matters less which version control system it is. Periodically making a copy of your source folder is not a version control system!
Don’t promise ship dates
Developers are notoriously bad at predicting dates. If you promise a date and get it wrong (and you will) then you either have to miss the date or cut corners. Neither is good.
Never send an email you might later regret
If you are starting to feel angry writing an email, then stop writing. Come back to it later. Or maybe write it, feel a bit better, then delete it without sending.
Write documentation as you go
Few people enjoy writing documentation. But if you leave all the documentation until you have finished programming, then you are likely to rush it and forget stuff.
Have a checklist
Automate where you can. Have checklists for everything else. Keep updating your checklists.
Get someone else to proof read everything
Typos are embarrassing, but it is impossible to proof read your own stuff. So get someone else to proof read any stuff that customers see: web pages, newsletters, documentation etc.
Never release changes just before going on holiday
You don’t want to have to be fire-fighting a new bug when you should be on the beach with your family/friends.
Don’t try to do everything yourself
You could spend weeks learning about taxes, web hosting, CSS or any number of other topics that aren’t central to your business. But why bother? Pay someone who already know this stuff.
Embrace imperfection
If you wait for perfection, then you are never going to ship anything. Just make sure each release is better than the last. Good enough is good enough.
I have been gradually improving my data wrangling tool, Easy Data Transform, putting out 70 public releases since 2019. While the product’s emphasis is on ease of use, rather than pure performance, I have been trying to make it fast as well, so it can cope with the multi-million row datasets customers like to throw at it. To see how I was doing, I did a simple benchmark of the most recent version of Easy Data Transform (v1.37.0) against several other desktop data wrangling tools. The benchmark did a read, sort, join and write of a 1 million row CSV file. I did the benchmarking on my Windows development PC and my Mac M1 laptop.
Here is an overview of the results:
Time by task (seconds), on Windows without Power Query (smaller is better):
I have left Excel Power Query off this graph, as it is so slow you can hardly see the other bars when it is included!
Time by task (seconds) on Mac (smaller is better):
Memory usage (MB), Windows vs Mac (smaller is better):
So Easy Data Transform is nearly as fast as it’s nearest competitor, Knime, on Windows and a fair bit faster on an M1 Mac. It is also uses a lot less memory than Knime. However we have got some way to go to catch up with the Pandas library for Python and the data.table package for R, when it comes to raw performance. Hopefully I can get nearer to their performance in time. I was forbidden from including benchmarks for Tableau Prep and Alteryx by their licensing terms, which seems unnecessarily restrictive.
Looking at just the Easy Data Transform results, it is interesting to notice that a newish Macbook Air M1 laptop is significantly faster than a desktop AMD Ryzen 7 desktop PC from a few years ago.