Category Archives: C++

Bloviate

I wondered what it would look like if you took a body of text and then used it to generate new text, using Markov chains of different lengths. So I knocked up  quick program to try it.  ‘Bloviate’.

bloviate

Bloviate analyses your source text to find every sequence of N characters and then works out the frequency of characters that come next.

For example, if you set N=3 and your source text contains the following character sequences staring with ‘the’:

‘the ‘, ‘then’, ‘they’, ‘the ‘

Then ‘the’ should be followed 50% of the time by a space, 25% of the time by an ‘n’ and 25% of the time by a ‘y’.

Bloviate then creates output text, starting with the first N characters of the source text and filling in the rest randomly using the same sequence frequencies as the source text.

Note that a character is a character to Bloviate. It treats upper and lower case as different characters, makes no attempt to differentiate between letters, punctuation and white space and does not attempt to clean up the source text. Which also means it works on any language.

Bloviate also tells you the average number of different characters following each unique sequence of N, which I will call F here. As F approaches 1.0 the output text becomes closer and closer to the input text.

Using ‘Goldilocks and the 3 bears’ as input:

If N=1 (F=7.05) the output is garbage. Albeit garbage with the same character pair frequency as the original.

On cre She sl s ramy raked cheais Bus ore than s sherd up m. ged. bend staireomest p!”Sof ckstirigrorr a ry ps.

” f waine tind s aso Sowa t antthee aime bupis stht stooomed pie k is beche p!

At N=3 (F=1.44) it looks close to English, but jibberish:

Once up and been sight,” she this timed. Pretty so soon, she second soft. She screame up and she screame hot!” cried the Mama bed the Papa been sleeping in the Papa bear

“Someone’s bear growl.

At N=5 (F=1.14) it starts to look like proper English, but semantically weird:

Once upon a time, so she went for a walked right,” she lay down into the kitchen, Goldilocks sat in the porridge from the three chair,” growled, “Someone’s been sitting my porridge and she tasted the door, and ran down the bedroom. Goldilocks woke up and she second bowl.

And it comes out with occasional gems such as:

“Someone’s been sitting my porridge,” said the bedroom.

At N=10 (F=1.03) it starts to become reasonably coherent:

Once upon a time, there was a little tired. So, she walked into the forest. Pretty soon, she came upon a house. She knocked and, when no one answered, she walked right in.

At the table in the kitchen, there were three bowls of porridge.

At N=15 (F=1.01) it starts to get pretty close to the original text, but doesn’t follow quite the same order:

Once upon a time, there was a little girl named Goldilocks. She went for a walk in the forest. Pretty soon, she came upon a house. She knocked and, when no one answered, she walked right in.

At the table in the kitchen, there were three bowls of porridge. Goldilocks was very tired by this time, so she went upstairs to the bedroom. She lay down in the first bed, but it was too hard. Then she lay down in the third bed and it was just right. Goldilocks fell asleep.

At N=12 (F=1.07) the whole 680k characters of ‘Pride and prejudice’ produces:

It is a truth universally contradict it. Besides, there was a motive within her of goodwill which could not help saying:

“Oh, that my dear mother had more command over herself! She can have her own way.”

As she spoke she observed him looking at her earnest desire for their folly or their vice. He was fond of them.”

Obviously the source text is important. The Bohemian Rhapsody lyrics make nearly as much (or as little sense) at N=5 (F=1.08) as the original:

Is this to me, for me, to me

Mama, just a poor boy from this to me

Any way the truth

Mama, life? Is this time tomorrow

Carry on as if nothing all behind and face the truth

Mama, ooh, didn’t mean to me, baby!

Just gotta leave me and lightning, very fright out, just killed a man

Put a gun against his head

Pulled my time to die?

At N=12 (F=1.05) 160k characters of Trump election speeches produces:

Hillary brought death and disaster to Iraq, Syria and Libya, she empowered Iran, and she unleashed ISIS. Now she wants to raise your taxes very substantially. Highest taxed nation in the world is a tenant of mine in Manhattan, so many great people. These are people that have been stolen, stolen by either very stupid politicians ask me the question, how are you going to get rid of all the emails?” “Yes, ma’am, they’re gonna stay in this country blind. My contract with the American voter begins with a plan to end government that will not protect its people is a government corruption at the State Department of Justice is trying as hard as they can to protect religious liberty;

Supply your own joke.

I knocked together Bloviate in C++/Qt in a couple of hours, so it is far from commercial quality. But it is fairly robust, runs on Windows and Mac and can rewrite the whole of ‘Pride and prejudice’ in a few seconds. The core of Bloviate is just a map of the frequency of characters mapped to the character sequence they follow:

QMap< QString, QMap< QChar, int > >

You can get the Windows binaries here (~8MB, should work from Windows 7 onwards).

You can get the Mac binaries here (~11MB, should work from macOS 10.12 onwards).

Note that the Bloviate executable is tiny compared to the Qt library files. I could have tried to reduce the size of the downloads, but I didn’t.

To use Bloviate just:

  1. paste your source text in the left pane
  2. set the sequence length
  3. press the ‘Go >’ button

I included some source text files in the downloads.

You can get the source for Bloviate here (~1MB).

It should build on Qt 4 or 5 and is licensed as creative commons. If you modify it, just give me an attribution and send me a link to anything interesting you come up with.

Getting Qt 5.9 working on Windows (eventually)

I have had Qt 5.5 and 5.6 installed on my development machines for some time. Now that I have purchased a new Mac development box (an iMac with a lickably beautiful 27″ screen) I thought it was a good time to update to a more recent version of Qt. I went for Qt 5.9, rather than Qt 5.10, as 5.9 has been designated as an LTS (long term support) release. Upgrading turned into a real chore. I am quickly writing it up here in the hope that it helps someone else, and as a reminder to myself a few years down the line.

I like to build Qt from source. Because then I know it was built using the same compiler, headers, SDK etc as I am using to build my product. And I have more control over how Qt is configured. Also I can patch the source and rebuild it, if I need to. But I have had problems building Qt on Mac before. So I decided to install the pre-built binaries on my new Mac. I installed the latest version of XCode and then the Q5.9.4 binaries. This was a couple of big downloads, but it all went pretty smoothly.

I successfully built Qt 5.5 from source on my Windows machine previously, so I decided to try that for Qt 5.9. I have Visual Studio 2010 installed. This isn’t supported for Qt 5.9.4, so I downloaded Visual Studio 2017. I unzipped the Qt source into C:\Qt\5.9.4, ran ‘x86 native tools command prompt for VS 2017’, made sure Python and Perl were in the path and then:

cd C:\Qt\5.9.4

set QTDIR=C:\Qt\5.9.4\qtbase

set PATH=%QTDIR%\bin;%PATH%

configure -opensource -confirm-license -opengl desktop -nomake tests -nomake examples -no-plugin-manifests -debug-and-release -platform win32-msvc -verbose

nmake

Note that you are told by the nmake script to do nmake install at the end of this. But it tells you somewhere in the Qt Windows documentation not to do this, unless you have set the prefix argument (confusing, I know)

The build failed part way through making qtwebengine. Something to do with a path being too long for Perl or Python (I forget). It seems to be a known problem. Odd as the root path was just C:\Qt\5.9.4. I don’t need qtwebengine at present, so I deleted everything and tried again with -skip qtwebengine:

configure -opensource -confirm-license -opengl desktop -skip qtwebengine -nomake tests -nomake examples -no-plugin-manifests -debug-and-release -platform win32-msvc -verbose

nmake

It seemed to complete ok this time. But using this version of Qt to build Hyper Plan I got an error:

Unknown module(s) in QT:svg

On further examination the SVG DLL  had been built, but hadn’t been copied to the C:\Qt\5.9.4\qtbase\bin folder. Similarly for a lot of the other Qt DLLs. I couldn’t find any obvious reason for this looking through logs, Stackoverflow and Googling. I could possibly do without the SVG functionality, but I wasn’t sure what else was broken. So I decided to give up on bulding from source on Windows as well.

I download the Qt 5.9.4 binaries for Visual Studio 2017. This seemed to go ok, but then I discovered that I could only build a 64-bit application from these. No 32-bit version was available for Visual Studio 2017. Many of my customers are still on 32 bit versions of Windows. So I need to be able to ship my product as a 32 bit executable + DLLs[1].

So I uninstalled Visual Studio 2017 and installed Visual Studio 2015. I then got an error message about Visual Studio 2017 redistributables that I hadn’t uninstalled. So I had to uninstall those and run a repair install on Visual Studio 2015. That seemed to work ok. So then I download the 32-bit Qt 5.9.4 binaries for Visual Studio 2015. I had to download these into a different top level folder (C:\Qtb), so as not to risk wiping existing Qt installs that I had previously managed to build from source.

Eventually I managed to build Hyper Plan and PerfectTablePlan on Mac and Windows. What a palaver though! Qt is an amazing framework and I am very grateful for everyone who works on it. But I wish they would make it a bit easier to install and upgrade! Has anyone actually managed to get Qt 5.9 built from source on Windows?

[1] I don’t bother shipping a 64-bit executable on Windows as the 32-bit executable works fine on 64-bit versions of Windows (my software doesn’t require excessive amounts of memory). I only ship a 64-bit executable on macOS as almost no-one uses 32-bit versions of macOS now.

How much code can a coder code?

Lines of code (LOC) is a simple way to measure programmer productivity. Admittedly it is a flawed metric. As Bill Gates famously said “Measuring programming progress by lines of code is like measuring aircraft building progress by weight”. But it is at least easy to measure.

So how much code do programmers average per day?

  • Fred Brooks claimed in ‘The Mythical Man-Month’ that programmers working on the OS/360 operating system averaged around 10 LOC per day.
  • Capers Jones measured productivity of around 16 to 38 LOC per day across a range of projects.
  • McConnell measured productivity of 20 to 125 LOC per day for small projects (10,000 LOC) through to 1.5 to 25 LOC per day for large projects (10,000,000 LOC).

It doesn’t sound a lot, does it? I’m sure I’ve written hundreds of lines of code on some days. I wondered how my productivity compared. So I did some digging through my own code. In the last 12 years I have written somewhere between 90,000 and 150,000 C++ LOC (depending on how you measure LOC) for my products: PerfectTablePlan, Hyper Plan and Keyword Funnel. This is an average of round 50 lines of code per working day. Not so different from the data above.

I also looked at how PerfectTablePlan LOC increased over time. I was expecting it to show a marked downward trend in productivity as the code base got bigger, as predicted by McConnell’s data. I was surprised to see that this was not the case, with LOC per day remaining pretty constant as the code base increased in size from 25k to 125k.

loc

Some qualifications:

  • I give a range for LOC because ‘line of code’ isn’t very well defined. Do you count only executable statements, or any lines of source that aren’t blank?
  • My data is based on the current sizes of the code bases. It doesn’t include all the code I have written and then deleted in the last 12 years. I have just spent several months rewriting large parts of PerfectTablePlan to work with the latest version of Qt, which involved deleting large swathes of code.
  • It doesn’t include automatically generated code (e.g. code generated by the Qt framework for user interfaces and signals/slots code).
  • I only counted code in products shipped to the user. I didn’t count code I wrote for licence generation, testing etc.
  • All the code is cross-platform for Windows and Mac, which makes it more time consuming to write, test and document.
  • Programming is only one of the things I do. As a one-man-band I also do marketing, sales, QA, support, documentation, newsletters, admin and nearly everything else. I have also done some consulting and training not directly related to my products in the last 12 years.

Given that I probably spend less than half my time developing (I have never measured the ratio), my productivity seems fairly good. I think a lot of this may be the fact that, as a solo developer, I don’t have to waste time with meetings, corporate bullshit and trying to understand other people’s code. Also writing desktop Windows/Mac applications in C++ is a lot easier than writing an new operating system with 1970s tools.

50 lines of code per day doesn’t sound like a lot. But the evidence is that you are unlikely to do much better on a substantial project over an extended period. Bear that in mind next time you estimate how long something is going to take.

If you have data on LOC per day for sizeable projects worked on over an extended period, please post it in the comments. It will only take you a few minutes to get the stats using Source Monitor (which is free).

 

 

 

 

 

 

Pretty printing C++ with Clang-Format

I use some of the code generation and refactoring tools in QtCreator. These save a lot of time, but they don’t format C++ code how I like it. For example they produce C++ code like this:

void MyClass::foo(int *x)

But I like my code formatted like this:

void MyClass::foo( int* x )

The differences may seem minor, but they are a source of significant irritation to me. I like my code how I like it, goddammit! And consistent formatting enhances readability. However re-formatting it by hand is time-consuming and tedious.

What I need is a tool that can enforce consistent formatting in the style that I like, or something close. I have tried to use automatic C++ formatting (pretty printing) tools in the past, but I couldn’t get them to produce a format that was close enough to what I wanted. But I have finally found the tool for the job. Clang-Format.

Clang-Format is part of the LLVM family of tools. It is a free, command-line tool that reformats C++, Objective-C or C according to the settings in a config file. As with many free tools, it isn’t terribly well documented. Some of the documentation on the web is out of date and some of it is incomplete. But I have managed to find out enough to configure it how I like it.

To run it you just need to place your options in a .clang-format file, make sure the clang-format executable is in the path and then run it:

clang-format.exe -i -style=file <C++ file>

Here are the settings I am currently using in my .clang-format file:

Language: Cpp
AccessModifierOffset: -4
AlignAfterOpenBracket: false
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlinesLeft: false
AlignOperands: true
AlignTrailingComments: false
AllowAllParametersOfDeclarationOnNextLine: false
AllowShortBlocksOnASingleLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: Inline
AllowShortIfStatementsOnASingleLine: false
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: false
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
  AfterClass:      true
  AfterControlStatement: true
  AfterEnum:       true
  AfterFunction:   true
  AfterNamespace:  true
  AfterObjCDeclaration: true
  AfterStruct:     true
  AfterUnion:      false
  BeforeCatch:     true
  BeforeElse:      true
  IndentBraces:    false
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Allman
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
ColumnLimit: 0
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 0
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
ForEachMacros: [ foreach, Q_FOREACH, BOOST_FOREACH ]
IndentCaseLabels: true
IndentWidth: 4
IndentWrappedFunctionNames: false
KeepEmptyLinesAtTheStartOfBlocks: true
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 2
NamespaceIndentation: None
PenaltyBreakBeforeFirstCallParameter: 100
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyExcessCharacter: 10000
PointerAlignment: Left
ReflowComments: true
SortIncludes: false
SpaceAfterCStyleCast: false
SpaceBeforeAssignmentOperators: true
SpaceBeforeParens: ControlStatements
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: true
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: true
SpacesInParentheses: true
SpacesInSquareBrackets: true
Standard: Cpp11
TabWidth: 4
UseTab: Never

It took me a few hours of fiddling with the settings to find the best combination. It would be really useful if someone could write a tool that would analyze your C++ code and create a .clang-format file for you. You would probably only want to do this once though, so I don’t think it has much potential as a commercial product.

There are only two things I couldn’t get quite right in the formatting:

  1. I couldn’t get it to add a blank line after public, protected and private declarations. I fixed this with a quick Perl hack (see below).
  2. I couldn’t get it to indent continuation lines how I would like (ideally indented 1 or 2 spaces from the first line). It is a small price to pay and I am just putting up with it for now.

Perhaps there are options to do these and I just didn’t find them.

Here is the Windows .bat script I used to format all the C++ files in a folder.

for %%f in (*.h *.cpp *.inl) do (
clang-format.exe -i -style=file %%f
)

for %%f in (*.h) do (
clang-format.exe -i -style=file %%f
perl -p -i.bak -e "s/public:/public:\n/g" %%f
perl -p -i.bak -e "s/protected:/protected:\n/g" %%f
perl -p -i.bak -e "s/private:/private:\n/g" %%f
perl -p -i.bak -e "s/    Q_OBJECT/Q_OBJECT/g" %%f
)

del *.bak
del *.tmp

No doubt there is a more elegant way to do the Perl, but it works.

I now just run this batch periodically to keep my code beautiful and consistent.

A code pretty printer product idea

I use QtCreator for my C++ development IDE. It is very good. But it doesn’t always lay out my C++ code the way I want it to. In particular the refactoring operations mess up my white space. I want my code to look like:

void foo( int* x, int& y )

But it keeps changing it to:

void foo(int *x, int &y)

Grrrrrr.

So I am constantly battling with QtCreator to layout my code how I like it. There are plenty of C++ pretty printers around. I played with some of the leading ones using UniversalIndentGUI, but I couldn’t get any of them to layout my code quite how I wanted. Maybe they could have done it, but I got fed up with fiddling with the settings.

What I need is a code pretty printer that I can configure to layout my code exactly how I want without me having to tweak 100 settings.

Ideally I want a code pretty printer that I can train. So I just point it at a few files of my code that are laid out how I want and it works out all my preferences (where to put braces, how to indent case statements, where to put the * for a pointer declaration etc) and can then apply them to any other code. It wouldn’t need a GUI. Or perhaps just enough to select the training files and then preview the results on other files.

I have no idea if this is a viable product idea. But I would pay $100 for it, if it worked well. Perhaps bigger software companies would pay a lot more? Or maybe something like this already exists and I just don’t know about it?

Cppcheck – A free static analyser for C and C++

I got a tip from Anna-Jayne Metcalfe of C++ and QA specialists Riverblade to check out Cppcheck, a free static analyser for C and C++. I ran >100 kLOC of PerfectTablePlan C++ through it and it picked up a few issues, including:

  • variables uninitialised in constructors
  • classes passed by value, rather than as a const reference
  • variables whose scopes could be reduced
  • methods that could be made const

It only took me a few minutes from downloading to getting results. And the results are a lot less noisy than lint. I’m impressed. PerfectTablePlan is heavily tested and I don’t think any of the issues found are the cause of bugs in PerfectTablePlan, but it shows the potential of the tool.

The documentation is here. But, on Windows, you just need to start the Cppcheck GUI (in C:\Program files\Cppcheck, they appear to be too modest to add a shortcut to your desktop), select Check>Directory… and browse to the source directory you want to check. Any issues found will then be displayed.

You can also set an editor to integrate with, in Edit>Preferences>Applications. Double clicking on an issue will then display the appropriate line in your editor of choice.

Cppdepend is available with a GUI on Windows and as a command line tool on a range of platforms. There is also an Eclipse plugin. See the sourceforge page for details on platforms and IDEs supported. You can even write your own Cppcheck rules.

Cppcheck could be a very valuable additional layer in my defence in depth approach to QA. I have added it to my checklist of things to do before each new release.

Programming skills wanted

I am looking to outsource some self-contained programming tasks in areas that I don’t have expertise in. I am hoping that someone reading this blog might be able to help (or know someone that can) so I don’t have to go through outsourcing sites. These are the two skills sets I am currently looking for:

  1. Javascript/CSS/HTML – To write a single page web app. This will have a relatively simple UI displaying data read from XML. The app will need to work on a wide range of browsers and devices. Ideally you should also have some web design skills, but this isn’t essential.
  2. C++/Qt 4/OpenGL – To write a relatively simple 3D visualization model that runs on Windows and Mac. This will involve populating a 3D space with specified shapes and allowing simple movement around it.

Details:

  • I am expecting that I will need 2 different people, but it is possible there might be someone out there with experience in both.
  • These are small projects (probably less than 2 weeks for task 1 and less than 1 week for task 2). But they might lead on to more work in future.
  • Time scales are reasonably relaxed. Ideally I would like the work to be finished by the end of September.
  • You can be based anywhere in the world, but must be able to communicate in English (written and spoken).
  • Full copyright to the work will pass to my company on full payment.
  • Obviously cost is an issue. If I have 2 promising candidates, I am likely to pick the cheaper one.

If you are interested in doing either of these tasks please email me ( andy at oryxdigital.com ) before the end of Friday 26th August with subject “programming work” and a brief outline of:

  • Which of the 2 tasks you are interested in.
  • Your relevant experience. Ideally including details of related projects completed.
  • Your daily rate in Pounds Sterlings or US dollars.

I will send detailed specs to a shortlist of the best candidates. The work will be awarded on the basis of fixed price bids against the spec. Please don’t apply unless you have relevant experience – if I wanted a programmer without experience in these areas I could do it myself. ;0)