Nibble Stew: July 2018

Thursday, July 26, 2018

Building native multiplatform GUI apps with Meson

A recent trend in multiplatform GUI applications is to create the core business logic of the application in something like C++, have it (optionally) expose a plain C interface and then create a gui on top of that using the native widget set of each supported platform. This means that the application uses GTK on Linux and other unixes, Cocoa on macOS, the Win32 API on Windows, Java widgets on Android and so on. This makes the application fully native on all platforms. The tradeoff is that you have to write the gui multiple times, but in exchange you do not have to wrangle a multiplatform widget toolkit as a dependency.

Regardless of how you build your guis you need to have a build system that can build the application under all these different environments from a single code base. To this end I created a sample application called Platypus, which can be downloaded from this GitHub repo.

The code and compilation

The application itself is extremely simple. It consists of one shared library that returns a random number between 0 and 100 when called. It is implemented using C++11's random number generator functionality to ensure each platform has a toolchain new enough to handle it. The GUI applications built on top of it have a text label and a button. Pressing the button updates the text label with a new random number. There is also a test program that verifies that the library is working.

The GTK version is a plain C application. The gui is defined using a Glade interface definition file rather than building it by hand.

The macOS version has a GUI written in Objective-C. The gui is defined as a XIB file created with Xcode. It is built into a standard app bundle.

The Windows application is written in C++ (though it does not really use any C++ features) and has a gui laid out by hand.

All these guis have full platform integration with icons, an Info.plist, .desktop files and so on.

The installers

The GTK version can be built as a Flatpak in the usual way. The build manifest can be found in the repository's root.

The macOS version builds a standard .dmg installer that can be directly shipped to end users.

The Windows version builds an .MSI installer providing complete install/uninstall integration.

How complicated is it?

The entire build definition consists of 107 lines of Meson.

Screenshots

Here is the plain GTK version running as a Flatpak application on Kubuntu. Window icons and desktop integration work as you would expect.

Here is the macOS version showing the drive image, the installer window and the application running with proper platform integration.

Finally here is the Windows application showing the installed path location under Program Files, the application itself and the automatic integration to Windows' application uninstaller system.

Future plans

It would be cool to add an Android application as well as an iOS application written in Swift in the code base. Patches are welcome as always.

Wednesday, July 25, 2018

Why Git is terrible in four pictures

I was asked to write a blog post on why I dislike Git in general and its UI in particular. Here is a representative sample in four images.

Recently a pull request was filed that looked like this:

As you can see there is an extra merge commit. As is customary we wanted to get rid of it to get a clean rebase-based merge history. To do that you'd first get a checkout of the code and look at the log, which looks like the following.

So far, so good. Now let's do a rebase --interactive. It looks like this:

Suddenly Git has chosen to silently remove the merge commit from this list. Why? I have no idea. The commit had changes in it, so it was not pruned because it was empty. If you then exit the editor without any changes (which usually means "do not change anything") then the commit is deleted and any changes that were in it are gone:

If your later commits built on those changes, you get yummy merge conflicts for something that is conceptually a no-op.

This is the essence of working with Git. Most of the time it works sort of ok, but every now and then it will, without any warning or reason, completely screw you over, destroy your data and leave you stranded, forced to debug your way out of the resulting mess without any help.

"Of course it breaks, you should have used --do-not-do-the-idiotic-wrong-thing-which-for-some-reason-is-the-default command line option, everyone knows that, duh!"

A common kneejerk response to these kinds of problems is that it is somehow the user's own fault and that they should have memorized every quirk in the software in order to use it correctly (or at all). I'm certain some of you out there on the Internet had already started writing a strongly worded message to let me know that. Don't bother.

Whenever you have a piece of software that silently destroys user data, the fault always, always, ALWAYS lies with the program. Even if "it only happens rarely". Even if you think "it's the user's fault". Even if you personally know how the problem could have been avoided. The flaw is ABSOLUTELY ALWAYS in the software. Never in users. Ever.

Any attempt at shifting the cause to the user, for whatever reason, is victim blaming. Don't do it.

Tuesday, July 17, 2018

How expensive is globbing for sources in large projects

A common holy war in build systems is whether you should explicitly list all sources that make up a target or use a globbing pattern. There are both technical and non-technical arguments on both sides. The latter mostly deal with reliability and flexibility vs convenience. In this post we are going to ignore them completely and instead focus on the technical parts, specifically the overhead of globbing. The measurement script used can be downloaded from this repo.

In this test we used the full checkout of Chromium source code. The tests were run under Windows, since it is noticeably slower than Linux on both file operations and process invocations. The task simulation consists of roughly three parts:

Scan the source tree for all directories that contain sources
Generate glob patterns for detected directories (corresponding roughly to "one target for all sources in one directory")
Run the globs

This ignores a bunch of steps, such as serialising the glob results to files and calculating the delta between two glob sets. These are probably fairly fast compared to file access operations, though.
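The three steps above can be sketched roughly as follows. This is not the measurement script from the repo, just a minimal illustration of the same idea. The extension list and the function names are made up for this example.

```python
import glob
import os

# Hypothetical set of extensions treated as "sources" (an assumption,
# not taken from the original measurement script).
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".h"}


def find_source_dirs(root):
    """Step 1: find every directory under root that holds a source file."""
    source_dirs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(os.path.splitext(name)[1] in SOURCE_EXTENSIONS for name in filenames):
            source_dirs.append(dirpath)
    return sorted(source_dirs)


def make_glob_patterns(source_dirs):
    """Step 2: generate one pattern per directory and extension."""
    return [
        os.path.join(glob.escape(directory), "*" + ext)
        for directory in source_dirs
        for ext in sorted(SOURCE_EXTENSIONS)
    ]


def run_globs(patterns):
    """Step 3: evaluate every pattern and collect the matching files."""
    matches = []
    for pattern in patterns:
        matches.extend(glob.glob(pattern))
    return sorted(matches)
```

Calling `run_globs(make_glob_patterns(find_source_dirs(root)))` then gives the file list that a globbing build system would have to recompute on every invocation.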

Scanning the source tree and generating the globs

There is no direct correlation between this step and a regular build system. It is mostly interesting as a comparison of file operations with a hot and a cold cache. Running the scan on a cold cache takes 2 minutes, whereas with a hot cache it takes about 6 seconds.

Since this step is always run first, the following tests are all operating with a hot cache.

The actual globbing

Running all globs on the Chromium source tree takes between 2 and 6 seconds. This is the absolute lowest time that can be obtained for a no-op build without daemons because all globs must be re-evaluated every time.

The rule of thumb for UI design is that everything under one second is perceived as instantaneous. This means that for these sizes globbing causes a noticeable delay. Whether this is seen as insignificant or aggravating depends on each user.

Extra bonus: C++ modules

Since we have the measurement script, let's use it for something more interesting. Modules are an upcoming C++ feature to reduce build times and provide a ton of other coolness, depending on who you ask. The current specification works by having a kind of "module export declaration" at the beginning of source files. The idea is that you first compile those to generate a sort of a module declaration file and then you can start the actual compilation that uses said files.

If you thought "waitaminute, that sounds exactly like how FORTRAN is compiled", you are correct. Because of this it has the same problem that you can't compile source files in an arbitrary order, but instead you must first somehow scan them to find out the interdependencies between source (not header) files. In practice what this means is that instead of single-phase compilation all files must be processed twice. All scan operations must be done before any compilation jobs can start because otherwise you might start to compile a file before its dependencies are fully processed.
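To make the ordering constraint concrete, here is a small sketch of what a build system has to do once the scan results are in. The file names and the dependency map are invented for illustration. A real scanner would produce the map by reading the module declarations.

```python
from graphlib import TopologicalSorter

# Hypothetical scan result: each source file mapped to the source files
# whose module declarations it needs before it can be compiled.
scanned_deps = {
    "app.cpp": {"net.cpp", "util.cpp"},
    "net.cpp": {"util.cpp"},
    "util.cpp": set(),
}


def compile_order(deps):
    """Return an order in which every file comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = compile_order(scanned_deps)
```

Note that the order can only be computed after every file has been scanned, which is exactly why the scan phase must finish before the first compilation job starts.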

The scanning can be done in one of two ways. Either the build system scans the sources, meaning it needs to understand the syntax of source files, or the compiler can be invoked in a special preprocessing mode. Note that build systems such as Ninja do not do any such operations by themselves but instead always invoke external processes to do their work.

Testing the performance impact of these two is straightforward. The first one can be done by reading the first ten lines of each source file and then throwing them away. Measuring this time gives a fairly good estimate of the file processing overhead. The second way can be measured by doing the exact same thing but also invoking the compiler with no-op command line arguments to get the process invocation overhead.
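Both measurements can be approximated with something like the sketch below. It is an illustration of the approach rather than the actual script. The function names are made up, and the no-op command is a stand-in: it starts the Python interpreter instead of a real compiler, so the absolute numbers will differ.

```python
import itertools
import subprocess
import sys
import time


def read_head(path, line_count=10):
    """Read the first few lines of a file and return them."""
    with open(path, "rb") as f:
        return list(itertools.islice(f, line_count))


def scan_files(paths, spawn_command=None):
    """Return the wall-clock seconds needed to "scan" every path.

    If spawn_command is given, it is also run once per file to include
    the process invocation overhead.
    """
    start = time.perf_counter()
    for path in paths:
        read_head(path)  # the result is deliberately thrown away
        if spawn_command is not None:
            subprocess.run(
                spawn_command,
                check=True,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
    return time.perf_counter() - start


# Assumption: a stand-in for "the compiler with no-op arguments".
NOOP_COMMAND = [sys.executable, "-c", "pass"]
```

`scan_files(paths)` estimates the cost of the build system scanning files itself, while `scan_files(paths, NOOP_COMMAND)` adds one process spawn per file.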

Scanning the files directly takes roughly 120 seconds. For an 8-core machine this means a 15-second delay (at minimum) before any compilation tasks can begin. This is not great but for a full build it should be tolerable.

When spawning a compiler process the same operation takes 69 minutes. This is intolerably slow and would require an order of magnitude speedup in compilation times to be worthwhile. Unlike regular compilations, dependency scanning cannot be sped up with unity builds because the specification requires that the module declaration must be at the very beginning of source files (and presumably there cannot be more than one in a single TU).
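The 15-second figure for direct scanning is just the total scan time divided evenly over the cores. The same arithmetic can be applied to the process-spawning case, assuming (and this is an assumption, not a measurement) that the 69 minutes is also a serial total that parallelises perfectly.

```python
def min_scan_delay(total_scan_seconds, cores):
    """Best-case delay before compilation can start, assuming perfect scaling."""
    return total_scan_seconds / cores


direct_delay = min_scan_delay(120, 8)        # scanning files directly
spawned_delay = min_scan_delay(69 * 60, 8)   # spawning a compiler per file
spawned_delay_minutes = spawned_delay / 60
```

Under that assumption the spawning approach still leaves more than eight and a half minutes of pure scanning before the first real compilation job on an 8-core machine.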