Tuesday, October 20, 2020

Cargo-style dependency management for C, C++ and other languages with Meson

My previous blog post about modern C++ got a surprising amount of feedback. Some people even reimplemented the program in other languages, including one in Go, two different ones in Rust and even this slightly brain-bending C++ reimplementation as a declarative style pipeline. It also got talked about on Reddit and Hacker News. Two major comments that kept popping up were the following.

  1. There are potential bugs if the program is ever extended to support non-ASCII text.
  2. It is hard to use external dependencies in C++ (and by extension C).

Let's solve both of these problems at the same time. There are many "lite" solutions for doing Unicode aware text processing, but we're going to use the 10 000 kilogram hammer: International Components for Unicode. The actual code changes are not that big; interested parties can look up the details in this GitHub repo. The thing that is relevant for this blog post is the build and dependency management. The Meson build definition is quite short:

project('cppunicode', 'cpp',
        default_options: ['cpp_std=c++17'])
icu_dep = dependency('icu-i18n')
thread_dep = dependency('threads')
executable('cppunicode', 'cppunicode.cpp',
           dependencies: [icu_dep, thread_dep])

The threads dependency is for the multithreaded parts (see end of this post). I developed this on Linux and used the convenient system provided ICU. Windows and Mac do not provide system libs so we need to build ICU from scratch on those platforms. This is achieved by running the following command in your project's source root:

$ meson wrap install icu
Installed icu branch 67.1 revision 1

This contacts Meson's WrapDB server and downloads build definition files for ICU. That is all you need to do. The build files do not need any changes. When you start building the project, Meson will automatically download and build the dependency. Here is a screenshot of the download step:

Here it is building in Visual Studio:

And here is the final built result running on macOS:

One notable downside of this approach is that WrapDB does not have all that many packages yet. However, I have been told that with the next Meson release (in a few weeks) and some upstream patches, it will be possible to build the entire GTK widget toolkit as a subproject, even on Windows.

If anyone wants to contribute to the project, contributions are most welcome. You can, for example, convert existing projects and submit them to WrapDB or become a reviewer. The Meson web site has the relevant documentation.

Appendix: making it parallel

Several people pointed out that while the original program worked fine, it only uses one thread. This may be a bottleneck, and the claim was that "in C++ it is hard to execute work in parallel". This is again one of those things that has gotten a lot better in the last few years. The "correct" solution would be to use the parallel version of transform_reduce. Unfortunately most parallel STL implementations are still in the process of being implemented so we can't use those in multiplatform code. We can, however, roll our own fairly easily, without needing to create or lock a single mutex by hand. The code has the actual details, but the (slightly edited) gist of it is this:

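for(const auto &e:
    std::filesystem::recursive_directory_iterator(".")) {
    // Cap the number of in-flight tasks.
    if(futures.size() > num_threads) {
        pop_future(word_counts, futures);
    }
    futures.emplace_back(std::async(std::launch::async, count_word_files, e));
}
// Collect the results that are still pending.
while(!futures.empty()) {
    pop_future(word_counts, futures);
}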

Here the count_word_files function calculates the number of words in a single file and the pop_future function joins individual results to the final result. By using a share-nothing architecture, pure functions and value types, all business logic code can be written as if it were single threaded, and the details of thread and mutex management can be left to library code. Haskell fans would be proud (or possibly horrified, not really sure).

Thursday, October 15, 2020

Does C++ still deserve the bad rap it has had for so long?

Traditionally C++ has been seen by many (and you know who you are) as just plain bad: the code is unreadably verbose, error messages are undecipherable, it's unsafe, compilation takes forever and so on. In fact making fun of C++ is even a fun pastime for some. All of this was certainly true in the 90s and even as recently as 10 years ago. But is it still the case? What would be a good way to determine this?

A practical experiment

Let's write a fairly simple program that solves a sort-of-real-worldish problem and see what comes out of it. The problem we want to solve is the following:
Find all files with the extension .txt recursively in the subdirectories of the current directory, count the number of times each word appears in them and print the ten most common words and the number of times they are used.

We're not going to go for extreme performance or handling all possible corner cases, instead going for a reasonable implementation. The full code can be found by following this link.

The first thing we need is a way to detect files with a given extension and words within text. This calls for regular expressions:

const std::regex fname_regex("\\.txt$", std::regex_constants::icase);

const std::regex word_regex("([a-z]{2,})", std::regex_constants::icase);

This might be a bit verbose, but it is quite readable. This only works for ASCII text, but for the purposes of this experiment it is sufficient. To calculate the number of times each word is seen, we need a hash table:

std::unordered_map<std::string, int> word_counts;

Now we need to iterate over all files in the current directory subtree.

for(const auto &e: std::filesystem::recursive_directory_iterator("."))

Skip everything that is not a .txt file.

if(!e.is_regular_file()) { continue; }
if(!std::regex_search(e.path().c_str(), fname_regex)) { continue; }

Process the file line by line:

std::ifstream current_file(e.path());
for (std::string line; std::getline(current_file, line); )

For each line we need to find all words that match the regular expression.

std::sregex_iterator word_end;
for(auto it = std::sregex_iterator(line.begin(), line.end(), word_regex); it != word_end; ++it)

This is a bit verbose, but it takes care of all the state tracking needed to run the regex multiple times on the same input string. Doing the same by hand is fairly tedious and error prone. Now that we have the matches, we need to convert them to standalone lowercase words.

std::string word{it->str()};
for(auto &c : word) {
    c = std::tolower(c);
}

Lowercasing strings is a bit cumbersome, granted, but this is at least fairly readable. Then on to incrementing the word count.

++word_counts[word];

That's all that is needed to count the words. Next we need to find the most common ones. This can't be done directly in the hash map, so we need to convert the data to an array. It needs some setup code.

struct WordCount {
    std::string word;
    int count;
};

std::vector<WordCount> word_array;
word_array.reserve(word_counts.size());

Since we know how many entries will be in the array, we can reserve enough space for it in advance. Then we need to iterate over all entries and add them to the array.

for(const auto &[word, count]: word_counts) {
    word_array.emplace_back(WordCount{word, count});
}

This uses the new structured bindings feature, which is a lot more readable than the old way of dealing with iterator objects or pairs with their firsts and their seconds and all that horror. Fortunately no more.

The simple way of getting the 10 most used entries is to sort the array and then grab the 10 first elements. Let's do something slightly swaggier instead [0]. We'll do a partial sort and discard all entries after 10. For that we need a descending sorting criterion as a lambda.

auto count_order_desc = [](const WordCount &w1, const WordCount &w2) { return w2.count < w1.count; };

Using it is straightforward.

const auto final_size = std::min(10, (int)word_array.size());
std::partial_sort(word_array.begin(), word_array.begin() + final_size, word_array.end(), count_order_desc);
word_array.erase(word_array.begin() + final_size, word_array.end());

All that remains is to print the final result.

for(const auto& word_info: word_array) {
    std::cout << word_info.count << " " << word_info.word << "\n";
}


As safety and security are important features in current software development, let's examine how safe this program is. There are two main safety points: thread safety and memory safety. As this program has only one thread, it can be formally proven to be thread safe. Memory safety also divides into two main things to note: use-after-frees (or dangling references) and out-of-bounds array accesses.

In this particular program there are no dangling references. All data is in value types and the only references are iterators. They are all scoped so that they never live past their usages. Most are not even stored into named variables but instead passed directly as function arguments (see especially the call to partial_sort). Thus they can not be accessed after they have expired. There is no manual resource management of any kind. All resources are freed automatically. Running the code under Valgrind yields zero errors and zero memory leaks.

There is one place in the code where an out-of-bounds access is possible: when creating the iterator pointing to 10 elements past the beginning of the array [2]. If you were to forget to check that the array has at least 10 elements, that would be an immediate error. Fortunately most standard libraries have a way to enable checks for this. Here's what GCC reports at runtime if you get this wrong:

Error: attempt to advance a dereferenceable (start-of-sequence) iterator 100000 steps, which falls outside its valid range.

Not really a perfect safety record, but overall it's fairly good and certainly a lot better than implementing the same by manually indexing arrays.


Compiling the program takes several seconds, which is not great. This is mostly due to the regex header, which is known to be heavyweight. Some of the compiler messages encountered during development were needlessly verbose. The actual errors were quite silly, though, such as passing arguments to functions in the wrong order. This could definitely be better. It is reported that the use of concepts in C++20 will fix most of this issue, but there are no practical real-world usage reports yet.

Unique features

This code compiles, runs and works [1] on all major platforms using only the default toolchain provided by the OS vendor without any external dependencies or downloads. This is something that no other programming language can provide to date. The closest you can get is plain C, but its standard library does not have the necessary functionality. Compiling the code is also simple and can be done with a single command:

g++ -o cppexample -O2 -g -Wall -std=c++17 cppexample.cpp


If we try to look at the end result with objective eyes it would seem to be fairly nice. The entire implementation takes fewer than 60 lines of code. There's nothing immediately terrible that pops up. What is happening at each step is fairly understandable and readable. There's nothing that one could immediately hate and despise for being "awful", as is tradition.

This is not to say that you can't still hate C++. If you want to, go right ahead. There are certainly reasons for it. But if you choose to do that, make sure you hate it for real actual reasons of today, rather than things that were broken decades ago but have been fixed since.

[0] There is an even more efficient solution using a priority queue that does not need the intermediate array. Implementing it is left as an exercise to the reader.
[1] Reader beware, I have only proven this code to be portable, not tried it.
[2] The implementation in [0] does not have this problem as no iterators are ever modified by hand.

Tuesday, October 6, 2020

Is your project a unique snowflake or do you just think that it is?

If there is a national sport for programmers, it is reinventing existing things from scratch. This seems to be especially common inside corporations that tend to roll their own rather than, for example, using an existing open source library. If you try to ask around why this has been done, you typically get some variation of this:
We can't use off-the-shelf solutions because we have unique needs and requirements that no-one else has.

Now, there genuinely are cases where this is true, but it mostly happens only in very specialised cases, such as when you are Google-scale, do space engineering or something similar. There might also be legal or regulatory reasons that you must own all code in a product. Most projects are not like this. In fact almost always someone (typically hundreds of someones) has had the exact same problem and solved it. Yet people seem to keep reinventing the wheel for no real purpose. If you ever find yourself in a debate on why people should use existing solutions rather than roll their own, here are some talking points to consider.

How have you established that your needs are actually unique?

Very often when someone says "the needs for <project X> are unique" what they are actually saying is "this needs a feature I have never personally encountered before". This is not due to malice, it is just how human beings seem to behave. If one has a rough idea of how to solve a problem by coding it from scratch but no knowledge of an existing solution, people tend to go with the former rather than spend time on researching the problem. I don't know why. This is especially true if there is external pressure on "getting something going fast".

I know this, because I have actually done it myself. Ages ago I was working on some Java code and needed to communicate with a different process. I was starting to write a custom socket protocol when my coworker asked why I was not simply using regular HTTP with Jetty. The answer, to the surprise of absolutely no-one, is that I had never heard of it. That single question saved days of implementation work and probably the same amount in ongoing maintenance and debugging work.

We need more performance than existing solutions can provide

This splits nicely into two parts. In the first one you actually need the performance. This is rare, but it happens. Even so, you might consider optimizing the existing solution first. If you are successful, submit the changes upstream. This way they benefit everyone and reduce the ongoing maintenance burden to roughly zero.

The other case is that all the performance you think you need, you don't actually need. This is surprisingly common. It turns out that modern computers and libraries are so fast that even "reasonably ok" is performant enough most of the time for most problems [0]. As an example I was working on a product that chose to roll its own functionality X rather than using the standard solution in Linux because it needed to do "thousands of operations per second". These operations were only done in response to human interaction, so there was no way it could generate more than 10 or so events per second. But still. Must have performance, so multiple months of hand-rolling it was.

To add insult to injury, I did some measurements and it turned out that the existing solution could do one thousand operations per second without even needing to go to C. The Python bindings were fast enough.

We need the ability to change the implementation at will

One major downside of using an existing open source solution is that you have to give up some control. Changes must be submitted to people outside your immediate control. They might say no and you can't override them with office politics. In-house code does not have this problem.

This control comes with a significant downside. What fairly often happens is that some developers write and maintain the new functionality and everything is rosy. Then time happens and the original coders switch teams and jobs, become managers and so on. They take the knowledge required to change the system with them. Unless your company has a very strong culture of transferring this knowledge, and most companies most definitely don't, the code will bitrot, fossilize and become a massive clump of legacy that must not be touched because there is no documentation, no tests and nobody really understands it any more.

So even if the corporation has the possibility to change the code at will they lack the ability to do so.

Sometimes people just plain lie

There are two main reasons that explain pretty much everything that people do, say and recommend:
  1. The reason they give to others
  2. The real reason

Even for "purely technical" decisions like these, the two can be drastically different. Suppose some developer has done several similar "boring" projects. Then the time comes to choose the approach for the new project. If they choose an existing system, they know that the next project will be just like the preceding five. If, on the other hand, some new tech is chosen, then the entire task becomes more meaningful and exciting. In this case the stated reason is technical fitness, but the real reason is frustration. Another way of stating this is that when asked "should we use library X or write our own from scratch" the answer the coder would want to give is "I don't want to work on that project".

This is something that people won't say out loud because it easily gets you labeled as a "prima donna", "work avoider" or "elitist". Many people will also have trouble saying it because it comes from a "bad" emotional response. All of this leads to shame and people are willing to go to great lengths to avoid feeling ashamed.

The same reasoning applies to other "impure" emotions such as infatuation ("OMG this new tech is so cool"), disgust ("using tech X means I have to work with Bob from department X"), infidelity ("either you give me this or I'm accepting the job offer from a competitor"), megalomania ("creating this new framework will make me a superstar celebrity coder") and the ever-popular self-disappointment ("I got into programming because I wanted to create new things like an architect, but ended up as a code janitor").

Tuesday, September 15, 2020

Want GCC's cleanup attribute in Visual Studio? Here's what to do.

A common pain point for people writing cross platform C code is that they can't use GCC's cleanup attribute and by extension GLib's g_auto family of macros. I had a chat with some Microsoft people and it turns out it may be possible to get this functionality added to Visual Studio. They add features based on user feedback. Therefore, if you want to see this functionality added to VS, here's what you should do:

  1. Create a Microsoft account if you don't already have one.
  2. Upvote this issue.
  3. Spread the word to other interested people.

Sunday, September 13, 2020

Proposal for a computer game research topic: the walk/game ratio

I used to play a fair bit of computer games but in recent years the amount of time spent on games has decreased. Then the lockdown happened and I bought a PS4 and got back into gaming, which was fun. As is often the case, once you get back into something after a break you find yourself paying attention to things that you never noticed before.

In this particular case it was about those parts of games where you are not actually playing the game. Instead you are walking/driving/riding from one place to another because the actual thing you want to do is somewhere else. A typical example of this is Red Dead Redemption II (and by extension all GTAs). At first wandering the countryside is fun and immersive but at some point it becomes tedious and you just wish to be at your destination (fast travel helps, but not enough). Note that this does not apply to extra content. Having a lush open sandbox world that people can explore at their leisure is totally fine. This is about "grinding by walking" that you have to do in order to complete the game.

This brings up several interesting questions. How much time, on average, do computer games require players to spend travelling from one place to another as opposed to doing the thing the game is actually about (e.g. shooting nazis, shooting zombie nazis, hunting for secret nazi treasure and fighting underwater nazi zombies)? Does this ratio vary over time? Are there differences between genres, design studios and publishers? It turns out that determining these numbers is fairly simple but laborious. I have too many ongoing projects to do this myself, so here's a research outline for anyone to put in their research grant application:

  1. Select a representative set of games.
  2. Go to speedrun.com and download the fastest any-% glitchless run available.
  3. Split the video into separate segments such as "actual gameplay", "watching unskippable cutscenes", "walkgrinding" and "waiting for game to finish loading".
  4. Tabulate times, create graphs.

Hypothetical results

As this research has not been done (that I know of and am able to google up) we don't know what the results would be. That does not stop us from speculating endlessly, so here are some estimates that this research might uncover:
  • Games with a lot of walkgrinding: RDR II, Assassin's Creed series, Metroid Prime.
  • Games with a medium amount of walkgrinding: Control, Uncharted.
  • Games with almost no walkgrinding: Trackmania, Super Meat Boy.
  • Games that require the player to watch the same unskippable cutscenes over and over: Super Mario Sunshine
  • Newer games require more walkgrinding simply because game worlds have gotten bigger

Sunday, August 30, 2020

Resist the urge to argue about app store security

Recently Miguel de Icaza wrote a blog post arguing that closed computing platforms where a major US corporation decides what software users are allowed to install are a good thing. This has, naturally, caused people to become either confused, disappointed or angry. Presumably many people are writing responses and angry comments. I almost started writing one pointing out all the issues I found in the post.

Doing that would probably create a fairly popular blog post with followups. It might even get to the reddits and hackernewses and generate tons of comments where people would duke it out on issues on user choice vs the safety provided by a curated walled garden. There would be hundreds, if not thousands, of snarky tweets that make their poster feel superior for a while but are ultimately not productive. To quote Star Trek Deep Space Nine:
Spare me the please-think-of-the-children speech and I'll spare you the users-must-have-control-over-their-own-devices speech [1].

Having us, the user and developer community, argue about this issue is pointless, unproductive and actively harmful. This particular phenomenon is not new; it even has a name. In fact this approach is so old that the name is in Latin: Divide et impera. Divide and conquer. All the time and energy that we spend on arguing this issue among ourselves is time not spent on working towards a solution.

The actual solution to this issue is conceptually so simple it could be called trivial. The entire problem at hand is one that has been created by Apple. They are also the ones that can solve it. All they have to do is to add one new piece of functionality to iOS devices. Specifically that users who so choose, can change an option in the device they own allowing them to download, install and use any application binaries freely from the Internet. Enabling this functionality could be done, for example, in a similar way to how Android phones enable developer mode. Once implemented Apple would then make a public statement saying that this workflow is fully supported and that applications obtained in this way will, now and forevermore, have access to all the same APIs as official store apps do.

This is all it takes! Further, they could make it so that IT departments and concerned parents could disable this functionality on their employees' and children's devices so that they can only obtain apps via the app store. This gives both sets of users exactly what they want. Those who prefer living in a walled curated garden can do so. Those with the desire and willingness to venture outside the walls and take responsibility for their own actions can do so too and still get all the benefits of sandboxing and base platform security.

Apple could do this. Apple could have done this at launch. Apple could have done this at any time since. Apple has actively chosen not to do this. Keep this in mind if you ever end up arguing about this issue on the Internet. People who have different priorities and preferences are not "the enemy". If you get into the flaming and the shouting you have been divided. And you have been conquered.

[1] Might not be a word-for-word accurate transcription.

Friday, August 28, 2020

It ain't easy being a unique snowflake, the laptop purchasing edition

As the lockdown drags on I have felt the need to buy a new laptop as my old one is starting to age. A new category in the laptop market is 2-in-1 laptops with full drawing pen support. It has been a long time since I have done any real drawing or painting and this seemed like a nice way to get back into that. On the other hand the computer also needs to be performant for heavy duty software development such as running the Meson test suite (which is surprisingly heavy) and things like compiling LLVM. This yields the following requirements:

  • A good keyboard, on the level of Thinkpads or 2015-ish Macbook Pros
  • 13 or 14 inch screen, at least Retina level resolution, HDR would be nice, as matte as possible
  • Pressure and tilt support for the pen
  • At least 16GB of RAM
  • USB A, C and HDMI connectors would be nice
  • At least Iris level graphics with free drivers (so no NVidia)
  • A replaceable battery would be nice
  • Working dual boot (I need Win10 for various things), full Linux support (including wifi et al)

After a bunch of research it turns out that this set of requirements might be impossible to fulfill. Here are the three best examples that I found.

HP Elite Dragonfly

The documentation is unclear on whether the pen supports tilting. More importantly this model is only available with Intel 8th gen processors and online reviews say the screen is super glossy.

Lenovo Yoga X1 Gen5

Pen does not support tilting. The graphics card is a standard Intel HD, not Iris.

Dell XPS 13 2-in-1 7390

This is the one that gets the closest. There are no USB A or HDMI connectors, and the pen requires an AAAA battery. This is annoying but tolerable. Unfortunately the keyboard is of the super thin type, which is just not usable.

So now what?

I am probably going to stick with the current one for now. New models come out at a fairly steady pace, so maybe this mythical white whale will become available for purchase at some point. Alternatively I might eventually just fold, give up on some requirements and buy a compromise machine. Typically this causes the dream machine to become available for purchase immediately afterwards, when it is too late.

Monday, August 17, 2020

Most "mandatory requirements" in corporations are imaginary

In my day job I work as a consultant. Roughly six months ago my current client had a non-negotiable requirement that consultants are not allowed to work remotely. Employees did have this privilege. This is, clearly, a stupid policy and there have been many attempts across the years to get it changed. Every time someone in the management chain has axed the proposal with some variation of "this policy can not be changed because this is our policy and thus can not be changed", possibly with a "due to security reasons" thrown in there somewhere.

Then COVID-19 happened and the decision came to lock down the head office. Less than a day after this everyone (including consultants) got remote working rights, VPN tokens and all other bits and bobs. The old immutable, mandatory, unalterable and just-plain-how-we-do-things rules and regulations seemed to vanish into thin air in the blink of an eye. The question is why did this happen?

The answer is simple: because it became mandatory due to external pressure. A more interesting question would be: if it really was that simple, how come this had not happened before? Further, are the same reasons that blocked this obvious improvement for so long also holding back other policy braindeadisms that reduce employee productivity? Unfortunately the answers here are not as clear-cut and different organizations may have different underlying causes for the same problem.

That being said, let's look at one potential cause: the gain/risk imbalance. Typically many policy and tech problems occur at a fairly low level. Changes benefit mostly the people who, as one might say, are doing the actual work. In graph form it might look like this.

This is not particularly surprising. People higher up the management chain have a lot on their plate. They probably could not spend the time to learn about benefits from low level work flow changes even if they wanted to and the actual change will be invisible to them. On the other hand managers are fairly well trained to detect and minimize risk. This is where things fall down because the risk profile for this policy change is the exact opposite.

The big question on managers' minds (either consciously or unconsciously) when approving a policy change is "if I do this and anything goes wrong, will I get blamed?". This should not be the basis of choice but in practice it sadly is. This is where things go wrong. The people who would most benefit from the change (and thus have the biggest incentive to get it fixed) do not get to make the call. Instead it goes to people who will see no personal benefit, only risk. After all, the current policy has worked well thus far so don't fix it if it is not broken. Just like no one ever got fired for buying IBM, no manager has ever gotten reprimanded for choosing to uphold existing policy.

This is an unfortunate recurring organizational problem. Many policies are chosen without full knowledge of the problem. Sometimes this information is even impossible to get if the issue is new and no best practices have yet been discovered. Then the policy becomes standard practice and then a mandatory requirement that can't be changed, even though it is not really based on anything, provides no benefit and just causes harm. Such are the downsides of hierarchical organization charts.

Saturday, August 8, 2020

The second edition of the Meson manual is out

I have just uploaded the second edition of the Meson manual to the web store for your purchasing pleasure. New content includes things like:

  • Reference documentation updated to match 0.55
  • A new chapter describing "random tidbits" for practical usage
  • Cross compilation chapter update with new content
  • Bug fixes and minor updates
Everyone who has bought the book can download this version (and in fact all future updates) for free.

Sales stats up to this point

The total number of sales is roughly 130 copies corresponding to 3500€ in total sales. Expenses thus far are a fair bit more than that.

Wednesday, July 29, 2020

About that "Google always builds everything from source every time" thing

One of the big battles of dependencies is whether you should use somehow prebuilt libraries (e.g. the Debian-style system packaging) or vendor all the sources of your deps and build everything yourself. Whenever this debate gets going, someone is going to do a "well, actually" and say some kind of variation of this:
Google vendors all dependencies and builds everything from source! Therefore that is clearly the right thing to do and we should do that also.
The obvious counterargument to this is the tried-and-true if all your friends jumped off a bridge would you do it too response known by every parent in the world. The second, much lesser known counterargument is that this statement is not actually true.

Google does not actually rebuild all code in their projects from source. Don't believe me? Here's exhibit A:

The original presentation video can be found here. Note that the slide and the video speak of multiple prebuilt dependencies, not just one [0]. Thus we find that even Google, with all of their power, money, superior engineering talent and insistence on rebuilding everything from source, do not rebuild everything from source. Instead they have to occasionally use prebuilt third party libraries just like the rest of the world. Thus a more accurate form of the above statement would be this:
Google vendors most dependencies and builds everything from source when possible and when that is not the case they use prebuilt libraries but try to keep quiet about it in public because it would undermine their publicly made assertion that everyone should always rebuild everything from source.
The solution to this is obvious: you just rebuild all things that you can from sources and get the rest as prebuilt libraries. What's the big deal here? By itself there would not be, but this ideology has consequences. There are many tools and even programming languages designed nowadays that only handle the build-from-source case because obviously everyone has the source code for all their deps. Unfortunately this is just not true. No matter how bad prebuilt no-access-to-source libraries are [1], they are also a fact of life and must be natively supported. Not doing that is a major adoption blocker. This is one of the unfortunate side effects of dealing with all the nastiness of the real world instead of a neat idealized sandbox.

[0] This presentation is a few years old. It is unknown whether there are still prebuilt third party libraries in use.

[1] Usually they are pretty bad.

Sunday, July 26, 2020

Pinebook Pro longer term usage report

I bought a Pinebook Pro in the first batch, and have been using it on and off for several months now. Some people I know wanted to know if it is usable as a daily main laptop.

Sadly, it is not. Not for me at least. It is fairly close though.

Random bunch of things that are annoying or broken

I originally wanted to use stock Debian but at some point the Panfrost driver broke and the laptop could not start X. Eventually I gave up and switched to the default Manjaro. Its installer does not support an encrypted root file system. A laptop without an encrypted disk is not really usable as a laptop as you can't take it out of your house.

The biggest gripe is that everything feels sluggish. Alt-tabbing between Firefox and a terminal takes one second, as does switching between Firefox tabs. As an extreme example switching between channels in Slack takes five to ten seconds. It is unbearably slow. The wifi is not very good: it can't connect reliably to an access point in the next room (distance of about 5 meters). The wifi behaviour seems to be distro dependent so maybe there are some knobs to twiddle.

Video playback on browsers is not really nice. Youtube works in the default size, but fullscreen causes a massive frame rate drop. Fullscreen video playback in e.g. VLC is smooth.

Basic shell operations are sluggish too. I have a ZSH prompt that shows the Git status of the current directory. Entering a directory that has a Git repo freezes the terminal for several seconds. Basically every time you need to get something from disk that is not already in cache, there is a noticeable delay.

The screen size and resolution scream for fractional scaling but Manjaro does not seem to provide it. Scale of 1 is a bit too small and 2 is way too big. The screen is matte, which is totally awesome, but unfortunately the colors are a bit muted and for some reason it seems a bit fuzzy. This may be because I have not used a sub-retina level laptop display in years.

The trackpad's motion detector is rubbish at slow speeds. There is a firmware update that makes it better but it's still not great. According to the forums someone has already reverse engineered the trackpad and created an unofficial firmware that is better. I have not tried it. Manjaro does not provide a way to disable tap-to-click (a.k.a. the stupidest UI misfeature ever invented including the emojibar) which is maddening. This is not a hardware issue, though, as e.g. Debian's Gnome does provide this functionality. The keyboard is okayish, but sometimes detects keypresses twice, which is also annoying.

For light development work the setup is almost usable. I wrote a simple 3D model viewer app using Qt Creator and it was surprisingly smooth all round, the 3D drivers worked reliably and so on. Unfortunately invoking the compiler was again sluggish (this was C++, though, so some is expected). Even simple files that compile instantly on x86_64 took seconds to build.

Can the issues be fixed?

It's hard to say. The Panfrost driver is under heavy development, so it will probably keep getting better. That should fix at least the video playback issues. Many of the remaining issues seem to be on the CPU and disk side, though. It is unknown whether there are enough optimization gains to be had to make the experience fully smooth and, more importantly, whether there are people doing that work. It seems feasible that the next generation of hardware will be fast enough for daily usage.

Bonus content: compiling Clang

Just to see what would happen, I tried whether it would be possible to compile Clang from source (it being the heaviest fairly-easy-to-build program that I know of). It turns out that you can, here are the steps for those who want to try it themselves:
  • Checkout Clang sources
  • Create an 8 GB swap file and enable it
  • Configure Clang, add -fuse-ld=gold to linker flags (according to Clang docs there should be a builtin option for this but in reality there isn't) and set max parallel link jobs to 1
  • Start compilation with ninja -j 4 (any more and the compilation jobs cause a swap storm)
  • If one of the linker jobs causes a swap storm, kill the processes and build the problematic library by hand with ninja bin/libheavy.so
  • Start parallel compilation again and if it freezes, repeat as above
After about 7-8 hours you should be done.

Sunday, July 19, 2020

The ABI stability matryoshka

In the C++ on Sea conference last week Herb Sutter had a talk about replacing an established thingy with a new version. Obviously the case of ABI stability came up and he answered with the following (the video is not available so this quote is only approximate, though there is an earlier version of the talk viewable here):
Backwards compatibility is important so that old code can keep working. When upgrading to a new system it would be great if you could voluntarily opt into using the old ABI. So far no-one has managed to do this but if we could crack this particular nut, making major ABI changes would become a lot easier.
Let's try to do exactly that. We'll start with a second (unattributed) quote that often gets thrown around in ABI stability discussions:
Programming language specifications do not even acknowledge the existence of an ABI. It is wholly a distro/tool vendor problem and they should be the ones to solve it.
Going from this we can find out the actual underlying problem, which is running programs of two different ABI versions at the same time on the same OS. The simple solution of rebuilding the world from scratch does not work. It could be done for the base platform but, due to business and other reasons, you can't enforce a rebuild of all user applications (and those users, lest we forget, pay a very hefty amount of money to OS vendors for the platform their apps run on). Mixing new and old ABI apps is fragile and might fail due to the weirdest of reasons no matter how careful you are. The problem is even more difficult in "rolling release" cases where you can't easily rebuild the entire world in one go such as Debian unstable, but we'll ignore that case for now.

It turns out that there already exists a solution for doing exactly this: Flatpak. Its entire reason for existence is to run binaries with different ABI (and even API) on a given Linux platform while making it appear as if it was running on the actual host. There are other ways of achieving the same, such as Docker or systemd-nspawn, but they aim to isolate the two things from each other rather than unifying them. Thus a potential solution to the problem is that whenever an OS breaks ABI compatibility in a major way (which should be rare, like once every few years) it should provide the old ABI version of itself as a Flatpak and run legacy applications that way. In box diagram architecture format it would look like this:

The main downside of this is that the OS vendor's QA department has twice as much work as they need to validate both ABI versions of the product. There is also probably a fair bit of work to make the two versions work together seamlessly, but once you have that you can do all sorts of cool things, such as building the outer version with libstdc++'s debug mode enabled. Normally you can't do that easily as it massively breaks ABI, but now it is easy. You can also build the host with address or memory sanitizer enabled for extra security (or just debugging).

If you add something like btrfs subvolumes and snapshotting, you can do all sorts of cool things. Suppose you have a very simple system with a web server and a custom backend application that you want to upgrade to the new ABI version. It could go something like this:

  1. Create new btrfs subvolume, install new version to that and set up the current install as the inner "Flatpak" host.
  2. Copy all core system settings to the outer install.
  3. Switch the main subvolume to the new install, reboot.
  4. Now the new ABI environment is running and usable but all apps still run inside the old version.
  5. Copy web server configuration to the outer OS and disable the inner one. This is easy because all the system software has the exact same version in both OS installs. Reboot.
  6. Port the business app to run on the new ABI version. Move the stored data and configuration to the outer version. The easiest way to do this is to have all this data on its own btrfs subvolume which is easy to switch over.
  7. Reboot. Done. Now your app has been migrated incrementally to the new ABI without intermediate breakage (modulo bugs).
The best part is that if you won't or can't upgrade your app to the new ABI, you can stop at step #5 and keep running the old ABI code until the whole OS goes out of support. The earlier ABI install will remain as is, can be updated with new RPMs and so on. Crucially this will not block others from switching to the new ABI at their leisure. Which is exactly what everyone wanted to achieve in the first place.

Tuesday, July 7, 2020

What if? Revision control systems did not have merge

A fun design exercise is to take an established system or process and introduce some major change into it, such as adding a completely new constraint. Then take this new state of things, run with it and see what happens. In this case let's see how one might design a revision control system where merging is prohibited. Or, formulated in a slightly different way:
What if merging is to revision control systems as multiple inheritance is to software design?

What is merging used for?

First we need to understand what merging is used for so that we can develop some sort of a system that achieves the same results via some other mechanism. There are many reasons to use merges, but the most popular ones include the following.

An isolated workspace for big changes

Most changes are simple and consist of only one commit. Sometimes, however, it is necessary to make big changes with intermediate steps, such as doing major refactoring operations. These are almost always done in a branch and then brought in to trunk. This is especially convenient if multiple people work on the change.

Trunk commits are always clean

Bringing big changes in via merges means that trunk is always clean and buildable. More importantly bisection works reliably since all commits in trunk are known good. This is typically enforced via a gating CI. This allows big changes to have intermediate steps that are useful but broken in some way so they would not pass CI. This is not common, but happens often enough to be useful.

An alternative to merging is squashing the branch into a single commit. This is suboptimal as it destroys information, breaking for example git blame style functionality, as all changes point to a single commit made by a single person (or possibly a bot).

Fix tracking

There are several systems that do automatic tracking of bug fixes to releases. The way this is done is that a fix is written in its own branch. The bug tracking system can then easily see when the fix gets to the various release branches by seeing when the bugfix branch has been merged to them.

A more linear design

In practice many (possibly even most) projects already behave like this. They keep their histories linear by rebasing, squashing and cherry picking, never merging. This works but has the downsides mentioned above. If one spends some time thinking about this problem the fundamental disconnect becomes fairly clear. A "linear" revision control system has only one type of a change which is the commit whereas "real world" problems have two different types: logical changes and individual commits that make up the logical change. This structure is implicit in the graph of merge-based systems, but what if we made it explicit? Thus if we have a commit graph that looks like this:

the linear version could look like this:

The two commits from the right branch have become one logical commit in the flat version. If the revision control system has a native understanding of these kinds of physical and logical commits all the problematic cases listed could be made to work transparently. For example bisection would work by treating all logical commits as only one change. Only after it has proven that the error occurred inside a single logical commit would bisection look inside it.
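To make the two-phase bisection idea a bit more concrete, here is a minimal C++ sketch. Everything in it is hypothetical: the LogicalCommit struct, the IsBad callback and the bisect function are invented for illustration and do not correspond to any existing tool. It assumes the history is non-empty, every logical commit contains at least one physical commit, the final state is known to be bad, and that once the bug appears it stays.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model: a logical commit is just a count of the
// physical commits it contains (always at least one).
struct LogicalCommit {
    std::size_t physical_count;
};

// Callback answering "is the tree broken after physical commit
// `physical` of logical commit `logical`?"
using IsBad = std::function<bool(std::size_t logical, std::size_t physical)>;

// Returns {logical index, physical index} of the first bad commit.
std::pair<std::size_t, std::size_t>
bisect(const std::vector<LogicalCommit>& history, const IsBad& is_bad) {
    // Phase 1: binary search over logical commits only, testing the
    // state after each logical commit has been fully applied.
    std::size_t lo = 0;
    std::size_t hi = history.size() - 1;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (is_bad(mid, history[mid].physical_count - 1)) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    // Phase 2: only now look inside the guilty logical commit.
    std::size_t plo = 0;
    std::size_t phi = history[lo].physical_count - 1;
    while (plo < phi) {
        const std::size_t mid = plo + (phi - plo) / 2;
        if (is_bad(lo, mid)) {
            phi = mid;
        } else {
            plo = mid + 1;
        }
    }
    return {lo, plo};
}
```

The intermediate commits of the other logical commits are never tested, so it does not matter if some of them would not build.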

This, by itself, does not fix bug fix tracking. As there are no merges you can't know which branches have which fixes. This can be solved by giving each change (both physical and logical) a logical ID which remains the same over rebase and edit operations as opposed to the checksum-based commit ID which changes every time the commit is edited. This changes the tracking question from "which release branches have merged this feature fix branch" to "which release branches have a commit with this given logical ID" which is a fairly simple problem to solve.
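As a rough sketch of why that lookup is simple, consider the following C++ fragment. The Commit struct, its field names and the branches_with_fix function are assumptions made up for this example, not the data model of any real revision control system.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical commit record with two identifiers.
struct Commit {
    std::string checksum;    // changes every time the commit is edited or rebased
    std::string logical_id;  // stays the same across rebase and edit operations
};

using Branch = std::vector<Commit>;

// Answers "which release branches have a commit with this logical ID".
// The result is sorted by branch name because std::map iterates in key order.
std::vector<std::string>
branches_with_fix(const std::map<std::string, Branch>& branches,
                  const std::string& logical_id) {
    std::vector<std::string> result;
    for (const auto& entry : branches) {
        const Branch& commits = entry.second;
        const bool found = std::any_of(
            commits.begin(), commits.end(),
            [&](const Commit& c) { return c.logical_id == logical_id; });
        if (found) {
            result.push_back(entry.first);
        }
    }
    return result;
}
```

Note that the checksums never take part in the comparison, so a fix that has been cherry picked and edited on a release branch is still found.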

This approach is not new. LibreOffice has tooling on top of Git that does roughly the same thing as discussed here. It is implemented as freeform text in commit messages with all the advantages and disadvantages that brings.

One obvious question that comes up is could you have logical commits inside logical commits. This seems like an obvious can of worms. On one hand it would be mathematically symmetrical and all that but on the other hand it has the potential to devolve into full Inception, which you usually want to avoid. You'd probably want to start by prohibiting that and potentially permitting it later once you have more usage experience and user feedback.

Could this actually work?

Maybe. But the real question is probably "could a system like this replace Git" because that is what people are using. This is trickier. A key question would be whether you can automatically convert existing Git repos to the new format with no or minimal loss of history. Simple merges could maybe be converted in this way but in practice things are a lot more difficult due to things like octopus merges. If the conversion cannot be done, then the expected market share is roughly 0%.

Wednesday, July 1, 2020

What is best in open source projects?

Open source project maintainers have a reputation of being grumpy and somewhat rude at times. This is not unexpected as managing an open source project can be a tiring experience. This can lead to exhaustion and thus to sometimes being a bit too blunt.

But let's not talk about that now.

Instead, let's talk about the best of times, the positive outcomes, the things that really make you happy to be running an open source project. Patches, both bug fixes and new features, are like this. So is learning about all the places people are using your project. Even better if they are using it in ways you could not even imagine when you started. All of these things are great, but they are not the best.

The greatest thing is when people you have never met or even heard of before come to your project and then on their own initiative take on leadership in some subsection in the project.

The obvious thing to do is writing code, but this also covers things like running web sites, proofreading documentation, wrangling with CI, and even helping other projects to start using your project. At this point I'd like to personally list all the people who have contributed to Meson in this way but it would not be fair as I'd probably miss out some names. More importantly this is not really limited to any single project. Thus, I'd like to send out the following message to everyone who has ever taken ownership of any part of an open source project:

Keep on rocking. You people are awesome!

Tuesday, June 16, 2020

The human brain comprehends atoms but it does not comprehend electrons

The most common thing people know about software projects is that they fail. They might get delayed for months and years. They might be ten times more expensive than expected. They might be completely unusable pieces of garbage. And so on. Over the past 50 or so years many theories have been presented on why that is and how things could be made better. A lot of progress has been made, but no fundamental breakthrough has been made. Software projects still fail fairly often for unexpected reasons.

Typically this implies that there is some fundamental issue (or, more likely, several of them) causing these problems. Since nobody really knows what the real cause is, we are free to wildly speculate and hypothesize that the problem lies somewhere in the human brain. Not any single person's brain, mind you, but fundamental properties of the brain itself.

What the brain does incredibly well

The human brain is an incredible piece of work. Millions of years of evolution have adapted it to be able to do astonishing things in the blink of an eye. The fact that you can walk and talk at the same time without needing to use any conscious effort on it is nothing short of amazing. As a concrete example, look at the following picture for a couple of seconds and then continue reading.

Just a quick glimpse should give you a fairly good idea about the quality of these two very different vehicles. These include things like top speed, acceleration, insurance costs, ride comfort, perceived increase in sex appeal, fuel consumption, maintenance effort, reliability, how long they are expected to keep working and so on. The estimates you get are probably not 100% accurate, but altogether they are probably mostly correct. Given your current life situation you probably also immediately "know" which one of these two would be better for you or that neither is suitable and that you need something completely different, such as a van or a bicycle.

This is your brain doing the thing it does best: examining physical objects in the real world. This works because things that are made of atoms have noticeable and persistent properties. These include things like mass, durability, sharpness and so on. This consistency allows one to fairly accurately estimate their behaviour even in unexpected situations.

And where it stumbles

If you ever find yourself bored at a party, try the following. Find some people, explain the Monty Hall Problem to them and give the correct answer. Then spend the rest of the evening watching how people will argue until they are blue in the face that you are wrong and that the probability is 50/50 and changing the door is not beneficial. This will happen even if (and especially when) participants have higher education degrees in STEM fields.
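For those who would rather settle the argument with code than with shouting, here is a small simulation sketch in C++. The function name monty_hall_win_rate and its parameters are made up for this example. It assumes the standard rules: three doors, one prize, and a host who always opens a losing door that the player did not pick.

```cpp
#include <random>

// Plays `rounds` games and returns the fraction of games won.
// If `switch_door` is true the player always switches after the
// host has opened a door, otherwise the player always stays.
double monty_hall_win_rate(bool switch_door, int rounds, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> door(0, 2);
    int wins = 0;
    for (int i = 0; i < rounds; ++i) {
        const int prize = door(rng);
        const int first_pick = door(rng);

        // The host opens a door that is neither the player's pick nor
        // the prize. Here it is simply the lowest numbered such door.
        int opened = 0;
        while (opened == first_pick || opened == prize) {
            ++opened;
        }

        // Switching means taking the one door that is neither the
        // original pick nor the opened one.
        int final_pick = first_pick;
        if (switch_door) {
            final_pick = 0;
            while (final_pick == first_pick || final_pick == opened) {
                ++final_pick;
            }
        }

        if (final_pick == prize) {
            ++wins;
        }
    }
    return static_cast<double>(wins) / rounds;
}
```

With a large number of rounds the switching strategy should win roughly two games out of three and the staying strategy roughly one out of three, because switching wins exactly when the first pick was wrong.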

The Monty Hall Problem is a great example of the deficiencies of our brain. The actual problem is simple but the solution is so counter-intuitive that many people will refuse to accept it and will instead spend massive amounts of time and energy to disprove the point (in vain) because it just "does not feel right". In mathematics this happens fairly often. The reason this is relevant to the discussion at hand is that, at the end, that is what all software really is: applied mathematics. It does not obey the rules of the physical world and atoms, it only exists in the realm of logic. Computers don't deal with atoms, only electrons[1]. To see how this translates to coding, let's do a thought experiment.

Suppose your job is to paint a wall. After a hard day of painting you have ten meters of freshly painted wall. The next day you get to work and continue painting. At the end of the day you look at your work: exactly ten meters of wall covered in paint. Puzzled you check everything but can't find anything wrong. You should have 20 meters of painted wall but don't. The next day the same thing happens again. This makes your manager unhappy so he appoints three people to help. The next day the four of you start painting, working as hard and as effectively as you possibly can. When the day comes to an end, you have seven meters of painted wall. Somehow three meters of paint you had already applied have disappeared and nobody can explain why or even where the paint went. Determined to make up for the loss everyone works extra hard and at the end of the day, finally, nine meters of the wall is properly covered in paint. Everyone goes home exhausted but happy to have finally made progress. The next morning it is discovered that both the paint and the wall it was applied to have disappeared.

Finally, software

This is the essence of creating software. Since it exists purely in the realm of logic and mathematics, you can't really see it, smell it, touch it or feel it. It has failure modes unfathomable with physical processes and these problems occur fairly regularly. This means that all the builtin functionality of your brain is useless in evaluating the outcome and quality of a software project. Code can only be experienced through thinking, which is slow, difficult and error prone. As opposed to cars, comparing two different code bases for quality and suitability is a big undertaking. But it gets worse.

The output of code, like the web browser you are using to read this blog post, looks and feels a lot like physical objects made of atoms. This means that when people who have no personal experience in programming either buy or manage software projects, they are going to do it as if they were dealing with physical real world objects. That is what they are trained to do and have years of experience in after all. It is also actively harmful, because software is electrons and, as such, beyond the immediate instinctive grasp of the human brain. It does not play by the rules of atoms and trying to make it do so will only lead to failure.

[1] For the physicists among you, yes, technically this should be electric fields, not electrons. This is an artistic decision to make the headline more clickbaity.

Monday, June 8, 2020

Short term usability is not the same as long term usability

When designing almost any piece of code or other functionality, the issue of usability often comes up. Most of the time these discussions go well, but at other times fairly big schisms appear. It may even be that two people advocate for exactly opposite approaches, both claiming that their approach is better because of "usability". This seems like a paradox but upon closer examination we find that it's not for the simple reason that there is no one thing called usability. What is usable or not depends on a ton of different things.

Short term: maximal flexibility and power

As an example let's imagine a software tool that does something. What exactly it does is not important, but let's assume that it needs to be configurable and customizable. A classical approach to this is to create a small core and then add a configuration language for end users. The language may be Turing complete either by design or it might grow into Turing completeness accidentally when people add new functionality as needed. Most commonly this happens via some macro expansion functionality added to "reduce boilerplate".

Perhaps the best known example of this kind of tool is Make. At its core it is only a DAG solver and you can have arbitrary functionality just by writing shell script fragments directly inside your Makefile. The underlying idea is to give developers the tools they need to solve their own special problems. These kinds of accidental programming environments are surprisingly common. I was told that a certain Large Software Company that provides services over the web did a survey on how many self created Turing complete configuration languages they have within their company. I never managed to get an actual number out of them, which should tell you that the real answer is probably "way too many".

Still, Make has stood the test of time as have many other projects using this approach. Like everything in life this architecture also has its downsides. The main one can be split in two parts, the first of which is this:
Even though this approach is easy to get started with and you can make it do anything (that is what Turing completeness does after all), it fairly quickly hits a performance ceiling. Changing the configuration and adding new features becomes ever more tedious as the size of the project grows. This is known as the Turing tarpit, where everything is possible but nothing is easy.

In practice this seems to happen quite often. Some would say every time. To understand why, we need to remember Jussi's law of programmers:
The problem with programmers is that if you give them the chance, they will start programming.
An alternative formulation would be that if you give programmers the possibility to solve their own problems, they are going to solve the hell out of their own problems. These sorts of architectures play the strings of the classical Unix hacker's soul. There are no stupid limitations, instead the programmer is in charge and can fine tune the solution to match the exact specifics and details of their unique snowflake project. The fact that the environment was not originally designed for general programming is not a deterrent, but a badge of merit. Making software do things it was not originally designed to do is what the original hacker ethic was all about, after all. This means that every project ends up solving the same common problems in a different way. This is a violation of the DRY principle over multiple projects. Even though every individual project only solves any problem once, globally they get solved dozens or even hundreds of times. The worst part is that none of these solutions work together. Taking two arbitrary Make projects and trying to build one as a subproject of another will not work and will only lead to tears.

Long term: maximal cooperation

An alternative approach is to create some sort of a configuration interface which is specific to the problem at hand but prevents end users from doing arbitrary operations. It looks like this.
The main thing to note is that at the beginning this approach is objectively worse. If you start using one of these at the beginning it is entirely possible that it can't solve your problem, and you can't make it work on your own, you need to change the system (typically by writing and sending pull requests). This brings enormous pressure to use an existing tweakable system because even though it might be "worse" at least it works.

However once this approach gets out of its Death Valley and has support for the majority of things users need, then network effects kick in and everything changes. It turns out that perceived snowflakeness is very different from actual snowflakeness. Most problems are in fact pretty much the same for most projects and the solutions for those are mostly boring. Once there is a ready made solution, people are quite happy to use that, meaning that any experience you have in one project is immediately transferable to all other projects using the same tool. Additionally as general programming has been prevented, you can be fairly sure [1] that no cowboy coder has gone and reimplemented basic functionality just because they know better or, more probably, have not read the documentation and realized that the functionality they want already exists.

[1] Not 100% sure, of course, because cowboy coders can be surprisingly persistent.

Wednesday, May 13, 2020

The need for some sort of a general trademark/logo license

As usual I am not a lawyer and this is not legal advice.

The problem of software licenses is fairly well understood and there are many established alternatives to choose from based on your needs. This is not the case for licenses governing assets such as images and logos and especially trademarks. Many organisations, such as Gnome and the Linux Foundation have their own trademark policy pages, but they seem to be tailored to those specific organizations. There does not seem to be a kind of a "General project trademark and logo license", for lack of a better term, that people could apply to their projects.

An example case: Meson's logo

The Meson build system's name is a registered trademark. In addition it has a logo which is not. The things we would want to do with it include:
  • Allow people to use the logo when referring to the project in an editorial fashion (this may already be a legal right regardless, in some jurisdictions at least, but IANAL and all that)
  • Allow people to use the logo in other applications that integrate with Meson, in particular IDEs should be able to use it in their GUIs
  • People should be allowed to change the image format to suit their needs, logos are typically provided as SVGs, but for icon use one might want to use PNG instead
  • People should not be allowed to use the logos and trademarks in a way that would imply they are endorsing any particular product or service
  • People should not be allowed to create and sell third party merchandising (shirts etc) using the logo
  • Achieve all this while maintaining compliance with DFSG, various corporate legal requirements, established best practices for trademark protection and all that.
Getting all of these at the same time is really hard. As an example the Creative Commons licenses can be split in two based on whether they permit commercial use. All those that do permit it fail because they (seem to) permit the creation and sales of third party merchandise. Those that prohibit commercial use are problematic because they prevent companies from shipping a commercial IDE product that uses the logo to identify Meson integration (which is something we do want to support, that is what a logo is for after all). This could also be seen as discriminating against certain fields of endeavour, which is contrary to things like the GPL's freedom zero and DFSG guideline #6.

Due to this the current approach we have is that logo usage requires individual permission from me personally. This is an awful solution, but since I am just a random dude on the Internet with a lawyer budget of exactly zero, it's about the only thing I can do. What would be great is if the entities who do have the necessary resources and expertise would create such a license and would then publish it freely so FOSS projects could use it just as easily as picking a license for their code.

Monday, May 11, 2020

Enforcing locking with C++ nonmovable types

Let's say you have a struct with some variable protected by a mutex like this:

struct UnsafeData {
  int x;
  std::mutex m;
};

You should only be able to change x when the mutex is being held. A typical solution is to make x private and then create a method like this:

void UnsafeData::set_x(int newx) {
  // WHOOPS, forgot to lock mutex here.
  x = newx;
}

It is a common mistake that when code is changed, someone, somewhere forgets to add a lock guard. The problem is even bigger if the variable is a full object or a handle that you would like to "pass out" to the caller so they can use it outside the body of the struct. This caller also needs to release the lock when it's done.

This brings up an interesting question: can we implement a scheme which only permits safe accesses to the variables in a way that the users can not circumvent [0] and which has zero performance penalty compared to writing optimal lock/unlock function calls by hand and which uses only standard C++?

Initial approaches

The first idea would be to do something like:

int& get_x(std::lock_guard<std::mutex> &lock);

This does not work because the lifetimes of the lock and the int reference are not enforced to be the same. It is entirely possible to drop the lock but keep the reference and then use x without the lock by accident.
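To make the failure mode concrete, here is a minimal sketch of how the reference can outlive the lock. The names Leaky, escaped_reference and write_without_lock are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical struct following the first approach.
struct Leaky {
    // The parameter only proves that some lock_guard exists at call time.
    int &get_x(std::lock_guard<std::mutex> &) { return x; }

    std::mutex m;

private:
    int x = 0;
};

// A helper that looks harmless: the guard dies on return, the reference survives.
int &escaped_reference(Leaky &d) {
    std::lock_guard<std::mutex> lock(d.m);
    return d.get_x(lock);
}

// Writes to x with no lock held, then reads the value back under a proper lock.
int write_without_lock(Leaky &d, int value) {
    int &x = escaped_reference(d);  // the mutex has already been released here
    x = value;                      // unprotected write, compiles cleanly
    std::lock_guard<std::mutex> lock(d.m);
    return d.get_x(lock);
}
```

Nothing in the type system ties the returned reference to the guard, so the compiler accepts the unprotected write without complaint.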

A second approach would be something like:

struct PointerAndLock {
  int *x;
  std::lock_guard<std::mutex> lock;
};

PointerAndLock get_x();

This is better, but still does not work. Lock objects are special: they can't be copied or moved, so for this to work the lock object must be stored on the heap, meaning a call to new. You could pass that in as an out-param but those are icky. That would also be problematic in that the caller creates the object uninitialised, meaning that x points to garbage values (or nullptr). Murphy's law states that sooner or later one of those gets used incorrectly. We'd want to make these cases impossible by construction.
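The claim that lock objects can be neither copied nor moved can be checked at compile time with the standard type traits. This is only a sketch; the Guard alias is a hypothetical shorthand, and the `_v` trait helpers assume C++17.

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical shorthand for the lock type used in the post.
using Guard = std::lock_guard<std::mutex>;

// std::lock_guard deletes its copy operations and declares no move operations.
static_assert(!std::is_copy_constructible_v<Guard>);
static_assert(!std::is_move_constructible_v<Guard>);
static_assert(!std::is_copy_assignable_v<Guard>);
static_assert(!std::is_move_assignable_v<Guard>);

// Any struct holding a Guard by value inherits the same restrictions.
struct PointerAndLock {
    int *x;
    Guard lock;
};
static_assert(!std::is_copy_constructible_v<PointerAndLock>);
static_assert(!std::is_move_constructible_v<PointerAndLock>);
```

If any of these assumptions were wrong, the file would simply fail to compile.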

The implementation

It turns out that this was not possible until C++17 added guaranteed copy elision. It means that objects can be returned from functions via neither a copy nor a move. It's as if they were automatically created in the scope of the calling function. If you are interested in how that works, googling for "return slot" should get you the information you need. With this the actual implementation is not particularly complex. First we have the data struct:
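As a standalone illustration of guaranteed copy elision, the following sketch returns a type whose copy and move operations are all deleted. The names Pinned, make_pinned and constructed_in_place are hypothetical, and the code assumes a C++17 compiler.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    explicit Pinned(int v) : value(v), self(this) {}

    Pinned(const Pinned &) = delete;
    Pinned(Pinned &&) = delete;
    Pinned &operator=(const Pinned &) = delete;
    Pinned &operator=(Pinned &&) = delete;

    int value;
    const Pinned *self;  // address of the object as seen by its constructor
};

// Legal since C++17: the returned prvalue is constructed directly
// in the storage of the caller's variable.
Pinned make_pinned(int v) {
    return Pinned(v);
}

// Returns true when the constructor ran on the caller's variable itself.
bool constructed_in_place() {
    Pinned p = make_pinned(42);
    return p.self == &p && p.value == 42;
}
```

The address recorded by the constructor matches the address of the caller's variable, which is what "neither a copy nor a move" means in practice.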

struct Proxy;

struct Data {
    friend struct Proxy;
    Proxy get_x();

private:
    int x;
    mutable std::mutex m;
};

This struct only holds the data. It does not manipulate it in any way. Every data member is private, so only the struct itself and its Proxy friend can poke them directly. All accesses go via the Proxy struct, whose implementation is this:

struct Proxy {
    int &x;

    explicit Proxy(Data &input) : x(input.x), l(input.m) {}

    Proxy(const Proxy &) = delete;
    Proxy(Proxy &&) = delete;
    Proxy& operator=(const Proxy&) = delete;
    Proxy& operator=(Proxy &&) = delete;

private:
    std::lock_guard<std::mutex> l;
};

This struct is not copyable or movable. Once created the only things you can do with it are to access x and to destroy the entire object. Thanks to guaranteed copy elision, you can return it from a function, which is exactly what we need.

The creating function is simply:

Proxy Data::get_x() {
    return Proxy(*this);
}

Using the result feels nice and natural:

void set_x(Data &d) {
    // d.x = 3 does not compile
    auto p = d.get_x();
    p.x = 3;
}

This has all the requirements we need. Callers can only access data entities when they are holding the mutex [1]. They do not and indeed cannot release the mutex accidentally because it is marked private. The lifetime of the variable is tied to the lifetime of the lock, they both vanish at the exact same time. It is not possible to create half initialised or stale Proxy objects, they are always valid. Even better, the compiler produces assembly that is identical to the manual version, as can be seen via this handy godbolt link.
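For readers who want to try this out, here is one way the pieces could be assembled into a single translation unit. It is a sketch rather than the exact code behind the godbolt link: the forward declaration, the default value of x and the count_with_threads stress test are assumptions added for illustration, and a C++17 compiler is required.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct Proxy;

struct Data {
    friend struct Proxy;
    Proxy get_x();

private:
    int x = 0;  // assumed initial value, not specified in the post
    mutable std::mutex m;
};

struct Proxy {
    int &x;

    explicit Proxy(Data &input) : x(input.x), l(input.m) {}

    Proxy(const Proxy &) = delete;
    Proxy(Proxy &&) = delete;
    Proxy &operator=(const Proxy &) = delete;
    Proxy &operator=(Proxy &&) = delete;

private:
    std::lock_guard<std::mutex> l;  // held for the Proxy's entire lifetime
};

Proxy Data::get_x() {
    return Proxy(*this);
}

// Hypothetical stress test: several threads increment the shared counter.
int count_with_threads(int num_threads, int increments_per_thread) {
    Data d;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&d, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                auto p = d.get_x();  // mutex locked here
                ++p.x;
            }                        // p destroyed each iteration, mutex released
        });
    }
    for (auto &w : workers) {
        w.join();
    }
    return d.get_x().x;  // the temporary Proxy holds the lock while x is read
}
```

Because every increment happens while a Proxy is alive, no updates are lost: with 4 threads doing 10000 increments each, the function returns 40000.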

[0] Unless they manually reinterpret cast objects to char pointers and poke their internals by hand. There is no way to prevent this.

[1] Unless they manually create a pointer to the underlying int and stash it somewhere. There is no way to prevent this.

Monday, May 4, 2020

Let's talk meta

My previous blog post was about old technologies causing problems in getting new developers on board. In it I had the following statement:
As a first order approximation, nobody under the age of 35 knows how to code in Perl, let alone would be willing to sacrifice their free time doing it.
When I wrote this, I spent a lot of time thinking whether I should add a footnote or extra sentence saying, roughly, that I'm not claiming that there are no people under 35 who know Perl, but that it is a skill that has gotten quite rare compared to ye olden times. The reason for adding extra text is that I feared that someone would inevitably come in and derail the discussion with some variation of "I'm under 35 and I know Perl, so the entire post is wrong".

In the end I chose not to put the clarification in the post. After all it was written slightly tongue-in-cheek, and even specifically says that this is not The Truth (TM), but just an approximation. The post was published. It got linked on a discussion forum. One of the very first comments was this:

This is what makes blogging on the Internet such a frustrating experience. Every single sentence you write has to be scrutinised from all angles and then padded and guarded so its meaning can not be sneakily undermined in this way. This is tiring, as it is difficult to get a good writing flow going. It may also make the text less readable and enjoyable. It makes blogging less fun and thus people less likely to want to do it.

An alternative to this is to not read any comments. This works, but then you are flying blind. You can't tell what writing is good and which is not and you certainly can't improve. The Internet has ruined everything.


Contrary to the claim made above, the Internet has not, in fact, ruined everything. The statement is hyperbole, stemming from the author's feelings of frustration. In reality the Internet has improved the quality of life of most people on the earth by a tremendous amount and should be considered as one of the greatest inventions of mankind.

"Ye olden times" was not written as "├że olden times" because in the thorny battle between orthographic accuracy and readability the latter won.

The phrase "flying blind" refers neither to actual flying nor to actual blindness. It is merely a figure of speech for any behaviour that is done in isolation without external feedback. You should never operate any vehicle under any sort of vision impairment unless you have been specifically trained and authorized to do so by the appropriate authorities.


The notes above were not written because the author thought that readers would take the original statements literally. Instead they are there to illustrate what would happen if the defensive approach to writing, as laid out in the post, were taken to absurd extremes. They exist purely for the purposes of comedy. As does this chapter.

Saturday, May 2, 2020

You have to kill your perlings


This blog post deals only with the social and "human" aspects of various technologies. It is not about the technical merits of any programming language or other such tech. If you intend to write a scathing Reddit comment along the lines of "this guy is an idiot, Perl is a great language for X because you can do Y, Z and W", please don't. That is not the point of this post. Perl was chosen as the canonical example mostly due to its prevalence, the same points apply for things like CORBA, TCL, needing to write XML files by hand, ridiculously long compilation times and so on.

What is the issue at hand?

In the 90s and early 2000s a lot of code was written. As was fashionable at the time, a lot of it was done using Perl. As open source projects are chronically underfunded, a lot of that code is still running. In fact a lot of the core infrastructure of Debian, the BSDs and other such foundational projects is written in Perl. When told about this, many people give the "project manager" reply and say that since the code exists, works and is doing what it should, everything is fine. But it's really not, and to find out why, let's look at the following graph.

Graph of number of people capable and willing to work on Perl. The values peak at 2000 and plummet to zero by 2020.

As we can see the pool of people willing to work on Perl projects is shrinking fast. This is a major problem for open source, since a healthy project requires a steady influx of new contributors, developers and volunteers. As a first order approximation, nobody under the age of 35 knows how to code in Perl, let alone would be willing to sacrifice their free time doing it.

One could go into long debates and discussions about why this is, how millennials are killing open source and how everyone should just "man up" and start writing sigils in their variable names. It would be pointless, though. The reasons people don't want to do Perl are irrelevant, the only thing that matters is that the use of Perl is actively driving away potential project contributors. That is the zeitgeist. The only thing you can do is to adapt to it. That means migrating your project from Perl to something else.

But isn't that risky and a lot of work?

Yes. A lot of Perl code is foundational. In many cases the people who wrote it have left and no-one has touched it in years. Changing it is risky. No matter how careful you are, there will be bugs. Nasty bugs. Hard to trace bugs. Bugs that work together with other bugs to cancel each other out. It will be a lot of hard work, but that is the price you have to pay to keep your project vibrant.

An alternative is to do nothing. If your project never needs to change, then this might be a reasonable course of action. However if something happens and major changes are needed (and one thing we have learned this year is that unexpected things actually do happen) then you might end up as the FOSS equivalent of the New Jersey governor trying to find people to code COBOL for free.

Sunday, April 19, 2020

Do humans or compilers produce faster code?

Modern optimizing compilers are truly amazing. They have tons and tons of tricks to make even crappy code run incredibly fast. Faster than most people could write by hand. This has led some people to claim that program optimization is something that can be left to compilers, as they seem to be a lot better at it. This usually sparks a reply from people on the other end of the spectrum who say that they can write faster code by hand than any compiler, which makes compilers mostly worthless when performance actually matters.

In a way both of these viewpoints are correct. In another way they are both wrong. To see how, let's split this issue into two parts.

A human being can write faster code than a compiler for any given program

This one is fairly easy to prove (semi)formally. Suppose you have a program P written in some programming language L that runs faster than any hand written version. A human being can look at the assembly output of that program and write an equivalent source version in straight C. Usually when doing this you find some specific optimization that you can add to make the hand written version faster.

Even if the compiler's output were proved optimal (such as in the case of superoptimization), it can still be matched by copying the output into your own program as inline assembly. Thus we have proven that for any program humans will always be faster.

A human being can not write faster code than a compiler for every program

Let's take something like Firefox. We know from the previous chapter that one could eschew complex compiler optimizations and rewrite it in straight C or equivalent and end up with better performance. The downside is that you would die of old age before the task would be finished.

Human beings have a limited shelf life. There are only so many times they can press a key on the keyboard until they expire. Rewriting Firefox to code that works faster with straight C than the current version with all optimizations enabled is just too big of a task.

Even if by some magic you could do this, during the rewrite the requirements on the browser would change. A lot. The end result would be useless until you add all the new functionality that was added since then. This would lead to eternal chasing of the tail lights of the existing project.

And even if you could do that, optimizing compilers keep getting better, so you'd need to go through your entire code base regularly and add the same optimizations by hand to keep up. All of these things could be done in theory, but they are completely impossible in practice.

The entire question is poorly posed

Asking whether compilers or humans write faster code is kind of like asking which one is "bluer", the sea or the sky. Sure you could spend years debating the issue on Twitter without getting anywhere, but it's not particularly interesting. A more productive way is to instead ask the question "given the requirements, skills and resources I have available, should I hand-optimize this particular routine or leave it to the compiler".

If you do this you rediscover the basic engineering adage: you get the best bang for the buck by relying on the compiler by default and doing things by hand only for bottlenecks that are discovered by profiling the running application.

PS. Unoptimized C is slow, too

Some people think that when they write C it is "very close" to the underlying assembly and thus does not benefit much from compiler optimizations. This has not been true for years (possibly decades). The performance difference between no optimization and -O2 can be massive, especially for hot inner loops.

When people say that they can write code that is faster than a compiler-optimized version of the same algorithm in some other language, that is not what they are actually saying, unless they are writing 100% pure ASM by hand [0]. Instead they are saying "I can take any algorithm implementation, write it with an alternative syntax and, when both of these are run through their respective optimizing compilers, end up with a program that runs faster".

[0] Which does happen sometimes, especially for SIMD code.