Wednesday, July 29, 2020

About that "Google always builds everything from source every time" thing

One of the big battles of dependencies is whether you should use prebuilt libraries provided by someone else (e.g. the Debian-style system packaging) or vendor the sources of all your deps and build everything yourself. Whenever this debate gets going, someone is going to do a "well, actually" and say some kind of variation of this:
Google vendors all dependencies and builds everything from source! Therefore that is clearly the right thing to do and we should do that also.
The obvious counterargument to this is the tried-and-true "if all your friends jumped off a bridge, would you do it too?" response known by every parent in the world. The second, much lesser known counterargument is that the statement is not actually true.

Google does not actually rebuild all code in their projects from source. Don't believe me? Here's exhibit A:


The original presentation video can be found here. Note that the slide and the video speak of multiple prebuilt dependencies, not just one [0]. Thus we find that even Google, with all of their power, money, superior engineering talent and insistence on rebuilding everything from source, do not rebuild everything from source. Instead they occasionally have to use prebuilt third party libraries just like the rest of the world. A more accurate form of the above statement would therefore be this:
Google vendors most dependencies and builds them from source when possible; when that is not possible they use prebuilt libraries, but try to keep quiet about it in public because it would undermine their publicly made assertion that everyone should always rebuild everything from source.
The solution to this is obvious: you rebuild everything you can from source and get the rest as prebuilt libraries. What's the big deal here? By itself it would not be a big deal, but this ideology has consequences. There are many tools and even programming languages designed nowadays that only handle the build-from-source case, because obviously everyone has the source code for all their deps. Unfortunately that is just not true. No matter how bad prebuilt no-access-to-source libraries are [1], they are also a fact of life and must be natively supported. Not doing so is a major adoption blocker. This is one of the unfortunate side effects of dealing with all the nastiness of the real world instead of a neat idealized sandbox.
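To make it concrete, this is roughly what consuming a binary-only dependency looks like at the lowest level. The library name and install path are invented for this example, but this is the pattern a build tool has to be able to express natively:

    # Hypothetical closed-source library shipped only as a header plus a .so
    # under /opt/vendorlib; there is no source to rebuild it from.
    cc -o app app.c \
        -I/opt/vendorlib/include \
        -L/opt/vendorlib/lib -lvendorblob \
        -Wl,-rpath,/opt/vendorlib/lib    # so the runtime linker finds it too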

[0] This presentation is a few years old. It is unknown whether there are still prebuilt third party libraries in use.

[1] Usually they are pretty bad.

Sunday, July 26, 2020

Pinebook Pro longer term usage report

I bought a Pinebook Pro in the first batch, and have been using it on and off for several months now. Some people I know wanted to know if it is usable as a daily main laptop.

Sadly, it is not. Not for me at least. It is fairly close though.

Random bunch of things that are annoying or broken

I originally wanted to use stock Debian, but at some point the Panfrost driver broke and the laptop could no longer start X. Eventually I gave up and switched to the default Manjaro. Its installer does not support an encrypted root file system, and a laptop without an encrypted disk is not really usable as a laptop, since you can't take it out of your house.

The biggest gripe is that everything feels sluggish. Alt-tabbing between Firefox and a terminal takes one second, as does switching between Firefox tabs. As an extreme example, switching between channels in Slack takes five to ten seconds, which is unbearably slow. The wifi is not very good either: it can't connect reliably to an access point in the next room (a distance of about 5 meters). The wifi behaviour seems to be distro dependent, so maybe there are some knobs to twiddle.

Video playback in browsers is not great. YouTube works at the default size, but going fullscreen causes a massive frame rate drop. Fullscreen video playback in e.g. VLC is smooth.

Basic shell operations are sluggish too. I have a ZSH prompt that shows the Git status of the current directory, and entering a directory that contains a Git repo freezes the terminal for several seconds. Basically any operation that needs to read something from disk that is not already in the cache leads to a noticeable delay.
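For reference, a prompt of this kind is typically built on zsh's vcs_info helper, roughly like the sketch below (the actual configuration is not shown here, so this is just the general shape). Every new prompt shells out to Git, which is why entering a repo with a cold disk cache causes a visible pause:

    # Minimal git-aware zsh prompt using the builtin vcs_info helper.
    autoload -Uz vcs_info
    precmd() { vcs_info; }                     # runs before every prompt
    zstyle ':vcs_info:git:*' formats '(%b)'    # show the current branch
    setopt PROMPT_SUBST
    PROMPT='%~ ${vcs_info_msg_0_} %# '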

The screen size and resolution scream for fractional scaling, but Manjaro does not seem to provide it. A scale of 1 is a bit too small and 2 is way too big. The screen is matte, which is totally awesome, but unfortunately the colors are a bit muted and for some reason it seems a bit fuzzy. This may be because I have not used a sub-retina laptop display in years.

The trackpad's motion detection is rubbish at slow speeds. There is a firmware update that makes it better, but it's still not great. According to the forums someone has already reverse engineered the trackpad and created an unofficial firmware that is better still; I have not tried it. Manjaro does not provide a way to disable tap-to-click (a.k.a. the stupidest UI misfeature ever invented, including the emojibar), which is maddening. This is not a hardware issue, though, as e.g. Debian's Gnome does provide this functionality. The keyboard is okayish, but sometimes registers keypresses twice, which is also annoying.

For light development work the setup is almost usable. I wrote a simple 3D model viewer app using Qt Creator and it was surprisingly smooth all round: the 3D drivers worked reliably and so on. Unfortunately invoking the compiler was again sluggish (this was C++, though, so some slowness is expected). Even simple files that compile instantly on x86_64 took seconds to build.

Can the issues be fixed?

It's hard to say. The Panfrost driver is under heavy development, so it will probably keep getting better. That should fix at least the video playback issues. Many of the remaining issues seem to be on the CPU and disk side, though. It is unknown whether there are enough optimization gains to be had to make the experience fully smooth and, more importantly, whether there are people doing that work. It seems feasible that the next generation of hardware will be fast enough for daily usage.

Bonus content: compiling Clang

Just to see what would happen, I tested whether it is possible to compile Clang from source (it being the heaviest fairly-easy-to-build program that I know of). It turns out that you can. Here are the steps for those who want to try it themselves:
  • Check out the Clang sources
  • Create an 8 GB swap file and enable it
  • Configure Clang, add -fuse-ld=gold to the linker flags (according to the Clang docs there should be a builtin option for this, but in reality there isn't) and set the maximum number of parallel link jobs to 1
  • Start compilation with ninja -j 4 (any more and the compilation jobs cause a swap storm)
  • If one of the linker jobs causes a swap storm, kill the processes and build the problematic library by hand with ninja bin/libheavy.so
  • Start parallel compilation again and if it freezes, repeat as above
After about 7-8 hours you should be done.
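For those who want a more concrete starting point, the steps above look roughly like the following shell sketch. The repository URL and CMake options are the standard upstream ones rather than anything taken from the list above, so treat this as a sketch and adjust it for your LLVM version:

    # Check out the sources (a shallow clone saves disk space and time)
    git clone --depth 1 https://github.com/llvm/llvm-project.git
    cd llvm-project

    # Create and enable an 8 GB swap file
    sudo fallocate -l 8G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile

    # Configure: link with gold and allow only one link job at a time
    mkdir build && cd build
    cmake -G Ninja ../llvm \
        -DCMAKE_BUILD_TYPE=Release \
        -DLLVM_ENABLE_PROJECTS=clang \
        -DLLVM_PARALLEL_LINK_JOBS=1 \
        -DCMAKE_EXE_LINKER_FLAGS=-fuse-ld=gold \
        -DCMAKE_SHARED_LINKER_FLAGS=-fuse-ld=gold

    # Build with limited parallelism; if a link step starts a swap storm,
    # kill it, build that one target by hand and then resume.
    ninja -j 4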

Sunday, July 19, 2020

The ABI stability matryoshka

In the C++ on Sea conference last week Herb Sutter gave a talk about replacing an established thingy with a new version. Obviously the case of ABI stability came up, and he answered with the following (the video is not available, so this quote is only approximate, though an earlier version of the talk can be viewed here):
Backwards compatibility is important so that old code can keep working. When upgrading to a new system it would be great if you could voluntarily opt into using the old ABI. So far no-one has managed to do this but if we could crack this particular nut, making major ABI changes would become a lot easier.
Let's try to do exactly that. We'll start with a second (unattributed) quote that often gets thrown around in ABI stability discussions:
Programming language specifications do not even acknowledge the existence of an ABI. It is wholly a distro/tool vendor problem and they should be the ones to solve it.
Going from this we can work out the actual underlying problem, which is running programs built against two different ABI versions at the same time on the same OS. The simple solution of rebuilding the world from scratch does not work. It could be done for the base platform, but due to business and other reasons you can't force a rebuild of all user applications (and those users, lest we forget, pay a very hefty amount of money to OS vendors for the platform their apps run on). Mixing new and old ABI apps is fragile and might fail for the weirdest of reasons no matter how careful you are. The problem is even more difficult in "rolling release" cases such as Debian unstable, where you can't easily rebuild the entire world in one go, but we'll ignore that case for now.

It turns out that a solution for doing exactly this already exists: Flatpak. Its entire reason for existence is to run binaries with a different ABI (and even API) on a given Linux platform while making it appear as if they were running on the actual host. There are other ways of achieving the same, such as Docker or systemd-nspawn, but they aim to isolate the two environments from each other rather than unifying them. Thus a potential solution to the problem is that whenever an OS breaks ABI compatibility in a major way (which should be rare, like once every few years) it should provide the old ABI version of itself as a Flatpak and run legacy applications that way. In box diagram architecture format it would look like this:


The main downside of this is that the OS vendor's QA department has twice as much work, as they need to validate both ABI versions of the product. There is also probably a fair bit of work to make the two versions work together seamlessly, but once you have that you can do all sorts of cool things, such as building the outer version with libstdc++'s debug mode enabled. Normally you can't do that easily as it massively breaks ABI, but now it is easy. You can also build the host with address or memory sanitizer enabled for extra security (or just debugging).
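As a sketch of how this mechanism already works for applications today, a legacy program can pin an older Flatpak runtime and keep running against the old ABI regardless of what the host has moved to. The runtime branch and application ID below are illustrative only:

    # Install an older runtime that still carries the old ABI
    flatpak install flathub org.freedesktop.Platform//19.08

    # The legacy app was built against that runtime and keeps using it,
    # no matter what the host OS itself ships. (App ID is illustrative.)
    flatpak install flathub org.example.LegacyApp
    flatpak run org.example.LegacyApp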

If you add something like btrfs subvolumes and snapshotting, you can do all sorts of cool things. Suppose you have a very simple system with a web server and a custom backend application that you want to upgrade to the new ABI version. It could go something like this (a command-level sketch follows the list):

  1. Create a new btrfs subvolume, install the new version into it and set up the current install as the inner "Flatpak" host.
  2. Copy all core system settings to the outer install.
  3. Switch the main subvolume to the new install, reboot.
  4. Now the new ABI environment is running and usable but all apps still run inside the old version.
  5. Copy the web server configuration to the outer OS and disable the inner one. This is easy because all the system software has the exact same version in both OS installs. Reboot.
  6. Port the business app to run on the new ABI version. Move the stored data and configuration to the outer version. The easiest way to do this is to have all this data on its own btrfs subvolume which is easy to switch over.
  7. Reboot. Done. Now your app has been migrated incrementally to the new ABI without intermediate breakage (modulo bugs).
The best part is that if you won't or can't upgrade your app to the new ABI, you can stop at step #5 and keep running the old ABI code until the whole OS goes out of support. The earlier ABI install will remain as is, can be updated with new RPMs and so on. Crucially this will not block others from switching to the new ABI at their leisure. Which is exactly what everyone wanted to achieve in the first place.
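A rough command-level sketch of the first few steps with plain btrfs tooling might look like this. The subvolume names and layout are invented for the example and the details will obviously differ per distro:

    # Step 1: snapshot the current install so it can keep running as the
    # inner legacy host, and create a fresh subvolume for the new ABI version.
    btrfs subvolume snapshot / /old-abi-root
    btrfs subvolume create /new-abi-root
    # ... install the new OS version into /new-abi-root ...

    # Step 3: make the new install the default root for the next boot
    # (or point the bootloader at it) and reboot into the new ABI world.
    btrfs subvolume set-default /new-abi-root
    reboot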

Tuesday, July 7, 2020

What if? Revision control systems did not have merge

A fun design exercise is to take an established system or process and introduce some major change into it, such as adding a completely new constraint. Then take this new state of things, run with it and see what happens. In this case let's see how one might design a revision control system where merging is prohibited. Or, formulated in a slightly different way:
What if merging is to revision control systems as multiple inheritance is to software design?

What is merging used for?

First we need to understand what merging is used for, so that we can develop some sort of a system that achieves the same results via some other mechanism. There are many reasons to use merges, but the most popular ones include the following.

An isolated workspace for big changes

Most changes are simple and consist of only one commit. Sometimes, however, it is necessary to make big changes with intermediate steps, such as doing major refactoring operations. These are almost always done in a branch and then brought into trunk. This is especially convenient if multiple people work on the change.

Trunk commits are always clean

Bringing big changes in via merges means that trunk is always clean and buildable. More importantly, bisection works reliably since all commits in trunk are known to be good. This is typically enforced via a gating CI. Merging also allows big changes to have intermediate steps that are useful but broken in some way, so that they would not pass CI on their own. This is not common, but happens often enough to be useful.

An alternative to merging is squashing the branch into a single commit. This is suboptimal as it destroys information, breaking for example git blame style functionality, because all the changes made point to a single commit made by a single person (or possibly a bot).
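The difference is easy to demonstrate with plain Git; the branch and file names here are only for illustration:

    # A real merge keeps the individual commits, so blame still points
    # at the people who actually wrote each line.
    git merge --no-ff refactor-branch
    git blame src/foo.c

    # A squash collapses the whole branch into one new commit, so every
    # changed line now points at that single commit and its single author.
    git merge --squash refactor-branch
    git commit -m "Big refactoring"
    git blame src/foo.c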

Fix tracking

There are several systems that automatically track which releases a bug fix has reached. The way this is done is that a fix is written in its own branch. The bug tracking system can then easily see when the fix gets into the various release branches by checking when the bugfix branch has been merged into them.
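With Git that query boils down to a --contains lookup on the tip of the fix branch; the branch names here are illustrative:

    # Which release branches already contain the fix branch's commits,
    # i.e. which releases has the fix been merged into?
    git branch -r --contains bugfix/issue-1234
    git tag --contains bugfix/issue-1234    # the same works for release tags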

A more linear design

In practice many (possibly even most) projects already behave like this. They keep their histories linear by rebasing, squashing and cherry picking, never merging. This works, but has the downsides mentioned above. If one spends some time thinking about this problem, the fundamental disconnect becomes fairly clear. A "linear" revision control system has only one type of change, the commit, whereas "real world" workflows have two different types: logical changes and the individual commits that make up a logical change. This structure is implicit in the graph of merge-based systems, but what if we made it explicit? Thus if we have a commit graph that looks like this:



the linear version could look like this:


The two commits from the right branch have become one logical commit in the flat version. If the revision control system had a native understanding of these kinds of physical and logical commits, all the problematic cases listed above could be made to work transparently. For example, bisection would work by treating each logical commit as a single change. Only after it has proven that the error occurred inside a particular logical commit would bisection look inside it.

This, by itself, does not solve fix tracking. As there are no merges, you can't know which branches have which fixes. This can be solved by giving each change (both physical and logical) a logical ID which remains the same over rebase and edit operations, as opposed to the checksum-based commit ID which changes every time the commit is edited. This turns the tracking question from "which release branches have merged this bugfix branch" into "which release branches have a commit with this given logical ID", which is a fairly simple problem to solve.
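A low-tech approximation of such a logical ID can already be built with commit message trailers, which survive rebases because they live in the message rather than in the checksum (Gerrit's Change-Id works on the same principle). The trailer name, ID format and branch layout below are invented for the example:

    # Record a stable logical ID as a trailer in the commit message.
    git commit -m 'Fix crash in frobnicator' -m 'Logical-Id: I7f3a2b9c'

    # "Which release branches have a commit with this logical ID?"
    for b in $(git for-each-ref --format='%(refname:short)' refs/remotes/origin/release); do
        git log --oneline --grep='Logical-Id: I7f3a2b9c' "$b" | grep -q . \
            && echo "fix is in $b"
    done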

This approach is not new. LibreOffice has tooling on top of Git that does roughly the same thing as discussed here. It is implemented as freeform text in commit messages with all the advantages and disadvantages that brings.

One obvious question that comes up is whether you could have logical commits inside logical commits. This seems like an obvious can of worms. On one hand it would be mathematically symmetrical and all that, but on the other hand it has the potential to devolve into full Inception, which you usually want to avoid. You'd probably want to start by prohibiting that and potentially permit it later once you have more usage experience and user feedback.

Could this actually work?

Maybe. But the real question is probably "could a system like this replace Git", because that is what people are using. This is trickier. A key question would be whether you can automatically convert existing Git repos to the new format with no or minimal loss of history. Simple merges could perhaps be converted in this way, but in practice things are a lot more difficult due to things like octopus merges. If the conversion cannot be done, then the expected market share is roughly 0%.

Wednesday, July 1, 2020

What is best in open source projects?

Open source project maintainers have a reputation for being grumpy and somewhat rude at times. This is not unexpected, as managing an open source project can be a tiring experience. This can lead to exhaustion and thus to sometimes being a bit too blunt.

But let's not talk about that now.

Instead, let's talk about the best of times, the positive outcomes, the things that really make you happy to be running an open source project. Patches, both bug fixes and new features, are like this. So is learning about all the places people are using your project. Even better if they are using it in ways you could not even imagine when you started. All of these things are great, but they are not the best.

The greatest thing is when people you have never met or even heard of before come to your project and then on their own initiative take on leadership in some subsection in the project.

The obvious way to do this is writing code, but it also covers things like running web sites, proofreading documentation, wrangling with CI, and even helping other projects start using your project. At this point I'd like to personally list all the people who have contributed to Meson in this way, but it would not be fair as I'd probably miss some names. More importantly, this is not really limited to any single project. Thus, I'd like to send out the following message to everyone who has ever taken ownership of any part of an open source project:

Keep on rocking. You people are awesome!