Sunday, April 24, 2016

Rewriting from scratch, should you do it?

One thing that has bothered me for a while is that the unzip command line tool only decompresses one file at a time. As a weekend project I wanted to see if I could make it work in parallel. One could write this functionality from scratch, but it also offered a chance to really look into incremental development.

All developers love writing stuff from scratch rather than fixing an existing solution (yes, guilty as charged). However, the accepted wisdom is that you should never do a from-scratch rewrite but should instead improve what you have incrementally. Thus I downloaded the sources of Info-Zip and got to work.

Info-Zip's code base is quite peculiar. It predates things such as ANSI C and has support for tons of crazy long dead hardware. MS-DOS ranks among the most recently added platforms. There is a lot of code for 16 bit processors, near and far pointers and all that fun stuff your grandad used to complain about. There are even completely bizarre things such as this snippet:

#ifndef const
#  define const const
#endif

The code base contains roughly 80 000 lines of K&R C. This should prove an interesting challenge. Those wanting to play along can get the code from Github.

Compiling the code turned out to be quite simple. There is no configure script or the like; everything is #ifdeffed inside the source files. You just compile the source files into an app and you have a working exe. The downside is that the source has more preprocessor code than actual code (only a slight exaggeration).

Originally the code used a single (huge) global struct that houses everything. At some point the developers needed to make the code reentrant. Usually this means changing every function to take the global state struct as an argument instead. These people chose not to do that. Instead they created a C preprocessor macro system that can be used either to pass the struct as an argument or to compile the code so that it still uses the old-style global struct. I have no idea why they did that. The only explanation that makes any sort of sense is that pushing the pointer onto the stack on every function call was too expensive on 16-bit and smaller platforms. This is just speculation, though; if anyone knows for sure, please let me know.

This meant that every single function definition was a weird concoction of preprocessor macros and K&R syntax. For details see this commit that eventually killed it.
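Roughly, the trick works along these lines (a simplified, purely illustrative sketch; the real Info-Zip macros have different names and are far more elaborate):

struct GlobalState { unsigned long csize; /* ...hundreds of fields... */ };

#ifdef REENTRANT
#  define STATE_DECL  struct GlobalState *G   /* parameter declaration */
#  define GS(field)   (G->field)
#else
static struct GlobalState G;                  /* The One Big Struct */
#  define STATE_DECL  void
#  define GS(field)   (G.field)
#endif

static unsigned long entry_size(STATE_DECL)
{
    return GS(csize);   /* expands to G->csize or G.csize */
}

Every function in the code base was declared and called through macros of this kind.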

Getting rid of all the cruft was not particularly difficult, only tedious. The original developers were very pedantic about flagging their #if/#endif pairs, so killing dead code was straightforward. The downside was that what remained was awful. The code had more asterisks than letters. A typical function was hundreds of lines long. Some preprocessor symbols were defined in opposite ways in different header files, but things worked because other preprocessor clauses kept all but one of the definitions from ever being evaluated (the code confused Eclipse's syntax highlighter, so it was genuinely hard to see what was actually going on).

Ten or so hours of solid work later most of the dead cruft was deleted and the code base had shrunk to 30 000 lines of code. At this point looking into adding threading was starting to become feasible. After going through the code that iterates the zip index and extracts files it became a lot less feasible. As an example, the inflate function was not isolated from the rest of the code. All its arguments were passed in The One Big Struct, which it fiddled with constantly. Those would need to be fully separated to make anything work.

That nagging sound in your ear

While fixing the code I kept hearing the call of the rewrite siren. Just rewrite from scratch, it would say. It's a lot less work. Go on! Just try it! You know you want to!

Eventually the lure got too strong, so I opened the Wikipedia page on the Zip file format. Three hours and 373 lines of C++ later I had a parallel unzipper written from scratch. Granted, it does not do advanced stuff like encryption, ZIP64 or creating subdirectories for the files it writes. But it works! The code is available in this repo.
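The core of a from-scratch unzipper is little more than walking the structures described on that page. A rough sketch of the first step, locating the End of Central Directory record that points at the file index, might look like this (illustrative code based on the published format, not the contents of the repo; it assumes a little-endian machine and ignores ZIP64):

#include <cstdint>
#include <cstring>
#include <vector>

struct EndRecord {
    uint16_t num_entries;   // total entries in the central directory
    uint32_t cd_size;       // size of the central directory in bytes
    uint32_t cd_offset;     // file offset of the central directory
};

// The record sits at the end of the file, possibly followed by a
// variable-length archive comment, so scan backwards for its signature.
bool find_end_record(const std::vector<uint8_t> &file, EndRecord &out) {
    const uint8_t sig[4] = {0x50, 0x4b, 0x05, 0x06};   // "PK\5\6"
    if (file.size() < 22)
        return false;
    for (size_t i = file.size() - 22; ; --i) {
        if (std::memcmp(&file[i], sig, 4) == 0) {
            std::memcpy(&out.num_entries, &file[i] + 10, 2);
            std::memcpy(&out.cd_size,     &file[i] + 12, 4);
            std::memcpy(&out.cd_offset,   &file[i] + 16, 4);
            return true;
        }
        if (i == 0)
            return false;
    }
}

From there the central directory gives each file's name, offsets and sizes, and the actual decompression is a call into zlib's inflate.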

Even better, adding multithreading took one commit with 22 additions and 7 deletions. The build definition is 10 lines of Meson instead of 1000+ lines of incomprehensible Make.
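The multithreading itself can be as simple as handing each central directory entry to its own task. A naive sketch (again illustrative rather than the actual commit; extract_one is a hypothetical helper that inflates a single entry to disk):

#include <string>
#include <thread>
#include <vector>

struct ZipEntry { std::string name; /* offsets and sizes from the central directory */ };

void extract_one(const ZipEntry &e);   // hypothetical single-entry extractor

void extract_all(const std::vector<ZipEntry> &entries) {
    std::vector<std::thread> workers;
    workers.reserve(entries.size());
    for (const auto &e : entries)
        workers.emplace_back([&e] { extract_one(e); });
    for (auto &t : workers)
        t.join();
}

Spawning one thread per entry is naive; a real implementation would cap the number of workers at something like std::thread::hardware_concurrency(). But the basic shape really is this small.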

There really is no reason, business or otherwise, to modernise the codebase of Info-Zip. With contemporary tools, libraries and methodologies you can create code that is an order of magnitude simpler, clearer, more maintainable and just all around more pleasant to work with than existing code. In a fraction of the time.

Sometimes rewriting from scratch is the correct thing to do.

This is the exact opposite of what I set out to prove but that's research for you.

Update

The program in question has now been expanded to do full Zip packing + unpacking. See here for benchmarks.

Tuesday, April 19, 2016

A look into Linux software deployment

One of the great virtues of Linux distributions is that they provide an easy way to install massive amounts of software. They also provide automatic updates, so the end user does not have to care. This mechanism is considered by many to be the best way to provide software for end users.

But is it really?

Discussions on this topic usually stir strong emotions, but let's try to examine this issue through practical use cases.

Suppose a developer releases a new game. He wants to have this game available for all Linux users immediately and, conversely, all Linux users want to play his game as soon as possible. Further, let's set a few more reasonable requirements for this use case:

  • The game package should be runnable immediately after the developer makes a release.
  • The end user should be able to install and play the game without requiring root privileges.
  • The end user should not be required to compile any source code.
  • The developer should be able to create only one binary package that will run on all distros.

These fairly reasonable requirements are easy to fulfill on all common platforms including Windows, Android, iOS and OSX. However, on Linux this is not possible.

The delay between an application's release and its availability in distros is usually huge. The shortest release cadence among the major distros is Ubuntu's, at once every six months. That means that, on average, it takes three months from an app's release to it being usable by end users. And this is assuming that the app is already in Debian/Ubuntu, gets packaged immediately and so on.

Some of you might be scoffing at this point. You might think that you just have to time your releases to land right before the distro freeze and that this is not a problem. But it is. Even if you time your releases to Ubuntu (as opposed to, say, Fedora), the final freeze of Ubuntu causes complications. During the freeze new packages are not allowed into the archive any more, and the freeze lasts a month or more. More importantly, forcing app developers to time their releases to distro cadences is not workable, because app developers have their own schedules, deadlines and priorities.

There are many other reasons why these sorts of delays are bad. Rather than go through them one by one, let's do a thought experiment instead. Suppose Steve Jobs were still alive and holding a press conference on the next release of iOS. He finishes it up with this:

- "The next version of iOS will be released January. All applications that want to be in the app store must be submitted in full before the end of December. If you miss the deadline the next chance to add an app to the store will be in two years. One more thing: once the release ships you can't push updates to your apps. Enjoy the rest of the conference."

Even Distortion Field Steve himself could not pull that one off. He would be laughed off the stage. Yet this is the way the Linux community expects application deployment to be done.

A closer look at the numbers

Let's do a different kind of a thought experiment. Suppose I have a magic wand. I wave it once, which causes every application that is in the Android and iOS app stores to become free software. I wave it a second time and every one of those apps grows a Linux port with full Debian packaging and is then submitted to Debian. How long will it take until they are all processed and available in the archive?

Go ahead and make a rough estimate. I'll wait.

The answer is obvious: it will never happen! There are two major reasons. First of all, the Debian project simply does not have enough resources to review tens of thousands of new packages. The second reason is that even if they did, no-one wants to review 20 different fart apps for DFSG violations.

Now you might say that this is a good thing and that nobody really needs 20 fart apps. This might be true but this limitation affects all other apps as well. If apps should be provided by distros then the people doing application review become a filter between application developers and end users. The only apps that have a chance of being accepted are those some reviewer has a personal interest in. All others get dropped to the sidelines. To an application developer this can be massively demoralizing. Why on earth should he fulfill the whims of a random third party just to be able to provide software to his end users?

There is also a wonderful analogy to censorship here, but I'll let you work that one out yourselves.

Javascript frameworks: same problem, different angle

Let's leave applications for a while and look at something completely different: web frameworks. There are several hundred of them and what most seem to have in common is that they don't come in distro packages. Instead you install them like this:

curl http://myawesomejsframework.com/installer.sh | sudo sh

This is, of course, horrible. Anyone with even rudimentary knowledge of operating systems, security or related fields will recoil in horror at the sight of unverified scripts being run with root privileges. And then they are told that this is actually going out of style and that you should instead pull the app as a ready-to-run Docker container. That you just run. On your production servers. Without any checksumming or verification.

This is how you drive people into depression and liver damage.

After the initial shock passes the obvious followup question is why. Why on earth would people do this instead of using proven good distro packages? Usually this is accompanied by several words that start with the letter f.

If you stop and think about the issue objectively, the reason to do this is obvious.

Distro packages are not solving the problems people have.

This is so simple and unambiguous that people have a hard time grasping it. So here it is again, but this time written in bold:

Distro packages are not solving the problems people have!

Typical web app deployment happens on Debian stable, which gets released once every few years. This is really nice and cool, but the problem is that the web moves a lot faster. Debian's package versions are frozen upon release, but deploying a web app on a year-old framework (let alone a two or three year old one) is not feasible. It is simply too old.

The follow-up to this is that you should instead "build your own deb packages for the framework". This does not work either. Newer versions of frameworks usually require newer versions of their dependencies, which might require newer versions of their dependencies, and so on. This is a lot of work, not to mention that the more system packages you replace, the bigger the risk of breaking some part of the base distro.

If you examine this problem with a wider perspective you find that these non-distro package ways of installing software exist for a reason. All the shell scripts that run as root, internal package managers, xdg-app, statically linked binaries, Docker images, Snappy packages, OSX app bundles, SteamOS games, Windows executables, Android and iOS apps and so on exist for only one reason:

They provide dependencies that are not available on the base system.

If the app does not provide the dependencies itself, it will not work.

But what does it really mean?

A wise man once said that all wisdom begins by acknowledging the facts. Let's go through what we have learned thus far.

Fact #1: There are two systems at play: the operating system and the apps on top of it. These have massively different change rates. Distros change every few years. Apps and web sites change daily. Those changes must be deployable to users immediately.

Fact #2: The requirements and dependencies of apps may conflict with the dependencies of the platform. Enforcing a single version of a dependency across the entire system is not possible.

If you take these facts and follow them to their logical conclusion, you find that putting distro core packages and third party applications in the same namespace, which is what mandating the use of distro packages for everything does, cannot work.

Instead what we need to accept is that any mechanism of app delivery must consist of two isolated parts. The first is the platform, which can be created with distro packages just like now. The second part is the app delivery on top. App and framework developers need to be able to provide self contained packages directly to end users. Requiring approval of any sort from an unrelated team of volunteers is a failure. This is basically what other operating systems and app stores have done for decades. This approach has its own share of problems, the biggest of which is apps embedding unsafe versions of libraries such as OpenSSL. This is a solvable problem, but since this blog posting is already getting to be a bit too long I refer you to the link at the end of this post for the technical details.

Switching away from the current Linux packaging model is a big change. Some of it has already started happening in the form of initiatives such as xdg-app, Snappy and Appimage. Like any change this is going to be painful for some, especially distributions whose role is going to diminish. The psychological change has already taken hold from web frameworks (some outright forbid distros from packaging them) to Steam games. Direct deployment seems to be where things are headed. When the world around you changes you can do one of two things.

Adapt or perish.

Closing words

If you are interested in the mechanics of building app packages and providing security and dependencies for them, watch this presentation I gave on the subject at LCA 2016.

Sunday, April 10, 2016

Testing performance of build optimisations

This blog post examines the performance effect of various compiler options. The actual test consists of compressing a 270 megabyte unpacked tar file with zlib. As training data for profile guided optimisation we used a different 5 megabyte tar file.

Each measurement was run five times and the fastest time was chosen. Both the zlib library and the code using it are compiled from scratch. For this we used the Wrap dependency system of the Meson build system. This should make the test portable to all platforms (we tested only Linux + GCC). The test code is available on Github.
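To make it concrete, the operation being timed is conceptually nothing more than the following (a simplified sketch rather than the actual benchmark code from the repo; error handling is omitted and it needs to be linked against zlib):

#include <chrono>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>
#include <zlib.h>

int main(int argc, char **argv) {
    if (argc < 2)
        return 1;
    // Read the whole tar file into memory so only compression is measured.
    std::ifstream in(argv[1], std::ios::binary);
    std::vector<unsigned char> src((std::istreambuf_iterator<char>(in)),
                                   std::istreambuf_iterator<char>());
    std::vector<unsigned char> dst(compressBound(src.size()));
    uLongf dst_len = dst.size();

    auto start = std::chrono::steady_clock::now();
    compress2(dst.data(), &dst_len, src.data(), src.size(),
              Z_DEFAULT_COMPRESSION);
    auto end = std::chrono::steady_clock::now();

    std::printf("%zu -> %lu bytes in %.3f s\n", src.size(),
                (unsigned long)dst_len,
                std::chrono::duration<double>(end - start).count());
    return 0;
}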

We used two different machines for testing. The first is a desktop machine with an Intel i7 running the latest Ubuntu and GCC 5.2.1. The other is a Raspberry Pi 2 with Raspbian and GCC 4.9.2. All tests were run with both basic optimization (-O2) and release optimization (-O3).

Here are the results for the desktop machine. Note that the vertical scale does not start at zero, to make the differences more visible. The first measurement uses a shared library for zlib; all the others use static linking.


Overall the differences are quite small: the gap between the fastest and slowest time is roughly 4%. Other things to note:

  • static libraries are noticeably faster than shared ones
  • -O2 and -O3 do not have a clear performance order, sometimes one is faster, sometimes the other
  • pgo seems to have a bigger performance impact than lto

The times for the Raspberry Pi look similar.



The performance difference between the fastest and slowest options is 5%, which is roughly the same as on the desktop. On this platform pgo also has a bigger impact than lto. However, here lto has a noticeable performance impact (though only about 1%), whereas on the desktop it was basically unmeasurable.

Sunday, April 3, 2016

Simple new features of the Meson build system

One of the main design goals of Meson has been to make the 95% use case as simple as possible. This means that rather than providing a framework that people can use to solve their problems themselves, we'd rather just provide the solution.

For a simple example, let's look at using the address sanitizer. On other build systems you usually need to copy a macro that someone else has written into your build definition, or write a toggle that adds the compiler flags manually. This is not a lot of work, but everyone needs to do it and these things add up. Different projects might also name the option differently, so when switching between projects you need to learn and remember what it is called in each.

At its core this really is a one-bit issue. Either you want to use the address sanitizer or you don't. With this in mind, Meson provides an interface that is just as simple. You can either enable the address sanitizer during build tree configuration:

meson -Db_sanitizer=address <source_dir> <build_dir> 

Or you can configure it afterwards with the configuration tool:

mesonconf -Db_sanitizer=address <build_dir>
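As a concrete example of what this buys you, a build configured this way will report bugs like the following out-of-bounds read at run time, with no further setup (a hypothetical demo program, obviously not part of Meson itself):

#include <vector>

int main() {
    std::vector<int> v(3);
    return v.data()[3];   // heap-buffer-overflow, caught by AddressSanitizer
}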

There are a bunch of other options that you can also enable with these commands (for the remainder of the document we only use the latter).

Link-time optimization:

mesonconf -Db_lto=true <build_dir>

Warnings as errors:

mesonconf -Dwerror=true <build_dir>

Coverage results:

mesonconf -Db_coverage=true <build_dir>

Profile guided optimization:

mesonconf -Db_pgo=generate (or =use) <build_dir>

Compiler warning levels (1 is -Wall, 3 is almost everything):

mesonconf -Dwarning_level=2 <build_dir>

The basic goal of this option system is clear: you, the user, should only describe what you want to happen. It is the headache of the build system to make it happen.