Wednesday, February 6, 2019

On possible futures of Meson

At FOSDEM I talked to a bunch of people about an issue that has been brought up a couple of times recently, specifically that of integrating Rust in existing code bases. Some projects are looking into converting parts (or presumably eventually everything) of their code to Rust. A requirement of this is that for some time there need to be both Rust and C or whatever language within one project at the same time. (The rest of this blog post will use Rust as an example, but the same issues are present in all modern programming languages that have the same build system/dependency setup. In practice this means almost all of them.)

This would not be such a problem except that Rust by itself has pretty much nothing in the standard library and you need to get many crates via Cargo for even fairly simple programs. Several people do not seem particularly thrilled about this for obvious reasons, but have given up on this battle because "in practice it's impossible to develop in Rust without using Cargo" or words to that effect. As the maintainer of Meson, they obviously come to me with the integration problem. Meson does support compiling Rust directly, but it does not go through Cargo.

This is where I'm told to "just call Cargo" instead. There are two major problems with this. The first one is technical and has to do with the fact that having two different build systems and dependency managers in one build directory does not really work. We're not going to talk about this issue in this blog post, interested people can find writings about this issue using their favorite bingoogle. The second issue is non-technical, and the more serious one.

This approach means the death of Meson as a project

If we assume (as seems to be the case currently) that new programming languages such as Rust take over existing languages and that they all have their own build system, then Meson becomes unnecessary. Having a build system whose only task is to call a different build system is useless. It is nothing but bloat that should be deleted. As the person whose (unpaid) job it is to care about the long term viability of Meson, this does not make me feel particularly good.

So what might the future might in hold? Let's look at some alternatives.

New languages take over and keep their own build systems

New things take over, old code gets rewritten or thrown away. Each language keeps living in its own silo. Language X is better than language Y battles become the new Emacs vs vi battles. Meson becomes obsolete and gets shipped to the farm where it joins LISP machines, BeOS and Amigas as technology that was once interesting and nice but then was forgotten.

This is where things are currently heading. It is also the most likely outcome.

New languages adopt Meson as their build system

Rather than building and maintaining their own build systems, new languages switch to using Meson so everyone has a shared system and things work together. This will not happen. The previous statement is so important that I'll write it here for a second time, but this time in bold and on its own line:

This will not happen!

Every new language wants to provide a full toolchain and developer experience on their own. This is actually a good choice because it means that they can provide a full dev experience by themselves without depending on any external tooling (especially one that is written in a different programming language). Being able to e.g. just install one MSI package on Windows and have the full dev experience there is a really nice thing to have. Because of this no new programming language will accept an external build tool. No way. No how.

Meson becomes a glue layer to combine programming languages

A common suggested solution is that language native build systems export their state somehow so that external tools can drive them. Meson would then take these individual projects and make them work together. On the face of it this seems simple and workable and allows mixing languages freely. But if you think about it more, this is not a great place to be in. Basically it means that your project's purpose changes from being a build system to being responsible of all project integration but who does not have any say on how builds happen. There is also no leverage on the actual toolchain developers.

A condensed description of this task would be integration hell as a service. It is unlikely that anyone would want to spend their free time doing that. I sure as hell don't. I deal with enough of that stuff at work already.

So is everything terrible and all hope lost forever?

Maybe.

But then again maybe not.

Meson the language is not Meson the implementation

Meson was designed from the beginning to be programming language independent. This means both the languages it supports and the language it is implemented in. It does not leak the fact that it's implemented in Python to the build definition language. The core Meson functionality is only a few thousand lines of Python, reimplementing it is not a massive task. There is nothing wrong with having multiple implementations. In fact it is encouraged. Feel like rewriting it in Rust? Go for it (pun not intended).

In this approach new programming languages would get to keep their own toolchains, installs and build systems. The only thing that would change is the build description language. The transition would take a fair bit of work to be sure, but the benefit is that all that code becomes easily integratable with other languages. The reference implementation of language X could even choose to only support its own language and hand multi-language projects off to the reference implementation.

There are other advantages as well. For example currently (or the last time I looked, which admittably was some time ago) Cargo farms almost all non-standard work to build.rs, leading to wheel reinvention. Meson has native support for most of the things a project might need (source code generation etc). Thus each language would not need to design their own DSL for describing these, but could instead take Meson's and implement it in their own tools. Meson also has a lot of functionality for things like creating Python modules, which you would also get for free.

But the main advantage of all of this would be cooperation. You could mix and match languages with ease (assuming a working API/ABI, but that is outside the build system's purview) and switch between different languages without needing to learn yet another build definition language and toolchain behaviour.

The main downside of this approach is that it can not be injected from "the outside". It would only work if there are people inside the language communities that would consider this a worthwhile thing to do and would drive the change from within. Even for them it would probably be a long battle, because these sorts of changes have traditionally met with a lot of resistance.

How will all this affect Meson development?

In the short term probably not in any noticeable way.

And in the long term?


Saturday, February 2, 2019

Meson's new logo: a design process

At FOSDEM, quite literally only a few moments ago, we published the new Meson logo.



The design process was fairly long and somewhat arduous. This blog post aims to provide the rationale for all aspects of the design process.

The shape

From the very beginning it was decided that the logo's shape should be based on the Reuleaux triangle. The overall shape is smooth and pleasant, yet mathematically precise. In spite of its simplicity it can be used for surprisingly complex things. Perhaps the best known example is that if you have a drill bit shaped like a Reuleaux triangle, it can be used to create a rectangular hole. Those who have knowledge about compiler toolchains know that these sort of gymnastics are exactly what a build system needs to do. 

There were tens of different designs that tried to turn this basic idea into a logo. In practice it meant sitting in front of Inkscape and trying different things. These attempts included such things as combining the Reuleaux triangle with other shapes such as circles and triangles, as well as various attempts to build a stylished "M" letter. None of them really clicked until one day I put a smaller triangle upside down inside the other. Something about that shape was immediately captivating.



This shape was not directly usable by itself because it looked a lot like the Sierpinski triangle, which you may know from many corporate logos and other media. People also felt that this shape had an eye or a face that was looking at them, which was distracting. The final breakthrough came when even smaller versions of the triangle were added to the corners of the smaller triangle to produce a roughly propeller-like shape. 

The colour

Meson's original logo was green as it is the colour of growth and nature. This is a common choice for tooling. As an example, Bazel's original logo was a stylised green leaf. It is easy to see why green does not work with the chosen shape:



Legend of Zelda alliterations aside, some other color was needed. I was reading an article about logo colors which mentioned that for some reason purple logos are not as common as one would imagine. Upon reading that sentence the choice was done. After all if it is good enough for Taco Bell and Barbie, it is good enough for us.

Choosing the actual shade of purple was a tougher problem. We wanted a deep shade that is closer to blue than red. A bunch of back and forth got us to the current selection, PANTONE 2105C. This decision might not be final, though. The shade is slightly off from what I considered the perfect one. It would also be nice to have a free-as-in-freedom way of defining the logo colour rather than using one from a proprietary vendor.

The font

Technical logos usually use sans serif fonts, possibly with a geometrical shape to evoke stability and rigidity. This was the original plan for Meson's logo as well but eventually it seemed like those fonts were fairly heavy. We wanted the text to be light to symbolize the lean internals of Meson and its low resource usage. It also turned out that a light text nicely balances the slightly bulky logo shape. Going through various font and design sites I eventually found myself looking at this picture.


This is the inscription in Trajan's Column as created by ancient Romans. The typeface was exactly as needed, light but a definite sense of gravitas. The basic letterforms were traced and after several rounds of artisanal hand-crafted vector object tuning the final text shape was finished. For comparison we also tested Computer Modern as the logo text. It got a very sharp 50/50 split of people either hating it or loving it.

Licenses

The logo is not licensed under Apache as the rest of Meson is. It turns out that there is a license hole in the FOSS world for logos. It would be great if someone could come up with a simple and clear cut "you can use the logo to refer to the project but not claim endorsement of the project or use the logo on any merchandising" license. Many projects, such as GNOME, have their own license for this but copying random license legalese always feels a bit scary.

Final words

If you spend long enough working on any abstract geometric shape, you are going to think of some real world object that resembles it. That's just how the human brain works. This the image my subconscious brought forth.


I swear that this was not an intentional design point.

Friday, November 16, 2018

The performance impact of zeroing raw memory

When you create a new variable (in C, C++ and other languages) or allocate a block of memory the value is undefined. That is, whatever bit pattern happened to be in the raw memory location at the time. This is faster than initialising all memory (which languages such as Java do) but it is also unsafe and can lead to bugs, such as use-after-free issues.

There have been several attempts to change this behaviour and require that compilers would initialize all memory to a known value, usually zero. This is always rejected with a statement like "that would cause a performance degradation fo unknown size" and the issue is dropped. This is not very scientific so let's see if we could get at least some sort of a measurement for this.

The method

The overhead for uninitialized variables is actually fairly difficult to measure. Compilers don't provide a flag to initialize all variables to zero. Thus measuring this would require compiler hacking, which is a ton of work. An alternative would be to write a clang-tidy plugin and add a default initialization to zero for all variables that don't have a initialization clause already. This is also fairly involved, so let's not do this.

The impact of dynamic memory turns out to be fairly straightforward to measure. All we need to do is to build a shared library with custom overrides for malloc, free and memalign, and LD_PRELOAD it to any process we want to measure. The sample code can be found in this Github repo.

Measurements

We did two measurements. The first one was running Python's pystone benchmark. There was no noticeable difference between zero initialization and no initialization.

The second measurement consisted of compiling a simple C++ iostream helloworld application with optimizations enabled. The results for this experiment were a lot more interesting. Zeroing all memory on malloc made the program 2% slower. Zeroing the memory on both allocation and free (to catch use-after-free bugs) made the program 3.6% slower.

A memory zeroing implementation inside malloc would probably have a smaller overhead, because there are cases where you don't need to explicitly overwrite the memory, for example when the allocation is done behind the scenes via mmap/munmap.

Monday, November 12, 2018

Compile any C++ program 10× faster with this one weird trick!

tl/dr: Is it unity builds? Yes.

I would like to know more!

At work I have to compile a large code base from scratch fairly often. One of the components it has is a 3D graphics library. It takes around 2 minutes 15 seconds to compile using an 8 core i7. After a while I got bored with this and converted the system to use a unity build. In all simplicity what that means is that if you have a target consisting of files foo.cpp, bar.cpp, baz.cpp etc you create a cpp file with the following contents:

#include<foo.cpp>
#include<bar.cpp>
#include<baz.cpp>

Then you would tell the build system to build that instead of the individual files. With this method the compile time dropped down to 1m 50s which does not seem like that much of a gain but the compilation used only one CPU core. The remaining 7 are free for other work. If the project had 8 targets of roughly the same size, building them incrementally would take 18 minutes. With unity builds they would take the exact same 1m 50s assuming perfect parallelisation, which happens fairly often in practice.

Wait, what? How is this even?

The main reason that C++ compiles slowly has to do with headers. Merely including a few headers in the standard library brings in tens or hundreds of thousands of lines of code that must be parsed, verified, converted to an AST and codegenerated in every translation unit. This is extremely wasteful especially given that most of that work is not used but is instead thrown away.

With an Unity build every #include is processed only once regardless of how many times it is used in the component source files.

Basically this amounts to a caching problem, which is one of the two really hard problems in computer science in addition to naming things and off by one errors.

Why is this not used by everybody then?

There are several downsides and problems. You can't take any old codebase and compile it as a unity build. The first blocker is that things inside source files leak into other ones since they are all textually included one after the other.. For example if you have two files and each of them declares a static function with the same name, it will lead to name clashes and a compilation failure. Similarly things like using namespace std declarations leak from one file to another causing havoc.

But perhaps the biggest problem is that every recompilation takes the same time. An incremental rebuild where one file has changed takes a few seconds or so whereas a unity builds takes the full 1m 50s every time. This is a major roadblock to iterative development and the main reason unity builds are not widely used.

A possible workflow with Meson

For simplicity let's assume that we have a project that builds and works with unity builds. Meson has an automatic unity build file generator that can be enabled by setting the value of the unity build option.

This solves the basic build problem but not the incremental one. However usually you'd develop only one target (be it a library, executable or module) and want to build only that one incrementally and everything else as a unity build. This can be done by editing the build definition of the target in question and adding an override option:

executable(..., override_options : ['unity=false'])

Once you are done you can remove the override from the build file to return everything back to normal.

How does this tie in with C++ modules?

Directly? Not in any way really. However one of the stated advantages of modules has always been faster build times. There are a few module implementations but there is very little public data on how they behave with real world codebases. During a CppCon presentation on modules Google's Chandler Carruth mentioned that in Google's code base modules resulted in 30% build time reduction.

It was not mentioned whether Google uses unity builds internally but they almost certainly don't (based on things such as this bug report on Bazel). If we assume that theirs is the fastest existing "classical" C++ build mechanism, which it probably is, the conclusion is that it is an order of magnitude slower than a unity build on the same source files. A similar performance gap would probably not be tolerated in any other part of the C++ ecosystem.

The shoemaker's children go barefoot.

Tuesday, November 6, 2018

Simple guide to designing pleasant web sites

When is it ok to...

Use infinite scrolling pagesnever
Steal the "/" key for your search widgetnever
Have a "website app" instead of a good web sitenever
Break functionality on the site to drive app usagenever
Autoplay audionever
Use 500 megs of ram for a web IRC (or equivalent)never
Use 100% CPU for animations you can't disablenever
Use 100% CPU for animations you can disablenever
Provide important information only as PDFnever
Have layouts with more pixels for chrome than contentnever
Run a news aggregator site that opens all links in new windowsnever
Block main page from showing until all ad trackers are loadednever
Autoplay video clipsonly if you are Vimeo, Youtube or a similar site

Saturday, November 3, 2018

Some use cases for shared linking and ABI stability

A recent trend in language design and devops deployment has been to not use shared libraries. Instead every application is rebuilt and statically linked for maximum performance. This is highly convenient in many cases. Some people even go as far as to declare shared linking, and with it any ABI stability, a dead relic of the past that is only unnecessary but actively harmful because maintaining ABI stability slows down language changes and renewal.

This blog post was not written to argue whether this is true or not. Instead it is meant to list many reasons and use cases where shared libraries and ABI stability are useful and which would be hard, or even impossible, to achieve by relying only on static linking.

Many of the issues listed here are written from the perspective of a modern Linux distribution, especially Debian. However I am not a Debian developer so the following is not any sort of an official statement, just my writings as an individual.

Guaranteed update propagation

Debian consists of thousands of packages. Each package's state is managed by a package maintainer. Each manager typically maintains between one and a handful of packages, so there are hundreds of them. Each one of them works in relative isolation from others. That is, they can upload updates to packages at their own pace. In fact, it is an important part of Debian's social structure that no-one can be forced to do any particular task.

On the other hand, Debian is also very strict about security. If a vulnerability is found in, say, a popular encryption library then it must be possible for one single person to update the encryption code in every single package that uses it, even indirectly. With a stable ABI and shared libraries, this can be done easily. Updating the dependency package (and possibly rebooting the machine) guarantees that every package on the system uses the new library. If packages were statically linked, each package would have to be rebuilt and reuploaded. This would require hundreds of people around the world to work in a coordinated fashion. In a volunteer based system this is not possible, especially for cases that require an embargo.

Update server bandwidth savings

The amount of bandwidth it takes to run a Linux distribution mirror is substantive. As we saw above, it is possible to update single packages which make downloads fairly small. If everything was statically linked then every library update would mean downloading the full rebuilt binaries of every affected package. This means a 10x to 100x increase in bandwidth requirements. Distro mirrors are already quite heavily loaded and probably could not handle this sort of increase in traffic.

Download bandwidth savings

Most of the population in the world does not have a direct 10GB Ethernet connection for their personal use. In fact there are many people who only have 2G connection at best and even that is sporadic. There are also many servers that have very poor Internet connections, such as scientific instruments and credit card payment terminals in remote cities. Getting updates to these machines is difficult even now. If update sizes ballooned in size, it might become completely infeasible.

Shipping prebuilt middleware

There are many providers of middleware (such as in computer games) that will only provide their code as prebuilt libraries (usually shared, because they are harder to reverse engineer). They will not and can not ever ship their source code to customers because that contains all their special sauce. This entire business model relies on a stable ABI.

Software certification

I don't know have personal experience about this so the following entry might be completely false. However it is based on best effort information I had. If you have first hand experience and can either confirm or deny this, please post a comment to this article.

In highly regulated business sectors the problem of certification often comes up. Basically what this means is that each executable is put through extensive testing cycle. If it passes then it is certified and can be used in production. Specifically, only that exact binary can be used. Any changes to the code means that the program must be re-certified. This is a time consuming and extremely expensive process.

It may be that the certification cycle is different for the operating system component. Thus applying OS updates provided by the vendor may be faster and cheaper. As long as they maintain ABI stability, the actual program does not need to be changed removing the need to re-certify it.

Extension modules

Suppose you create a program that provides an extension or plugin interface to third party code. Examples include the modding interface of many games and, as an extreme example, the entire Eclipse IDE. Supporting this without needing to provide third party extensions as source (and shipping a compiler with your program) requires a stable ABI.

Low barrier to entry

One of the main downsides of rebuilding everything from source all the time is the amount of resources it takes. For many this is not a problem and when asked about it may even snootily reply with "just buy more machines from AWS".

One of the strong motivations of the free and open source movement has been enablement and empowering. That is, making it as easy as possible for as many people as possible to participate. There are many people in the world whose only computer is an old laptop or possibly even just a Raspberry Pi. In the current model it is possible for take any part of the system and hack on it in isolation (except maybe something like Chromium). If we go to a future where participating in software development requires access to a data center, these people are prevented from contributing.

Supporting slow platforms

One of the main philosophical points of Debian is that every supported architecture must be self hosting. That is, packages for Arm must be built on Arm, Mips packages must be built on Mips and so on. Self hosting is an important goal, because it proves the system works and is self-sustaining in ways that simply using cross built packages does not.

Currently it takes a lot of time to do a full archive rebuild using any of the slower architectures, but it is still feasible. If the amount of work needed to do a full rebuild grows by 10 or 100, it is no longer achievable. Thus the only platforms that could reasonably self-host would be x86, Power, s390x and possibly arm64.

Supporting old binaries

There are many cases where a specific application binary must keep running even though the entire system around it changes. A good example of this are computer and console games. People have paid good money for games on Windows 7 (or Vista, or XP) and they expect them to keep working on Windows 10 as well, even on hardware that did not even exist back when the game was released. The only known solution to this are stable ABIs. The same problem happens with consoles such as PS4. Every single game released during its life cycle must run on all console system software versions released after the game, even without a network connection for downloading updates.

Errata

Since writing this article I have been told that any Developer may request a rebuild and reupload of a binary package and it happens automatically. So it is possible for one person to fix a package and have its dependents rebuilt, but it would still require lots of compute and bandwidth resources.

Sunday, October 14, 2018

Things Microsoft could do to make life of developers easier

A few weeks ago I was at CppCon. One of the presentations was about new stuff in the Visual Studio compiler. The presentation had this slide fairly early on.

Presentation screenshot saying the mission of the C++ team is to make the lives of all C++ developers on the planet better.


If that is truly their goal, then here are some things they could do. (Some not specifically about C++ but still related.)

Proper RPATH support

If you have a project that uses shared libraries and you want to run it directly from the build directory, then you really need to have rpath or something similar to it. A simple way of explaining it is that you add a piece of text inside an executable saying "when running me, search for foo.dll in directory ../baz/lib.

Since this is not natively supported, people need to resort to awful hacks to make it work:
  • adjusting the PATH envvar to contain the dirs where the dlls are (because PATH is used to look up dlls)
  • copy all files to the same directory before running
  • creating a manifest file defining an internal bundle, creating a subdirectory and copying all dependency dlls there
  • static linking everything
  • mandating a project layout where everything is in one subdirectory
All of these are nasty hacks. It should be possible to run programs straight from the build dir without needing to copy anything or change envvars.

During CppCon I was told that with the very latest Windows 10 it should be possible to do this "somehow" but googling for this has not uncovered any instructions.

Spawning a process using an array

The way to spawn processes in Windows is using the CreateProcess function. Note that it takes a command string, not an array. The command implementation will then parse the string to a command array and run it. The documentation page does not document how the parsing is done, but presumably it is the same as what cmd.exe does.

What this means is that it is impossible to spawn a process on Windows without needing to jump through massive quoting hoops. For example suppose you want to write a Ninja file to call a specific command. Because of this problem Ninja does not support arrays natively but instead requires every user to write a single command string, which leads to double quoting. First you need to quote the array to be a Windows process spawning command and then you need to quote that according to Ninja's quoting rules.

And then it gets terrible.

The command line length limitation on Windows is ridiculously short. Even fairly simple link commands are too long. Thus you need to write the actual command to a response file, which the command then reads and parses on its own. Since every program writes their own parsing and splitting code, you may find that you need to quote things differently depending on whether you are using the command line or a response file. You get one guess whether some (but not all) programs coming from Unix parse their response files according to Unix shell rules, even on Windows.

Now there might be a few people out there who just got outraged, because msvcrt does in fact have functions to spawn processes with arrays. They are a complete lie. Here is a rough pseudocode representation on how they are implemented:

def spawn_process(command_array):
    command_line = ' '.join(command_array)
    return CreateProcess(command_line)

Support GCC's destructor extension in plain C

RAII is awesome. It is, in fact, so awesome that GCC ships an extension to use it with plain C. It is used by many plain C projects such as GLib and systemd. I have spoken to many C developers and they really love that feature and they absolutely hate that they can't use it in code that has to support MSVC.

Adding this support would be great and make the world a better place in several ways including:
  • you can use libraries that use this feature as dependencies when building with MSVC
  • multiplatform projects can start using destructors freely
  • all the millions of lines of C code that exist in the world (and which will not be rewritten any time soon) can be made iteratively safer and more reliable
Eventually it would be nice to get this feature in the C standard, but that is unlikely to happen any time soon.

Performance optimize MSBuild

Running the test suite of Meson with the Visual Studio compiler takes roughly 6-7 minutes when using the Ninja backend and 14-18 minutes when using the MSBuild backend. Granted, this is a worst case scenario of running many small independent builds in a row, but it is still frustratingly slow. The same can be found when using Visual Studio IDE. After typing ctrl-shift-b there is usually a noticeable lag until any compilation actually starts.

Kill the need for vcvarsall.bat and provide parallel installable compilers

Visual studio compilers are not in path by default. You have to either start a special shell or run a magic bat file from a magic directory that sets up the environment so that the compilers work. If you go looking in the installed directory there are many different directories all of which contain an executable cl.exe. Which one you run depends on PATH settings, thus you can only run one compiler at a time. This makes it really difficult to, for example, run multiple different VS versions (15, 17, native, cross etc) from a single script.

This same problem has been solved on Unix side ages ago. The trick is to provide many executables with different names. For example cl15-x86.exe, cl17-arm.exe and cl17-x64.exe. Each of these executables would set up the equivalent of vcvarsall.bat for its own process and then forward the actual compilation to the compiler, wherever it may be hidden in the file system hierarchy. These binaries could the be put in one single path location and they could be used from any command prompt, even in parallel. This is particularly useful for cross compilation projects where you need to build a code generator with the native compiler and then use it to generate source code for the cross compiler.

Have you reported these as bugs upstream?

No. Nothing on this blog post is new, these are all issues that have been known for 20+ years and most likely have been reported to Microsoft dozens, if not hundreds of times. The fact that these things have not been fixed is a question of corporate priorities. As a random-non-windows-using-dude-on-internet I don't really have any influence on those.