Tuesday, November 8, 2016

Building code faster and why recursive Make is so slow

One of the most common reactions to Meson we have gotten has been that build time is dominated by the compiler, thus trying to make the build system faster is pointless. After all, Make just launches subprocesses and waits for them to finish, so there's nothing to be gained.

This is a reasonable assumption but also false. Even if we ignore the most obvious slow parts, from Configure, which can take up to 15 minutes to run on an embedded device, to Libtool, which slows everything down for no good reason, there are still gains to be had.

Meson uses Ninja as its main backend. One of the (many) cool things about Ninja is that it is being actively developed and maintained. A recent addition is the Ninja tracer framework. It takes Ninja's log file and converts it to the timing format understood by Chrome's developer tools. It creates output that looks like this (thanks to Nirbheek for gathering the data).

This is a trace of building all of GStreamer using their new Meson-based aggregate repo called gst-build. The build was done using six parallel processes, which make up the six horizontal lanes in the diagram. Each colored square represents one build task, such as compiling or linking, with time increasing from left to right. This picture shows us both why Meson is so efficient and why Make-based builds are slow (though the latter requires some analysis).

The main reason for speed is that Ninja keeps all processor cores working all the time. It can do that because it has a global view of the problem. For example, if we build two targets A and B, and B links to A, Ninja can start compiling the source code of B before it has finished linking A. This allows it to keep all cores pegged.
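The A and B example can be illustrated with a small simulation. This is only a sketch of greedy scheduling over a global dependency graph, not Ninja's actual scheduler; the task names, durations and the `schedule` function are made up for illustration.

```python
import heapq

def schedule(tasks, deps, durations, cores):
    """Greedy scheduling over a global graph: returns {task: (start, end)}."""
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    ready = [t for t in tasks if not remaining[t]]
    running = []  # min-heap of (end_time, task)
    times = {}
    now = 0.0
    while ready or running:
        # Start ready tasks for as long as there is a free core.
        while ready and len(running) < cores:
            task = ready.pop(0)
            times[task] = (now, now + durations[task])
            heapq.heappush(running, (now + durations[task], task))
        # Jump to the next completion and release tasks that waited on it.
        now, done = heapq.heappop(running)
        for t in tasks:
            if done in remaining[t]:
                remaining[t].discard(done)
                if not remaining[t]:
                    ready.append(t)
    return times

# Hypothetical project: library B links against library A.
tasks = ["a1.o", "a2.o", "b1.o", "b2.o", "libA", "libB"]
deps = {"libA": ["a1.o", "a2.o"], "libB": ["b1.o", "b2.o", "libA"]}
durations = {"a1.o": 2, "a2.o": 2, "b1.o": 2, "b2.o": 2, "libA": 3, "libB": 3}

times = schedule(tasks, deps, durations, cores=3)
for name, (start, end) in times.items():
    print(f"{name}: {start} -> {end}")
```

In this toy run the compile of b2.o overlaps with the link of libA, which is exactly the overlap a per-directory build cannot exploit.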

The perceptive reader might have noticed the blank space at the left end of the diagram. During that time only one core is doing anything while all the others are idle. The process that is running is the GIR tool processing the gstreamer-1.0 library, which all other subprojects depend on. In theory we should be able to run compiles on other projects, but for reliability reasons Meson assumes that some source might include the output of the tool as a header, so it does not start compilations until the file is generated. This may seem weird, but there are projects that do these kinds of things and require that they work.

The white gap is also what causes Make to be so slow. Most projects write recursive makefiles because maintaining a non-recursive makefile for even a moderately sized project is a nightmare. When building recursively, Make goes into each subdirectory in turn, builds it to completion and only then moves on to the next one. The last step in almost every case is linking, which can only use one process, so every other core sits idle while it runs. If we assume that linking takes one second, then a project with 20 subdirectories spends 20 seconds of wall time on serialized links. On an n-core machine that corresponds to (n-1)*(num_directories) seconds of idle CPU time. Those things add up pretty quickly. In addition to Make, the same issue crops up in Visual Studio project files and pretty much any build system that does not have a global view of the dependency tree.
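The arithmetic is easy to play with. The helper below is a hypothetical back-of-the-envelope calculation that assumes every directory ends with one single-process link step of the same length; real projects will vary.

```python
def serial_link_cost(num_directories, link_seconds, cores):
    """Return (wall_seconds, idle_cpu_seconds) spent on per-directory links.

    Assumes one link per directory and that no other work can run
    while a directory's link step is in progress.
    """
    wall = num_directories * link_seconds
    idle_cpu = (cores - 1) * wall
    return wall, idle_cpu

# The example from the text: 20 subdirectories, one second per link,
# on the six-core setup used for the trace.
wall, idle = serial_link_cost(num_directories=20, link_seconds=1, cores=6)
print(f"wall time on links: {wall} s, idle CPU time: {idle} s")
```

With six cores this comes out to 20 seconds of wall time and 100 seconds of CPU time during which nothing useful happens.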

Those interested in going through the original data can do so. Just open this link with either Chromium or Chrome and the chart will be displayed in devtools. Unfortunately this won't work with Firefox.

Bonus challenge

The output format that Chrome's dev tools understand is straightforward. It would be interesting if someone modified Make to output the same format and ran it on a moderately sized project. GStreamer is an obvious candidate as it has both Meson and Autotools setups in master, though gst-build only works with Meson.
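To give a rough idea of what such output involves, here is a minimal sketch that turns timing records into trace events. It assumes the records are (start_ms, end_ms, name) tuples, which is an invented input shape rather than anything Make or Ninja emits as-is, and it uses the "complete event" fields (name, ph, ts, dur, pid, tid, with times in microseconds) as I understand them from Chrome's Trace Event Format. The lane assignment is a simple greedy guess, not necessarily what the Ninja tracer does.

```python
import json

def to_trace_events(records):
    """Convert (start_ms, end_ms, name) records to Chrome trace events."""
    lanes = []   # lanes[i] = end time (ms) of the last task put in lane i
    events = []
    for start, end, name in sorted(records):
        # Reuse the first lane that is free at `start`, else open a new one.
        for lane, busy_until in enumerate(lanes):
            if busy_until <= start:
                break
        else:
            lane = len(lanes)
            lanes.append(0)
        lanes[lane] = end
        events.append({
            "name": name,
            "ph": "X",                     # complete event: start + duration
            "ts": start * 1000,            # microseconds
            "dur": (end - start) * 1000,   # microseconds
            "pid": 0,
            "tid": lane,                   # one horizontal lane per "thread"
        })
    return events

# Hypothetical timings for three build steps.
records = [(0, 100, "a.o"), (0, 150, "b.o"), (100, 300, "liba.so")]
events = to_trace_events(records)
print(json.dumps(events, indent=2))
```

The printed JSON array is the kind of file the trace viewer loads; a patched Make would only need to record a start time, an end time and a target name for every command it runs.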
