After some midsummer hacking, Meson now has support for building and uploading Arduino projects (caveat: this MR needs to land in master first). This is what it looks like.
A sample project to clone has been set up on Github. It currently supports only Linux and the Arduino Uno, but the repo has documentation on how to adapt it to your configuration.
A gathering of development thoughts of Jussi Pakkanen. Some of you may know him as the creator of the Meson build system.
Friday, June 24, 2016
Monday, May 16, 2016
Performance testing a new parallel zip file implementation
Previously we looked into rewriting a zip file unpacker rather than modernising an existing one as well as measuring its performance. Now we have implemented fully parallel zip file compression so it is time to measure its performance against existing applications. The new implementation is called Parzip and can be obtained from its Github repo.
Test setup
Parzip uses only LZMA compression (though it can uncompress deflated archives). We tested it against the following programs:
- Info-Zip, the default zip/unzip implementation on most Unix platforms
- 7-Zip, the first archiver to use LZMA
- Tar + gzip, the default way of compressing files on Unix
- Tar + xz, the same as above but using LZMA compression instead of deflate
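Parzip's big win comes from compressing many files at once. The idea can be sketched in a few lines of Python (an illustrative sketch using the standard lzma module, not Parzip's actual code; the function name is made up for this example). In CPython, lzma releases the GIL while compressing, so even a thread pool gets real parallelism here:

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

def compress_entries(entries: list[bytes]) -> list[bytes]:
    """Compress each archive entry independently, one task per entry."""
    with ThreadPoolExecutor() as pool:
        # Entries share no state, so every core can work on its own
        # file; results come back in the original input order.
        return list(pool.map(lzma.compress, entries))
```

A real archiver additionally has to serialise the compressed results into a single output file, which is where the remaining coordination overhead comes from.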
The test corpus was a checkout of the Clang compiler, including full source code, SVN directories and a build tree, roughly 27 gigabytes of data in total. The test machine was an i7 with 8 cores and a spinning disk, running Debian stable. Parzip was compiled from source and used liblzma from the Debian repository; all other binaries were the default versions provided by Debian. All compression settings were left at their default values.
Results
Let's start with the most important measurement: how long did each app take to compress the directory tree.
Info-Zip and gzip both use the deflate algorithm which is a lot faster than LZMA. Here we can see how they are the fastest even though they only use one thread for compression. Parzip is the fastest LZMA compressor by quite a wide margin. Xz is the slowest because it uses only one thread. 7-zip has some threading support which makes it faster than xz, but it loses heavily to Parzip.
Decompression times show a similar trend.
Decompression is a lot faster than compression (this graph is in seconds as opposed to minutes). The interesting point here is that Parzip is much faster than Info-Zip and gzip even though it uses LZMA. Being able to use all cores really helps here.
Finally let's look at sizes of the compressed archives.
Nothing particularly surprising here. LZMA files are roughly the same size with xz being the smallest, 7-zip following close by and Parzip holding the third place. Programs using deflate achieve noticeably worse compression ratio. Info-Zip is the clear loser here, maybe due to it using its own deflate implementation rather than zlib.
Big iron test
We also tested the performance of xz and Parzip on a 48 core heavy duty server.
XZ takes almost three hours to compress the data set while Parzip can achieve the same in just over 20 minutes. This is almost 90% faster. The theoretical best possible speedup would be roughly 97%, but due to overhead and other issues we can't reach those numbers.
Decompression times are similar with Parzip being 93% faster. This is due to decompression parallelising better than compression. Compression needs to write its results in a single file whereas decompression can write its outputs to the file system independent of each other.
Conclusions
Using modern tools and technologies it is possible to create a parallel file compressor that achieves almost the same compression ratio as the current leading technology (tar + xz) while providing almost linear scaling with the number of processor cores available.
Caveat
If you plan on doing your own measurements (which you should, it's educational), be careful when analysing the results. If you have a slow hard disk and a lot of decompression threads, they may saturate the disk's write capacity.
Wednesday, May 4, 2016
Jzip parallel unzipper performance results
In my previous blog post I wrote about reimplementing Info-Zip in modern C++ so that we can unpack files in parallel. This sparked a lively Reddit discussion. One of the main points raised was that once this implementation gets support for all other stuff Info-Zip does, then it will be just as large.
Testing this is simple: just implement all the missing features. Apart from encryption support (which you should not use anyway; use gpg instead), the work is now finished and available in the jzip Github repo. Here are the results.
Lines of code (as counted by wc)
Info-Zip: 82 057 lines of C
jzip: 1091 lines of C++
Stripped binary size
Info-Zip: 159 kB
jzip: 51 kB
Performance
Performance was tested by zipping a full Clang source + build tree into one zip file. This includes the source, SVN directories and all build artifacts. The total size was 9.3 gigabytes. Extraction times were as follows.
Info-Zip: 5m 38s
jzip: 2m 47s
Jzip is roughly twice as fast. This is a somewhat underwhelming result given that the test machine has 8 cores. Further examination showed that the reason was that jzip saturates the hard drive's write capacity.
Conclusions
Using a few evenings worth of spare time it is possible to reimplement an established (but relatively straightforward) product with two orders of magnitude less code and massively better performance.
Update: more measurements
Using a 48 core machine with fast disks.
Info-zip: 3m 32s
jzip: 12s
On this machine jzip is 95% faster.
On the other hand when running on a machine with a slow disk, jzip may be up to 30% slower because of write contention overhead.
Sunday, April 24, 2016
Rewriting from scratch, should you do it?
One thing that has bothered me for a while is that the unzip command line tool only decompresses one file at a time. As a weekend project I wanted to see if I could make it work in parallel. One could write this functionality from scratch, but this gave me an opportunity to really look into incremental development.
All developers love writing stuff from scratch rather than fixing existing solutions (yes, guilty as charged). However, the accepted wisdom is that you should never do a from scratch rewrite but instead improve what you have via incremental improvements. Thus I downloaded the sources of Info-Zip and got to work.
Info-Zip's code base is quite peculiar. It predates things such as ANSI C and has support for tons of crazy long dead hardware. MS-DOS ranks among the most recently added platforms. There is a lot of code for 16 bit processors, near and far pointers and all that fun stuff your grandad used to complain about. There are even completely bizarre things such as this snippet:
#ifndef const
# define const const
#endif
The code base contains roughly 80 000 lines of K&R C. This should prove an interesting challenge. Those wanting to play along can get the code from Github.
Compiling the code turned out to be quite simple. There is no configure script or the like; everything is #ifdeffed inside the source files. You just compile them into an app and then you have a working exe. The downside is that the source has more preprocessor code than actual code (only a slight exaggeration).
Originally the code used a single (huge) global struct that houses everything. At some point the developers needed to make the code reentrant. Usually this means changing every function to take the state struct as a function argument instead. These people chose not to do that. Instead they created a C preprocessor macro system that can pass the struct as an argument but can also compile the code so that it uses the old style global struct. I have no idea why they did that. The only explanation that makes any sort of sense is that pushing the pointer onto the stack on every function call was too expensive on 16 bit and smaller platforms. This is just speculation, though; if anyone knows for sure, please let me know.
This meant that every single function definition was a weird concoction of preprocessor macros and K&R syntax. For details see this commit that eventually killed it.
Getting rid of all the cruft was not particularly difficult, only tedious. The original developers were very pedantic about flagging their #if/#endif pairs so killing dead code was straightforward. The downside was that what remained after that was awful. The code had more asterisks than letters. A typical function was hundreds of lines long. Some preprocessor symbols were defined in opposite ways in different header files but things worked because some other preprocessor clauses kept all but one from being evaluated (the code confused Eclipse's syntax highlighter so it's really hard to see what was really happening).
Ten or so hours of solid work later most dead cruft was deleted and the code base had shrunk to 30 000 lines of code. At this point looking into adding threading was starting to become feasible. After going through the code that iterates the zip index and extracts files it became a lot less feasible. As an example the inflate function was not isolated from the rest of the code. All its arguments were given in The One Big Struct and it fiddled with it constantly. Those would need to be fully separated to make anything work.
That nagging sound in your ear
While fixing the code I kept hearing the call of the rewrite siren. Just rewrite from scratch, it would say. It's a lot less work. Go on! Just try it! You know you want to!
Eventually the lure got too strong so I opened the Wikipedia page on Zip file format. Three hours and 373 lines of C++ later I had a parallel unzipper written from scratch. Granted it does not do advanced stuff like encryption, ZIP64 or creating subdirectories for files that it writes. But it works! Code is available in this repo.
Even better, adding multithreading took one commit with 22 additions and 7 deletions. The build definition is 10 lines of Meson instead of 1000+ lines of incomprehensible Make.
There really is no reason, business or otherwise, to modernise the codebase of Info-Zip. With contemporary tools, libraries and methodologies you can create code that is an order of magnitude simpler, clearer, more maintainable and just all around more pleasant to work with than existing code. In a fraction of the time.
Sometimes rewriting from scratch is the correct thing to do.
This is the exact opposite of what I set out to prove but that's research for you.
Update
The program in question has now been expanded to do full Zip packing + unpacking. See here for benchmarks.
Tuesday, April 19, 2016
A look into Linux software deployment
One of the great virtues of Linux distributions is that they provide an easy way to install massive amounts of software. They also provide automatic updates so the end user does not have to care. This mechanism is considered by many to be the best way to provide software for end users.
But is it really?
Discussions on this topic usually stir strong emotions, but let's try to examine this issue through practical use cases.
Suppose a developer releases a new game. He wants to have this game available for all Linux users immediately and, conversely, all Linux users want to play his game as soon as possible. Further, let's set a few more reasonable requirements for this use case:
- The game package should be runnable immediately after the developer makes a release.
- The end user should be able to install and play the game without requiring root privileges.
- The end user should not be required to compile any source code.
- The developer should be able to create only one binary package that will run on all distros.
These fairly reasonable requirements are easy to fulfill on all common platforms including Windows, Android, iOS and OSX. However on Linux this is not possible.
The delay between an application's release and its availability in distros is usually huge. The shortest release cadence is Ubuntu's, at once every six months. That means that, on average, it takes three months from release to being usable. And this assumes that the app is already in Debian/Ubuntu, gets packaged immediately and so on.
Some of you might be scoffing at this point. You might think that you just have to time the releases to right before the distro freezes and that this is not a problem. But it is. Even if you time your releases to Ubuntu (as opposed to, say, Fedora), the final freeze of Ubuntu causes complications. During freeze new packages are not allowed in the archive any more. This takes a month or more. More importantly, forcing app developers to time their releases on distro cadences is not workable, because app developers have their own schedules, deadlines and priorities.
There are many other reasons why these sorts of delays are bad. Rather than go through them one by one, let's do a thought experiment instead. Suppose Steve Jobs were still alive and was holding a press conference on the next release of iOS. He finishes it up with this:
- "The next version of iOS will be released January. All applications that want to be in the app store must be submitted in full before the end of December. If you miss the deadline the next chance to add an app to the store will be in two years. One more thing: once the release ships you can't push updates to your apps. Enjoy the rest of the conference."
Even Distortion Field Steve himself could not pull that one off. He would be laughed off the stage. Yet this is the way the Linux community expects application deployment to be done.
A closer look at the numbers
Let's do a different kind of a thought experiment. Suppose I have a magic wand. I wave it once, which causes every application that is in the Android and iOS app stores to become free software. I wave it a second time and every one of those apps grows a Linux port with full Debian packaging and is then submitted to Debian. How long will it take until they are all processed and available in the archive?
Go ahead and do a rough evaluation. I'll wait.
The answer is obvious: it will never happen! There are two major reasons. First of all, the Debian project simply does not have enough resources to review tens of thousands of new packages. The second reason is that even if they did, no-one wants to review 20 different fart apps for DFSG violations.
Now you might say that this is a good thing and that nobody really needs 20 fart apps. This might be true but this limitation affects all other apps as well. If apps should be provided by distros then the people doing application review become a filter between application developers and end users. The only apps that have a chance of being accepted are those some reviewer has a personal interest in. All others get dropped to the sidelines. To an application developer this can be massively demoralizing. Why on earth should he fulfill the whims of a random third party just to be able to provide software to his end users?
There is also a wonderful analogy to censorship here, but I'll let you work that one out yourselves.
Javascript frameworks: same problem, different angle
Let's leave applications for a while and look at something completely different: web frameworks. There are several hundred of them and what most seem to have in common is that they don't come in distro packages. Instead you install them like this:
curl http://myawesomejsframework.com/installer.sh | sudo sh
This is, of course, horrible. Any person with even rudimentary knowledge of operating systems, security or related fields will recoil in horror when seeing unverified scripts being run with root privileges. And then they are told that this is actually going out of style and you should instead pull the app as a ready to run Docker container. That you just run. On your production servers. Without any checksumming or verification.
This is how you drive people into depression and liver damage.
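The checksumming half of this, at least, costs almost nothing to add. A sketch (the helper name is made up; the expected digest would have to come from a trusted, separately fetched source for the check to mean anything):

```python
import hashlib

def is_untampered(payload: bytes, expected_sha256: str) -> bool:
    # Only meaningful if the expected digest arrives over a
    # different, trusted channel than the payload itself.
    return hashlib.sha256(payload).hexdigest() == expected_sha256
```

Verifying who produced the digest is the harder half, which is exactly what distro archives solve with signed package metadata.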
After the initial shock passes the obvious followup question is why. Why on earth would people do this instead of using proven good distro packages? Usually this is accompanied by several words that start with the letter f.
If you stop and think about the issue objectively, the reason to do this is obvious.
Distro packages are not solving the problems people have.
This is so simple and unambiguous that people have a hard time grasping it. So here it is again, but this time written in bold:
Distro packages are not solving the problems people have!
Typical web app deployment happens on Debian stable, which gets released once every few years. This is really nice and cool, but the problem is that the web moves a lot faster. Debian's package versions are frozen upon release, however deploying a web app on a year old framework (let alone two or three) is not feasible. It is just too old.
The follow up to this is that you should instead "build your own deb packages for the framework". This does not work either. Usually newer versions of frameworks require newer versions of their dependencies. Those might require newer versions of their dependencies and so on. This is a lot of work, not to mention that the more system packages you replace the bigger risk you run of breaking some part of the base distro.
If you examine this problem with a wider perspective you find that these non-distro package ways of installing software exist for a reason. All the shell scripts that run as root, internal package managers, xdg-app, statically linked binaries, Docker images, Snappy packages, OSX app bundles, SteamOS games, Windows executables, Android and iOS apps and so on exist for only one reason:
They provide dependencies that are not available on the base system.
If the app does not provide the dependencies itself, it will not work.
But what does it really mean?
A wise man once said that all wisdom begins by acknowledging the facts. Let's go through what we have learned thus far.
Fact #1: There are two systems at play: the operating system and the apps on top of it. These have massively different change rates. Distros change every few years. Apps and web sites change daily. Those changes must be deployable to users immediately.
Fact #2: The requirements and dependencies of apps may conflict with the dependencies of the platform. Enforcing a single version of a dependency across the entire system is not possible.
If you take these facts and follow them to their logical conclusion, we find that putting distro core packages and third party applications in the same namespace, which is what mandating the use of distro packages for everything does, can not work.
Instead what we need to accept is that any mechanism of app delivery must consist of two isolated parts. The first is the platform, which can be created with distro packages just like now. The second part is the app delivery on top. App and framework developers need to be able to provide self contained packages directly to end users. Requiring approval of any sort from an unrelated team of volunteers is a failure. This is basically what other operating systems and app stores have done for decades. This approach has its own share of problems, the biggest of which is apps embedding unsafe versions of libraries such as OpenSSL. This is a solvable problem, but since this blog posting is already getting to be a bit too long I refer you to the link at the end of this post for the technical details.
Switching away from the current Linux packaging model is a big change. Some of it has already started happening in the form of initiatives such as xdg-app, Snappy and Appimage. Like any change this is going to be painful for some, especially distributions whose role is going to diminish. The psychological change has already taken hold from web frameworks (some outright forbid distros from packaging them) to Steam games. Direct deployment seems to be where things are headed. When the world around you changes you can do one of two things.
Adapt or perish.
Closing words
If you are interested in the mechanics of building app packages and providing security and dependencies for them, watch this presentation on the subject I held at LCA 2016.
Sunday, April 10, 2016
Testing performance of build optimisations
This blog post examines the performance effect of various compiler options. The actual test consists of compressing a 270 megabyte unpacked tar file with zlib. For training data of profile guided optimisation we used a different 5 megabyte tar file.
All measurements were done five times and the fastest time was chosen. Both the zlib library and the code using it are compiled from scratch. For this we used the Wrap dependency system of the Meson build system, which should make the test portable to all platforms (we tested only Linux + GCC). The test code is available on Github.
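The fastest-of-five scheme can be sketched as a small shell helper. This is only an illustration of the methodology; the benchmark binary name at the end is a placeholder, not part of the original test setup.

```shell
# fastest_of N CMD...: run CMD N times and print the fastest wall-clock time in ms
fastest_of() {
    n=$1; shift
    best=
    for _ in $(seq "$n"); do
        start=$(date +%s%N)              # nanosecond timestamp (GNU date)
        "$@" > /dev/null
        end=$(date +%s%N)
        ms=$(( (end - start) / 1000000 ))
        if [ -z "$best" ] || [ "$ms" -lt "$best" ]; then
            best=$ms
        fi
    done
    echo "$best"
}

# Hypothetical usage: fastest_of 5 ./zlib_bench big.tar
```

Taking the minimum rather than the mean filters out noise from other processes and disk caching, which is why it is the standard choice for this kind of micro-benchmark.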
We used two different machines for testing. The first is a desktop machine with an Intel i7 running the latest Ubuntu and GCC 5.2.1. The other is a Raspberry Pi 2 with Raspbian and GCC 4.9.2. All tests were run with basic optimization (-O2) and release optimization (-O3).
Here are the results for the desktop machine. Note the vertical scale does not go down to zero to make the differences more visible. The first measurement uses a shared library for zlib, all others use static linking.
Overall the differences are quite small: the gap between the fastest and slowest time is roughly 4%. Other things to note:
- static libraries are noticeably faster than shared ones
- -O2 and -O3 do not have a clear performance order, sometimes one is faster, sometimes the other
- pgo seems to have a bigger performance impact than lto
On the Raspberry Pi the performance difference between the fastest and slowest options is 5%, roughly the same as on the desktop. On this platform pgo also has a bigger impact than lto. Unlike on the desktop, where it was basically unmeasurable, lto does have a noticeable impact here, though it is only 1%.
Sunday, April 3, 2016
Simple new features of the Meson build system
One of the main design goals of Meson has been to make the 95% use case as simple as possible. This means that rather than providing a framework that people can use to solve their problems themselves, we'd rather just provide the solution.
For a simple example let's look at using the address sanitizer. On other build systems you usually need to copy a macro that someone else has written into your build definition or write a toggle that adds the compiler flags manually. This is not a lot of work but everyone needs to do it and these things add up. Different projects might write different options for this feature so when switching between projects you always need to learn and remember what the option was called on each.
At its core this really is a one bit issue. Either you want to use address sanitizer or you don't. With this in mind, Meson provides an interface that is just as simple. You can either enable address sanitizer during build tree configuration:
meson -Db_sanitizer=address <source_dir> <build_dir>
Or you can configure it afterwards with the configuration tool:
mesonconf -Db_sanitizer=address <build_dir>
There are a bunch of other options that you can also enable with these commands (for the remainder of the document we only use the latter).
Link-time optimization:
mesonconf -Db_lto=true <build_dir>
Warnings as errors:
mesonconf -Dwerror=true <build_dir>
Coverage results:
mesonconf -Db_coverage=true <build_dir>
Profile guided optimization:
mesonconf -Db_pgo=generate (or =use) <build_dir>
Compiler warning levels (1 is -Wall, 3 is almost everything):
mesonconf -Dwarning_level=2 <build_dir>
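These options are not mutually exclusive; as far as I can tell they can also be combined in a single invocation (the build directory path is a placeholder):

```shell
mesonconf -Db_lto=true -Dwarning_level=3 -Dwerror=true <build_dir>
```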
The basic goal of this option system is clear: you, the user, should only describe what you want to happen. It is the headache of the build system to make it happen.