Saturday, November 3, 2018

Some use cases for shared linking and ABI stability

A recent trend in language design and devops deployment has been to not use shared libraries. Instead every application is rebuilt and statically linked for maximum performance. This is highly convenient in many cases. Some people even go as far as to declare shared linking, and with it any ABI stability, a dead relic of the past that is not only unnecessary but actively harmful because maintaining ABI stability slows down language changes and renewal.

This blog post was not written to argue whether this is true or not. Instead it is meant to list many reasons and use cases where shared libraries and ABI stability are useful and which would be hard, or even impossible, to achieve by relying only on static linking.

Many of the issues listed here are written from the perspective of a modern Linux distribution, especially Debian. However I am not a Debian developer so the following is not any sort of an official statement, just my writings as an individual.

Guaranteed update propagation

Debian consists of thousands of packages. Each package's state is managed by a package maintainer. Each maintainer typically looks after between one and a handful of packages, so there are hundreds of maintainers. Each one of them works in relative isolation from the others. That is, they can upload updates to packages at their own pace. In fact, it is an important part of Debian's social structure that no-one can be forced to do any particular task.

On the other hand, Debian is also very strict about security. If a vulnerability is found in, say, a popular encryption library then it must be possible for one single person to update the encryption code in every single package that uses it, even indirectly. With a stable ABI and shared libraries, this can be done easily. Updating the dependency package (and possibly rebooting the machine) guarantees that every package on the system uses the new library. If packages were statically linked, each package would have to be rebuilt and reuploaded. This would require hundreds of people around the world to work in a coordinated fashion. In a volunteer based system this is not possible, especially for cases that require an embargo.
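The difference in coordination effort can be sketched with a small dependency graph. The package names and the graph below are made up for illustration, and counting one upload per affected package is a simplifying assumption rather than a description of how Debian's archive actually works.

```python
from collections import deque

# Hypothetical dependency graph: package -> libraries it links against.
DEPENDS = {
    "libcrypto": [],
    "libtls": ["libcrypto"],
    "webserver": ["libtls"],
    "mailer": ["libtls"],
    "editor": [],
}


def reverse_dependents(depends, library):
    """Return every package that uses `library`, directly or indirectly."""
    # Invert the graph: library -> packages that link against it.
    users = {name: [] for name in depends}
    for package, libs in depends.items():
        for lib in libs:
            users.setdefault(lib, []).append(package)

    # Breadth-first walk over the inverted graph.
    affected = set()
    queue = deque([library])
    while queue:
        current = queue.popleft()
        for package in users.get(current, []):
            if package not in affected:
                affected.add(package)
                queue.append(package)
    return affected


def uploads_needed(depends, library, static):
    """Count package uploads needed to ship a fix to `library`.

    Assumption: with shared linking only the library itself is uploaded;
    with static linking every direct or indirect user is rebuilt as well.
    """
    if static:
        return 1 + len(reverse_dependents(depends, library))
    return 1
```

In this toy graph a fix to libcrypto is a single upload with shared linking, while the static case also touches libtls, webserver and mailer, each of which may have a different maintainer.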

Update server bandwidth savings

The amount of bandwidth it takes to run a Linux distribution mirror is substantial. As we saw above, it is possible to update single packages, which makes downloads fairly small. If everything was statically linked then every library update would mean downloading the full rebuilt binaries of every affected package. This means a 10x to 100x increase in bandwidth requirements. Distro mirrors are already quite heavily loaded and probably could not handle this sort of increase in traffic.
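A back-of-the-envelope calculation shows where a multiplier of that order can come from. All sizes and counts below are invented for illustration; they are not measurements of any real distribution, and the model ignores compression and delta updates.

```python
MIB = 1024 * 1024


def update_download_bytes(library_size, dependent_sizes, static):
    """Bytes one client downloads to receive a fix to a single library.

    Assumption: with shared linking only the library package is fetched;
    with static linking every dependent binary (which embeds the library)
    is fetched again in full.
    """
    if static:
        return sum(dependent_sizes)
    return library_size


# Hypothetical figures: a 2 MiB library used by 40 installed programs
# of 4 MiB each.
library = 2 * MIB
dependents = [4 * MIB] * 40

shared_cost = update_download_bytes(library, dependents, static=False)
static_cost = update_download_bytes(library, dependents, static=True)
ratio = static_cost / shared_cost  # 80.0 with the figures above
```

With these made-up numbers the static case costs 80 times as much per client, and a mirror serving many clients sees the same factor on its outgoing traffic.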

Download bandwidth savings

Most of the population in the world does not have a direct 10 Gb Ethernet connection for their personal use. In fact there are many people who only have a 2G connection at best, and even that is sporadic. There are also many servers that have very poor Internet connections, such as scientific instruments and credit card payment terminals in remote cities. Getting updates to these machines is difficult even now. If update sizes ballooned, it might become completely infeasible.

Shipping prebuilt middleware

There are many providers of middleware (such as in computer games) that will only provide their code as prebuilt libraries (usually shared, because those are harder to reverse engineer). They will not and cannot ever ship their source code to customers because that contains all their special sauce. This entire business model relies on a stable ABI.
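To make "stable ABI" a bit more concrete, here is a small sketch using Python's ctypes module to inspect C-style struct layouts. The struct names and fields are hypothetical, and field offsets are only one ingredient of an ABI (calling conventions, symbol names, field types and struct sizes matter too), so treat this as an illustration rather than a complete compatibility check.

```python
import ctypes


class ConfigV1(ctypes.Structure):
    """Layout that an already shipped binary was compiled against."""
    _fields_ = [("width", ctypes.c_int32), ("height", ctypes.c_int32)]


class ConfigV2Appended(ctypes.Structure):
    """New library version that adds a field at the end."""
    _fields_ = [
        ("width", ctypes.c_int32),
        ("height", ctypes.c_int32),
        ("depth", ctypes.c_int32),
    ]


class ConfigV2Inserted(ctypes.Structure):
    """New library version that adds a field at the front."""
    _fields_ = [
        ("depth", ctypes.c_int32),
        ("width", ctypes.c_int32),
        ("height", ctypes.c_int32),
    ]


def field_offsets(struct_type):
    """Map each field name to its byte offset inside the struct."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


def layout_compatible(old, new):
    """True if every field of `old` sits at the same offset in `new`.

    Only offsets are compared; field types and total size are not checked.
    """
    old_offsets = field_offsets(old)
    new_offsets = field_offsets(new)
    return all(new_offsets.get(name) == offset
               for name, offset in old_offsets.items())
```

An old binary reading width and height through a pointer still finds them in ConfigV2Appended, but not in ConfigV2Inserted. Note that appending still changes the struct's size, which matters whenever the caller, not the library, allocates the struct.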

Software certification

I don't have personal experience of this, so the following entry might be completely false. However, it is based on the best information I had. If you have first hand experience and can either confirm or deny this, please post a comment to this article.

In highly regulated business sectors the problem of certification often comes up. Basically what this means is that each executable is put through an extensive testing cycle. If it passes then it is certified and can be used in production. Specifically, only that exact binary can be used. Any change to the code means that the program must be re-certified. This is a time consuming and extremely expensive process.

It may be that the certification cycle is different for the operating system component. Thus applying OS updates provided by the vendor may be faster and cheaper. As long as the vendor maintains ABI stability, the actual program does not need to be changed, which removes the need to re-certify it.

Extension modules

Suppose you create a program that provides an extension or plugin interface to third party code. Examples include the modding interface of many games and, as an extreme example, the entire Eclipse IDE. Supporting this without needing to provide third party extensions as source (and shipping a compiler with your program) requires a stable ABI.

Low barrier to entry

One of the main downsides of rebuilding everything from source all the time is the amount of resources it takes. For many this is not a problem, and when asked about it they may even snootily reply with "just buy more machines from AWS".

One of the strong motivations of the free and open source movement has been enablement and empowerment. That is, making it as easy as possible for as many people as possible to participate. There are many people in the world whose only computer is an old laptop or possibly even just a Raspberry Pi. In the current model it is possible to take any part of the system and hack on it in isolation (except maybe something like Chromium). If we go to a future where participating in software development requires access to a data center, these people are prevented from contributing.

Supporting slow platforms

One of the main philosophical points of Debian is that every supported architecture must be self hosting. That is, packages for Arm must be built on Arm, Mips packages must be built on Mips and so on. Self hosting is an important goal, because it proves the system works and is self-sustaining in ways that simply using cross built packages does not.

Currently it takes a lot of time to do a full archive rebuild using any of the slower architectures, but it is still feasible. If the amount of work needed to do a full rebuild grows by a factor of 10 or 100, it is no longer achievable. Thus the only platforms that could reasonably self-host would be x86, Power, s390x and possibly arm64.

Supporting old binaries

There are many cases where a specific application binary must keep running even though the entire system around it changes. Good examples of this are computer and console games. People have paid good money for games on Windows 7 (or Vista, or XP) and they expect them to keep working on Windows 10 as well, even on hardware that did not exist when the game was released. The only known solution to this is a stable ABI. The same problem happens with consoles such as the PS4. Every single game released during its life cycle must run on all console system software versions released after the game, even without a network connection for downloading updates.

Errata

Since writing this article I have been told that any Developer may request a rebuild and reupload of a binary package and it happens automatically. So it is possible for one person to fix a package and have its dependents rebuilt, but it would still require lots of compute and bandwidth resources.

1 comment:

  1. from my experience with snaps:
    - server & client bandwidth: delta updates only require you to download the parts that have changed. See e.g. zsync.
    - middleware: ABI is not the only possible interface. Think REST/ Dbus
    - old binaries: within a snap there is basically all of the old ABI you need in one package. This is even better as sometimes there are breaking non ABI changes.

    Of course security updates are a nightmare and can easily outweigh any of the points above.
