Tuesday, July 7, 2020

What if? Revision control systems did not have merge

A fun design exercise is to take an established system or process and introduce some major change into it, such as adding a completely new constraint. Then take this new state of things, run with it and see what happens. In this case let's see how one might design a revision control system where merging is prohibited. Or, formulated in a slightly different way:
What if merging is to revision control systems as multiple inheritance is to software design?

What is merging used for?

First we need to understand what merging is used for so that wa can develop some sort of a system that achieves the same results via some other mechanism. There are many reasons to use merges, but the most popular ones include the following.

An isolated workspace for big changes

Most changes are simple and consists of only one commit. Sometimes, however, it is necessary to make big changes with intermediate steps, such as doing major refactoring operations. These are almost always done in a branch and then brought in to trunk. This is especially convenient if multiple people work on the change.

Trunk commits are always clean

Bringing big changes in via merges means that trunk is always clean and buildable. More importantly bisection works reliably since all commits in trunk are known good. This is typically enforced via a gating CI. This allows big changes to have intermediate steps that are useful but broken in some way so they would not pass CI. This is not common, but happens often enough to be useful.

An alternative to merging is squashing the branch into a single commit. This is suboptimal as it destroys information breaking for example git blame -kind of functionality as all changes made point to a single commt made by a single person (or possibly a bot).

Fix tracking

There are several systems that do automatic tracking of bug fixes to releases. The way this is done is that a fix is written in its own branch. The bug tracking system can then easily see when the fix gets to the various release branches by seeing when the bugfix branch has been merged to them.

A more linear design

In practice many (possibly even most) projects already behave like this. They keep their histories linear by rebasing, squashing and cherry picking, never merging. This works but has the downsides mentioned above. If one spends some time thinking about this problem the fundamental disconnect comes fairly clear. A "linear" revision control system has only one type of a change which is the commit whereas "real world" problems have two different types: logical changes and individual commits that make up the logical change. This structure is implicit in the graph of merge-based systems, but what if we made it explicit? Thus if we have a commit graph that looks like this:



the linear version could look like this:


The two commits from the right branch have become one logical commit in the flat version. If the revision control system has a native understanding of these kinds of physical and logical commits all the problematic cases listed could be made to work transparently. For example bisection would work by treating all logical commits as only one change. Only after it has proven that the error occurred inside a single logical commit would bisection look inside it.

This, by itself, does not fix bug tracing. As there are no merges you can't know which branches have which fixes. This can be solved by giving each change (both physical and logical) a logical ID which remains the same over rebase and edit operations as opposed to the checksum-based commit ID which changes every time the commit is edited. This changes the tracking question from "which release branches have merged this feature fix branch" to "which release branches have a commit with this given logical ID" which is a fairly simple problem to solve.

This approach is not new. LibreOffice has tooling on top of Git that does roughly the same thing as discussed here. It is implemented as freeform text in commit messages with all the advantages and disadvantages that brings.

One obvious question that comes up is could you have logical commits inside logical commits. This seems like an obvious can of worms. On one hand it would be mathematically symmetrical and all that but on the other hand it has the potential to devolve into full Inception, which you usually want to avoid. You'd probably want to start by prohibiting that and potentially permitting it later once you have more usage experience and user feedback.

Could this actually work?

Maybe. But the real question is probably "could a system like this replace Git" because that is what people are using. This is trickier. A key question would whether you can automatically convert existing Git repos to the new format with no or minimal loss of history. Simple merges could maybe be converted in this way but in practice things are a lot more difficult due to things like octopus merges. If the conversion can not be done, then the expected market share is roughly 0%.

Wednesday, July 1, 2020

What is best in open source projects?

Open source project maintainers have a reputation of being grumpy and somewhat rude at times. This is a not unexpected as managing an open source project can be a tiring experience. This can lead to exhaustion and thus to sometimes being a bit too blunt.

But let's not talk about that now.

Instead, let's talk about the best of times, the positive outcomes, the things that really make you happy to be running an open source project. Patches, both bug fixes and new features are like this. So is learning about all the places people are using your project. Even better if they are using it ways you could not even imagine when you started. All of these things are great, but they are not the best.

The greatest thing is when people you have never met or even heard of before come to your project and then on their own initiative take on leadership in some subsection in the project.

The obvious thing to do is writing code, but this also covers things like running web sites, proofreading documentation, wrangling with CI, and even helping other projects to start using your project. At this point I'd like to personally list all the people who have contributed to Meson in this way but it would not be fair as I'd probably miss out some names. More importantly this is not really limited to any single project. Thus, I'd like to send out the following message to everyone who has ever taken ownership of any part of an open source project:

Keep on rocking. You people are awesome!

Tuesday, June 16, 2020

The human brain comprehends atoms but it does not comprehend electrons

The most common thing people know about software projects is that they fail. They might get delayed for months and years. They might be ten times more expensive than expected. They might be completely unusable pieces of garbage. And so on. Over the past 50 or so years many theories have been presented on why that is and how things could be made better. A lot of progress has been made, but no fundamental breakthrough has been made. Software projects still fail fairly often for unexpected reasons.

Typically this implies that there is some fundamental issue (or, more likely, several of them) causing these problems. Since nobody really knows what the real cause is, we are free to wildly speculate and hypothesize that the problem lies somewhere in the human brain. Not any single person's brain, mind you, but fundamental properties of the brain itself.

What the brain does incredibly well

The human brain is an incredible piece of work. Millions of years of evolution has adapted it to be able to be able to do astonishing things in the blink of an eye. The fact that you can walk and talk at the same time without needing to use any conscious effort on it is nothing short of amazing. As a concrete example, look at the following picture for a couple of seconds and then continue reading.


Just a quick glimpse should give you a fairly good idea about the quality of these two very different vehicles. These include things like top speed, acceleration, insurance costs, ride comfort, perceived increase in sex appeal, fuel consumption, maintenance effort, reliability, how long they are expected to keep working and so on. The estimates you get are probably not 100% accurate, but altogether they are probably mostly correct. Given your current life situation you probably also immediately "know" which one of these two would be better for you or that neither is suitable and that you need something completely different, such as a van or a bicycle.

This is your brain doing the thing it does best: examining physical objects in the real world. This works because things that are made of atoms have noticeable and persistent properties. These include things like mass, durability, sharpness and so on. This consistency allows one to fairly accurately estimate their behaviour even in unexpected situations.

And where it stumbles

If you ever find yourself bored at a party, try the following. Find some people, explain the Monty Hall Problem to them and give the correct answer. Then spend the rest of the evening watching how people will argue until they are blue in the face that you are wrong and that the probability is 50/50 and changing the door is not beneficial. This will happen even if (and especially when) participants have higher education degrees in STEM fields.

The Monty Hall Problem is a great example of the deficiencies of our brain. The actual problem is simple but the solution is so counter-intuitive that many people will refuse to accept it and will instead spend massive amounts of time and energy to disprove the point (in vain) because it just "does not feel right". In mathematics this happens fairly often. The reason this is relevant to the discussion at hand is that, at the end, that is what all software really is: applied mathematics. It does not obey the rules of the physical world and atoms, it only exists in the realm of logic. Computers don't deal with atoms, only electrons[1]. To see how this translates to coding, let's do a thought experiment.

Suppose your job is to paint a wall. After a hard day of painting you have ten meters of freshly painted wall. The next day you get to work and continue painting. At the end of the day you look at your work: exactly ten meters wall covered in paint. Puzzled you check everything but can't find anything wrong. You should have 20 meters of painted wall but don't. The next day the same thing happens again. This makes your manager unhappy so he appoints three people to help. The next day the four of you start painting, working as hard and as effectively as you possibly can. When the day comes to an end, you have seven meters of painted wall. Somehow three meters of paint you had already spplied have disappeared and nobody can explain why or even where the paint went. Determined to make up for the loss everyone works extra hard and at the end of the day, finally, nine meters of the wall is properly covered in paint. Everyone goes home exhausted but happy to have finally made progress. The next morning it is discovered that both the paint and the wall it was applied to have disappeared.

Finally, software

This is the essence of creating software. Since it exists purely in the realm of logic and mathematics, you can't really see it, smell it, touch it or feel it. It has failure modes unfathomable with physical processes and these problems occur fairly regularly. This means that all the builtin functionality of your brain is useless in evaluating the outcome and quality of a software project. Code can only experienced through thinking, which is slow, difficult and error prone. As opposed to cars, comparing two different code bases for quality and suitability is a big undertaking. But it gets worse.

The output of code, like the web browser you are using to read this blog post, look and feel a lot like physical objects made of atoms. This means that when people who have no personal experience in programming either buy or manage software projects, they are going to do it as if they were dealing with physical real world objects. That is what they are trained to do and have years of experience in after all. It is also actively harmful, because software is electrons and, as such, beyond the immediate instictive grasp of the human brain. It does not play by the rules of atoms and trying to make it do so will only lead to failure. 

[1] For the physicists among you, yes, technically this should be electric fields, not electrons. This is an artistic decision to make the headline more clickbaity.

Monday, June 8, 2020

Short term usability is not the same as long term usability

When designing almost any piece of code or other functionality, the issue of usability often comes up. Most of the time these discussions go well, but at other times fairly big schisms appear. It may even be that two people advocate for exactly opposite approaches, both claiming that their approach is better because of "usability". This seems like a paradox but upon closer examination we find that it's not for the simple reason that there is no one thing called usability. What is usable or not depends on a ton of different things.

Short term: maximal flexibility and power

As an example let's imagine a software tool that does something. What exactly it does is not important, but let's assume that it needs to be configurable and customizable. A classical approach to this is to create a small core and then add a configuration language for end users. The language may be Turing complete either by design or it might grow into Turing completeness accidentally when people add new functionality as needed. Most commonly this happens via some macro expansion functionality added to "reduce boilerplate".

Perhaps the best known example of this kind of tool is Make. At its core it is only a DAG solver and you can have arbitrary functionality just by writing shell script fragments direcly inside your Makefile. The underlying idea is to give developers the tools they need to solve their own special problems. These kinds of accidental programming environments are surprisingly common. I was told that a certain Large Software Company that provides services over the web did a survey on how many self created Turing complete configuration languages they have within their company. I never managed to get an actual number out of them, which should tell you that the real answer is probably "way too many".

Still, Make has stood the test of time as have many other projects using this approach. Like everything in life this architecture also has its downsides. The main one can be split in two parts, the first of which is this:
Even though this approach is easy to get started and you can make it do anything (that is what Turing completness does after all), it fairly quickly hits a performance ceiling. Changing the configuration and adding new features becomes ever more tedious as the size of the project grows. This is known as the Turing tarpit, where everything is possible but nothing is easy.

In practice this seems to happen quite often. Some would say every time. To understand why, we need to remember Jussi's law of programmers:
The problem with programmers is that if you give them the chance, they will start programming.
An alternative formulation would be that if you give programmers the possibility to solve their own problems, they are going to solve the hell out of their own problems. These sorts of architectures play the strings of the classical Unix hacker's soul. There are no stupid limitations, instead the programmer is in charge and can fine tune the solution to match the exact specifics and details of their unique snowflake project. The fact that the environment was not originally designed for general programming is not a deterrent, but a badge of merit. Making software do things it was not originally designed to do is what the original hacker ethic was all about, after all. This means that every project ends up solving the same common problems in a different way. This is a violation of the DRY principle over multiple projects. Even though every individual project only solves any problem once, globally they get solved dozens or even hundreds of times. The worst part is that none of these solutions work together. Taking to arbitrary Make projects and trying to build one as a subproject of another will not work and will only lead to tears.

Long term: maximal cooperation

An alternative approach is to create some sort of a configuration interface which is specific to the problem at hand but prevents end users from doing arbitrary operations. It looks like this.
The main thing to note is that at the beginning this approach is objectively worse. If you start using one of these at the beginning it is entirely possible that it can't solve your problem, and you can't make it work on your own, you need to change the system (typically by writing and sending pull requests). This brings enormous pressure to use an existing tweakable system because even though it might be "worse" at least it works.

However once this approach gets out of its Death Valley and has support for the majority of things users need, then networks effects kick in and everything changes. It turns out that perceived snowflakeness is very different from actual snowflakeness. Most problems are in fact pretty much the same for most projects and the solutions for those are mostly boring. Once there is a ready made solution, people are quite happy to use that, meaning that any experience you have in one project is immediately transferable to all other projects using the same tool. Additionally as general programming has been prevented, you can be fairly sure [1] that no cowboy coder has gone and reimplemented basic functionality just because they know better or, more probably, have not read the documentation and realized that the functionality they want already exists.

[1] Not 100% sure, of course, because cowboy coders can be surprisingly persistant.

Wednesday, May 13, 2020

The need for some sort of a general trademark/logo license

As usual I am not a lawyer and this is not legal advice.

The problem of software licenses is fairly well understood and there are many established alternatives to choose from based on your needs. This is not the case for licenses governing assets such as images and logos and especially trademarks. Many organisations, such as Gnome and the Linux Foundation have their own trademark policy pages, but they seem to be tailored to those specific organizations. There does not seem to be a kind of a "General project trademark and logo license", for lack of a better term, that people could apply to their projects.

An example case: Meson's logo

The Meson build system's name is a registered trademark. In addition it has a logo which is not. The things we would want to do with it include:
  • Allow people to use to logo when referring to the project in an editorial fashion (this may already be a legal right regardless, in some jurisdictions at least, but IANAL and all that)
  • Allow people to use the logo in other applications that integrate with Meson, in particular IDEs should be able to use it in their GUIs
  • People should be allowed to change the image format to suit their needs, logos are typically provided as SVGs, but for icon use one might want to use PNG instead
  • People should not be allowed to use the logos and trademarks in a way that would imply they are endorsing any particular product or service
  • People should not be allowed to create and sell third party merchandising (shirts etc) using the logo
  • Achieve all this while maintaining compliance with DFSG, various corporate legal requirements, established best practices for trademark protection and all that.
Getting all of these at the same time is really hard. As an example the Creative Commons licenses can be split in two based on whether they permit commercial use. All those that do permit it fail because they (seem to) permit the creation and sales of third party merchandise. Those that prohibit commercial are problematic because they prevent companies from shipping a commercial IDE product that uses the logo to identify Meson integration (which is something we do want to support, that is what a logo is for after all). This could also be seen as discriminating against certain fields of endeavour, which is contrary to things like the GPL's freedom zero and DFSG guideline #6.

Due to this the current approach we have is that logo usage requires individual permission from me personally. This is an awful solution, but since I am just a random dude on the Internet with a lawyer budget of exactly zero, it's about the only thing I can do. What would be great is if the entities who do have the necessary resources and expertise would create such a license and would then publish it freely so FOSS projects could just use it just as easily as picking a license for their code.

Monday, May 11, 2020

Enforcing locking with C++ nonmovable types

Let's say you have a struct with some variable protected by a mutex like this:

struct UnsafeData {
  int x;
  std::mutex ;
};

You should only be able to change x when the mutex is being held. A typical solution is to make x private and then create a method like this:

void UnsafeData::set_x(int newx) {
  // WHOOPS, forgot to lock mutex here.
  x = newx;
}

It is a common mistake that when code is changed, someone, somewhere forgots to add a lock guard. The problem is even bigger if the variable is a full object or a handle that you would like to "pass out" to the caller so they can use it outside the body of the struct. This caller also needs to release the lock when it's done.

This brings up an interesting question: can we implement a scheme which only permits safe accesses to the variables in a way that the users can not circumvent [0] and which has zero performance penalty compared to writing optimal lock/unlock function calls by hand and which uses only standard C++?

Initial approaches

The first idea would be to do something like:

int& get_x(std::lock_guard<std::mutex> &lock);

This does not work because the lifetimes of the lock and the int reference are not enforced to be the same. It is entirely possible to drop the lock but keep the reference and then use x without the lock by accident.

A second approach would be something like:

struct PointerAndLock {
  int *x;
  std::lock_guard<std::mutex> lock;
};

PointerAndLock get_x();

This is better, but does not work. Lock objects are special and they can't be copied or moved so for this to work the lock object must be stored in the heap, meaning a call to new. You could pass that in as an out-param but those are icky. That would also be problematic in that the caller creates the object uninitialised, meaning that x points to garbage values (or nullptr). Murpy's law states that sooner or later one of those gets used incorrectly. We'd want to make these cases impossible by construction.

The implementation

It turns out that this has not been possible to do until C++ added the concept of guaranteed copy elision. It means that it is possible to return objects from functions via neither copy or a move. It's as if they were automatically created in the scope of the calling function. If you are interested in how that works, googling for "return slot" should get you the information you need.  With this the actual implementation is not particularly complex. First we have the data struct:

struct Data {
    friend struct Proxy;
    Proxy get_x();

private:
    int x;
    mutable std::mutex m;
};

This struct only holds the data. It does not manipulate it in any way. Every data member is private, so the struct itself and its Proxy friend can poke them directly. All accesses go via the Proxy struct, whose implementation is this:

struct Proxy {
    int &x;

    explicit Proxy(Data &input) : x(input.x), l(input.m) {}

    Proxy(const Proxy &) = delete;
    Proxy(Proxy &&) = delete;
    Proxy& operator=(const Proxy&) = delete;
    Proxy& operator=(Proxy &&) = delete;


private:
    std::lock_guard<std::mutex> l;
};

This struct is not copyable or movable. Once created the only things you can do with it are to access x and to destroy the entire object. Thanks to guaranteed copy elision, you can return it from a function, which is exactly what we need.

The creating function is simply:

Proxy Data::get_x() {
    return Proxy(*this);
}

Using the result feels nice and natural:

void set_x(Data &d) {
    // d.x = 3 does not compile
    auto p = d.get_x();
    p.x = 3;
}

This has all the requirements we need. Callers can only access data entities when they are holding the mutex [1]. They do not and in deed can not release the mutex accidentally because it is marked private. The lifetime of the variable is tied to the life time of the lock, they both vanish at the exact same time. It is not possible to create half initialised or stale Proxy objects, they are always valid. Even better, the compiler produces assembly that is identical to the manual version, as can be seen via this handy godbolt link.

[0] Unless they manually reinterpret cast objects to char pointers and poke their internals by hand. There is no way to prevent this.

[1] Unless they manually create a pointer to the underlying int and stash it somewhere. There is no way to prevent this.

Monday, May 4, 2020

Let's talk meta

In my previous blog post about old techs causing problems in getting new developers on board. In it I had the following statement:
As a first order approximation, nobody under the age of 35 knows how to code in Perl, let alone would be willing to sacrifice their free time doing it.
When I wrote this, I spent a lot of time thinking whether I should add a footnote or extra sentence saying, roughly, that I'm not claiming that there are no people under 35 who know Perl, but that it is a skill that has gotten quite rare compared to ye olden times. The reason for adding extra text is that I feared that someone would inevitably come in and derail the discussion with some variation of "I'm under 35 and I know Perl, so the entire post is wrong".

In the end I chose not to put the clarification in the post. After all it was written slightly tongue-in-cheek, and even specifically says that this is not The Truth (TM), but just an approximation. The post was published. It got linked on a discussion forum. One of the very first comments was this:


This is what makes blogging on the Internet such a frustrating experience. Every single sentence you write has to be scrutinised from all angles and then padded and guarded so its meaning can not be sneakily undermined in this way. This is tiring, as it is difficult to get a good writing flow going. It may also make the text less readable and enjoyable. It makes blogging less fun and thus people less likely to want to do it.

An alternative to this is to not ready any comments. This works, but then you are flying blind. You can't tell what writing is good and which is not and you certainly can't improve. The Internet has ruined everything.

Meta-notes

Contrary to the claim made above, the Internet has not, in fact, ruined everything. The statement is hyperbole, stemming from the author's feelings of frustration. In reality the Internet has improved the quality of life of most people on the earth by a tremendous amount and should be considered as one of the greatest inventions of mankind.

"Ye olden times" was not written as "├że olden times" because in the thorny battle between orthographic accuracy and readability the latter won.

The phrase "flying blind" refers neither to actual flying nor to actual blindness. It is merely a figure of speech for any behaviour that is done in isolation without external feedback. You should never operate any vehicle under any sort of vision impairment unless you have been specifically trained and authorized to do so by the appropriate authorities.

Meta-meta-notes

The notes above were not written because the author thought that readers would take the original statements literally. Instead they are there to illustrate what would happen if the defensive approach to writing, as laid out in the post, were taken to absurd extremes. It exists purely for the purposes of comedy. As does this chapter.