Sunday, April 19, 2020

Do humans or compilers produce faster code?

Modern optimizing compilers are truly amazing. They have tons and tons of tricks to make even crappy code run incredibly fast, faster than most people could write by hand. This has led some people to claim that program optimization can be left to compilers, since they seem to be a lot better at it. This usually sparks a reply from people at the other end of the spectrum who say that they can write faster code by hand than any compiler, which makes compilers mostly worthless when performance actually matters.

In a way both of these viewpoints are correct. In another way they are both wrong. To see how, let's split this issue into two parts.

A human being can write faster code than a compiler for any given program

This one is fairly easy to prove (semi)formally. Suppose you have a program P written in some programming language L, and suppose the compiler's output for P runs faster than any hand-written version. A human being can look at the assembly output of that program and write an equivalent source version in straight C, matching the compiler. Usually when doing this you find some specific optimization you can add to make the hand-written version even faster.

Even if the compiler's output were proven optimal (as in the case of superoptimization), it can still be matched by copying the output into your own program as inline assembly. Thus, for any given program, a human can always at least match, and often beat, the compiler.

A human being cannot write faster code than a compiler for every program

Let's take something like Firefox. We know from the previous section that one could eschew complex compiler optimizations, rewrite it in straight C or equivalent and end up with better performance. The downside is that you would die of old age before the task was finished.

Human beings have a limited shelf life. There are only so many times they can press a key on the keyboard before they expire. Rewriting Firefox in hand-optimized straight C so that it runs faster than the current version compiled with all optimizations enabled is simply too big a task.

Even if by some magic you could do this, during the rewrite the requirements on the browser would change. A lot. The end result would be useless until you added all the new functionality that appeared in the meantime. This would lead to an eternal chase after the tail lights of the existing project.

And even if you could do that, optimizing compilers keep getting better, so you'd need to go through your entire code base regularly and add the same optimizations by hand to keep up. All of these things could be done in theory, but they are completely impossible in practice.

The entire question is poorly posed

Asking whether compilers or humans write faster code is kind of like asking which one is "bluer", the sea or the sky. Sure, you could spend years debating the issue on Twitter without getting anywhere, but it's not particularly interesting. A more productive approach is to instead ask: "given the requirements, skills and resources I have available, should I hand-optimize this particular routine or leave it to the compiler?"

If you do this you rediscover the basic engineering adage: you get the best bang for the buck by relying on the compiler by default and doing things by hand only for bottlenecks that are discovered by profiling the running application.

PS. Unoptimized C is slow, too

Some people think that when they write C it is "very close" to the underlying assembly and thus does not benefit much from compiler optimizations. This has not been true for years (possibly decades). The performance difference between no optimization and -O2 can be massive, especially for hot inner loops.

When people say that they can write code that is faster than the compiler-optimized version of the same algorithm in some other language, that is not what they are actually saying (unless they are writing 100% pure assembly by hand [0]). What they are really saying is: "I can take any algorithm implementation, write it with an alternative syntax and, when both of these are run through their respective optimizing compilers, end up with a program that runs faster."

[0] Which does happen sometimes, especially for SIMD code.

Saturday, April 11, 2020

Your statement is 100% correct but misses the entire point

Let's assume there is a discussion going on on the Internet about programming languages. One of the design points that comes up is garbage collection. One participant mentions its advantages with something like this:
Garbage collectors are nice and save a lot of work. If your application does not have strict latency requirements, not having to care about memory management is liberating and can improve developer efficiency by a lot.
This is a fairly neutral statement that most people would agree with, even if they work on code that has strict real time requirements. Yet, inevitably, someone will present this counterpoint.
No! If you have dangling references memory is never freed and you have to fix that by doing manual memory management anyway. Garbage collectors do not magically fix all bugs.
If you read the reply carefully you'll notice that every asserted statement in it is true. That is what makes it so frustrating to argue against. Most people with engineering backgrounds are quite willing to admit they are wrong when presented with evidence that their statements are incorrect. This does not cover everyone, of course, as some people are quite willing to violently disagree with any and all facts that conflict with their pre-held beliefs. We'll ignore those people for the purposes of this post.

While true, that single sentence ignores all of the larger context of the issue, which contains points like the following:

  • Out-of-memory errors caused by dangling references are rare (maybe 1 in 10 programs?), whereas manual memory management bugs like use-after-free, double free and off-by-one errors are very common (100–1000 in every program).
  • Modern GC runtimes come with very good heap profilers, so finding dangling references is a lot easier than debugging stack corruption.
  • Being able to create things on a whim and just drop them on the floor makes programmers a lot more productive than forcing them to micromanage the complete life cycle of every single resource.
  • Even if you do encounter a dangling reference issue, fixing it probably takes less time than would have been spent fixing memory corruption issues in a GC-less version of the same app.
In brief, the reply is true but misses the entire point of the comment it is responding to. This is sadly common in Internet debates. Let's see some examples.

Computer security

A statement like this:
Using HTTPS on all web traffic is good for security and anonymity.
 might be countered with something like this:
That provides no real security; if the NSA wants your data they will break into your apartment and get it.
This statement is again absolutely true. On the other hand if you are not the leader of a nation state or do regular business with international drug cartels, you are unlikely to be the target of a directed NSA offensive.

If you think that this is a stupid point that nobody would ever make, I agree with you completely. I have also seen it used in the real world. I wish I hadn't.

Bugs are caused by incompetents

High level programming languages are nice.
Programming languages that guard against buffer overruns are great for security and ease of development.
But not for everyone.
You can achieve the exact same thing in C, you just have to be careful.
This is again true. If every single developer on a code base were 100% focused and 100% careful 100% of the time, then bug-free code would be possible. Reality has shown time and time again that this does not happen: human beings are simply not capable of operating flawlessly for extended periods of time.

Yagni? What Yagni?

There's the simple.
Processing text files with Python is really nice and simple.
And not so simple.
Python is a complete joke, it will fail hard when you need to process ten million files a second on an embedded microcontroller using at most 2 k of RAM.
Yes. Yes it will. In that use case it would be the wrong choice. You are absolutely correct. Thank you for your insight, good sir, here is a shiny solid gold medal to commemorate your important contribution to this discussion.

What could be the cause of this?

The one thing school trains you for is that being right is what matters. If you get the answers right on your test, you get a good grade. Get them wrong and you don't. Maybe this frame of mind sticks once you leave school, especially given that most people who post these kinds of comments seem to be from the "smarter" end of the spectrum (personal opinion, not based on any actual research). In the real world, being right is not a merit by itself. In any debate being right is important, of course, but the much more important quality is being relevant. That requires understanding the wider context and possibly admitting that something that is the most important thing in the world to you personally might be completely irrelevant to the issue at hand.

Being right is easy. Being relevant is extremely difficult.

Sunday, April 5, 2020

Meson manual sales status and price adjustment

The sales dashboard of the Meson manual currently looks like this.

It splits up quite nicely into three parts. The first one is the regular sales from the beginning of the year, which is on average less than one sale per day.

The second part (marked with a line) indicates when I was a guest on CppCast talking about Meson and the book. As an experiment I created a time limited discount coupon so that all listeners could buy it with €10 off. As you can tell from the graph it did have an immediate response, which again proves that marketing and visibility are the things that actually matter when trying to sell any product.

After that we have the "new normal", which means no sales at all. I don't know if this is caused by the coronavirus isolation or whether this is the natural end of life for the product (hopefully the former but you can never really tell in advance).

Price reduction

Thus, effective immediately, the price of the book has been reduced to €24.95. You can purchase it from the official site.