Thursday, December 31, 2015

Are exceptions slower than error objects

One stated reason for using error objects/error codes instead of exceptions is that they are faster. The real question, then, are they. I wrote a bunch of C and C++ test applications and put them up on Github.

The tests are simple. They call a function that either does nothing or throws an exception/returns an error object depending on its argument in a tight loop. The implementation is put in a separate file and LTO is not used so the compiler can't inline the code and thus optimize away the checks.

The error objects used mimic what GLib's GError does. That is, it contains a dynamically allocated string describing the error (a dummy string in our case). On all error cases the created error object needs to be deallocated properly.

A sample run on OSX looks like this (output directly from Meson's ninja benchmark command):

  1/6 exceptions, not thrown                  OK      0.00520 s +- 0.00009 s
  2/6 exceptions, thrown                      OK      1.65206 s +- 0.01488 s
  3/6 error object, not created               OK      0.00563 s +- 0.00017 s
  4/6 error object, created                   OK      0.19703 s +- 0.00769 s
  5/6 5/loop, exceptions, no throws           OK      0.00528 s +- 0.00017 s

  6/6 5/loop, error object, no errors         OK      0.00669 s +- 0.00072 s

Here we see that when errors happen exceptions (case 2) are 10x slower than error objects (case 4). We can also see that when errors do not happen, exceptions (case 1) are measurably faster than exception objects (case 3).

Cases 5 and 6 are the same as 1 and 3, respectively, except that they call the function five times in a single loop iteration. This does not change the end result, exceptions are still a bit faster.

I ran this test on a bunch of machines and compilers, such as OSX, Linux, Windows, gcc, clang, vs, Raspberry Pi and so on and the results were consistent. When not thrown, exceptions were never slower and usually were a bit faster than error objects. The only exception was MinGW on Windows XP, where exceptions were noticeably slower even when not thrown. Thrown exceptions were always roughly 10x slower than error objects.

The reason for this is (probably, I haven't looked at the resulting assembly) the fact that the error object code always has a branch operation that slows down the CPU. Exceptions have an optimization for this case which avoids the need for a comparison.

Pulling all that together we can ask whether you should use exceptions in high performance code or not. And the answer is maybe. If you have a case where exceptions are rare, then they might very well be faster. This is very data dependent so if this an issue that really matters to you, run your own benchmarks and make the decision based on measurements rather than wild guessing.

Monday, December 21, 2015

Always use / for directory separation

It is common wisdom that '\' is the path separator on Windows and '/' is the separator on unixlike operating systems. It is also common wisdom that you should always use the native path separator in your programs to minimise errors.

The first one of these is flat out wrong and the second one can be questioned.

On Windows both '\' and '/' work as directory separators. You can even mix them within one path. The end result is that '/' is the truly universal directory separator, it works everywhere (except maybe classic Mac OS, which is irrelevant).

So the question then is whether you should use '\' or '/' when dealing with paths on Windows. I highly recommend always using '/'. The reason for this is a bit unexpected but makes perfect sense once you get it.

The difference between these two characters is that '\' is a quoting character whereas '/' is not. You probably have to pass your path strings through many systems (bat files, persist to disk, files, sql, etc) using libraries and frameworks outside your control. They will most likely have bugs in the way they handle quotes. You may need to quote your quote chars to get them to work. You might need to quote them differently in different parts of your program. Quoting issues are the types of bugs that strike you at random, usually in production and are really difficult to hunt down.

So always use '/' if possible. It will survive any serialisation/deserialisation combination without problems since '/' is not a special character.