Sunday, October 14, 2018

Things Microsoft could do to make the life of developers easier

A few weeks ago I was at CppCon. One of the presentations was about new stuff in the Visual Studio compiler. The presentation had this slide fairly early on.

Presentation screenshot saying the mission of the C++ team is to make the lives of all C++ developers on the planet better.

If that is truly their goal, then here are some things they could do. (Some not specifically about C++ but still related.)

Proper RPATH support

If you have a project that uses shared libraries and you want to run it directly from the build directory, then you really need to have rpath or something similar to it. A simple way of explaining it is that you add a piece of text inside an executable saying "when running me, search for foo.dll in directory ../baz/lib".

Since this is not natively supported, people need to resort to awful hacks to make it work:
  • adjusting the PATH envvar to contain the dirs where the dlls are (because PATH is used to look up dlls)
  • copying all files to the same directory before running
  • creating a manifest file defining an internal bundle, creating a subdirectory and copying all dependency dlls there
  • statically linking everything
  • mandating a project layout where everything is in one subdirectory
All of these are nasty hacks. It should be possible to run programs straight from the build dir without needing to copy anything or change envvars.
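
To show what the first hack in the list above looks like in practice, here is a minimal Python sketch. The function name env_with_dll_dirs and the directory names in the usage note are made up for illustration; the only real mechanism involved is that Windows searches PATH when looking up dlls.

```python
import os


def env_with_dll_dirs(dll_dirs, base_env=None):
    """Return a copy of the environment with dll_dirs prepended to PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get('PATH', '')
    # Put the dll directories first so they win over anything already in PATH.
    parts = list(dll_dirs) + ([old_path] if old_path else [])
    env['PATH'] = os.pathsep.join(parts)
    return env
```

A test runner would then call something like subprocess.run([exe], env=env_with_dll_dirs(['build/lib', 'build/sub/lib'])) for every executable it launches from the build dir, which is exactly the kind of bookkeeping an rpath would make unnecessary.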

During CppCon I was told that with the very latest Windows 10 it should be possible to do this "somehow" but googling for this has not uncovered any instructions.

Spawning a process using an array

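The way to spawn processes in Windows is using the CreateProcess function. Note that it takes a command string, not an array. The string is handed to the new process as is, and that process (usually its C runtime startup code) then parses it into a command array. The CreateProcess documentation page does not document how the parsing is done. For programs using the Microsoft C runtime the rules are the ones documented for parsing C command-line arguments, which are not the same as what cmd.exe does.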

What this means is that it is impossible to spawn a process on Windows without needing to jump through massive quoting hoops. For example, suppose you want to write a Ninja file to call a specific command. Because of this problem, Ninja does not support arrays natively but instead requires every user to write a single command string, which leads to double quoting. First you need to quote the array to be a Windows process spawning command and then you need to quote that according to Ninja's quoting rules.

And then it gets terrible.

The command line length limitation on Windows is ridiculously short (32,767 characters for CreateProcess, and only 8,191 when going through cmd.exe). Even fairly simple link commands are too long. Thus you need to write the actual command to a response file, which the command then reads and parses on its own. Since every program has its own parsing and splitting code, you may find that you need to quote things differently depending on whether you are using the command line or a response file. You get one guess whether some (but not all) programs coming from Unix parse their response files according to Unix shell rules, even on Windows.
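
As a rough sketch of what every build tool ends up doing, here is the response file dance in Python. The @file syntax is the convention used by the MSVC tools and GCC. Everything else is an assumption made for illustration: the helper name with_response_file is invented, and the file is quoted with subprocess.list2cmdline (an undocumented helper in Python's standard library that follows the Microsoft C runtime rules), which is only correct for tools that parse their response files that way.

```python
import os
import subprocess
import tempfile

# CreateProcess accepts at most 32,767 characters in its command string.
MAX_COMMAND_LENGTH = 32767


def with_response_file(command_array, rsp_dir):
    """Return a command array, moving the arguments to a response file if too long."""
    command_line = subprocess.list2cmdline(command_array)
    if len(command_line) < MAX_COMMAND_LENGTH:
        return command_array
    # Quoting assumption: the tool parses the file like a Windows command line.
    # A tool that follows Unix shell rules would need different quoting here.
    fd, rsp_path = tempfile.mkstemp(suffix='.rsp', dir=rsp_dir, text=True)
    with os.fdopen(fd, 'w') as rsp:
        rsp.write(subprocess.list2cmdline(command_array[1:]))
    return [command_array[0], '@' + rsp_path]
```

Note the comment in the middle: the sketch has to guess how the receiving program splits its response file, and that guess is the part that differs from tool to tool.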

Now there might be a few people out there who just got outraged, because msvcrt does in fact have functions to spawn processes with arrays. They are a complete lie. Here is a rough pseudocode representation of how they are implemented:

def spawn_process(command_array):
    command_line = ' '.join(command_array)
    return CreateProcess(command_line)
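
To make the consequence concrete, here is a small runnable Python comparison (this is not the actual msvcrt code). naive_join mirrors the pseudocode above, while quoted_join uses subprocess.list2cmdline, an undocumented helper in Python's standard library that quotes arguments according to the Microsoft C runtime rules. Both function names are mine.

```python
import subprocess


def naive_join(command_array):
    # Same idea as the pseudocode: glue the arguments together with spaces.
    return ' '.join(command_array)


def quoted_join(command_array):
    # Quote each argument so that a program using the Microsoft C runtime
    # splits the string back into the original array.
    return subprocess.list2cmdline(command_array)


args = ['prog.exe', 'my file.txt', 'plain']
print(naive_join(args))   # prog.exe my file.txt plain
print(quoted_join(args))  # prog.exe "my file.txt" plain
```

With the naive version the child process sees my and file.txt as two separate arguments, so the array you passed in is not the array the program receives.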

Support GCC's destructor extension in plain C

RAII is awesome. It is, in fact, so awesome that GCC ships an extension, the cleanup variable attribute (__attribute__((cleanup))), to use it with plain C. It is used by many plain C projects such as GLib and systemd. I have spoken to many C developers: they really love that feature and absolutely hate that they can't use it in code that has to support MSVC.

Adding this support would be great and make the world a better place in several ways including:
  • you can use libraries that use this feature as dependencies when building with MSVC
  • multiplatform projects can start using destructors freely
  • all the millions of lines of C code that exist in the world (and which will not be rewritten any time soon) can be made iteratively safer and more reliable
Eventually it would be nice to get this feature in the C standard, but that is unlikely to happen any time soon.

Optimize the performance of MSBuild

Running the test suite of Meson with the Visual Studio compiler takes roughly 6-7 minutes when using the Ninja backend and 14-18 minutes when using the MSBuild backend. Granted, this is a worst case scenario of running many small independent builds in a row, but it is still frustratingly slow. The same slowness shows up when using the Visual Studio IDE. After pressing Ctrl+Shift+B there is usually a noticeable lag until any compilation actually starts.

Kill the need for vcvarsall.bat and provide parallel installable compilers

Visual Studio compilers are not in PATH by default. You have to either start a special shell or run a magic bat file from a magic directory that sets up the environment so that the compilers work. If you go looking in the installed directory there are many different directories, all of which contain an executable called cl.exe. Which one you run depends on PATH settings, thus you can only run one compiler at a time. This makes it really difficult to, for example, run multiple different VS versions (15, 17, native, cross, etc.) from a single script.

This same problem was solved on the Unix side ages ago. The trick is to provide many executables with different names. For example cl15-x86.exe, cl17-arm.exe and cl17-x64.exe. Each of these executables would set up the equivalent of vcvarsall.bat for its own process and then forward the actual compilation to the compiler, wherever it may be hidden in the file system hierarchy. These binaries could then be put in one single path location and they could be used from any command prompt, even in parallel. This is particularly useful for cross compilation projects where you need to build a code generator with the native compiler and then use it to generate source code for the cross compiler.
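
The logic of such a wrapper is tiny. Here is a sketch in Python of what a hypothetical cl17-x64 could do. All the paths are invented placeholders, and I am assuming that PATH, INCLUDE and LIB are the variables that matter; the real values are whatever vcvarsall.bat sets up for that toolchain.

```python
import os
import subprocess

# Hypothetical install layout; not the real Visual Studio directory structure.
TOOLCHAIN = {
    'compiler': r'C:\hypothetical\vs17\bin\x64\cl.exe',
    'path_dirs': [r'C:\hypothetical\vs17\bin\x64'],
    'include_dirs': [r'C:\hypothetical\vs17\include'],
    'lib_dirs': [r'C:\hypothetical\vs17\lib\x64'],
}


def toolchain_env(toolchain, base_env):
    """Return a copy of base_env set up for one specific toolchain."""
    env = dict(base_env)
    old_path = env.get('PATH', '')
    # ';' is the Windows list separator for these variables.
    path_parts = toolchain['path_dirs'] + ([old_path] if old_path else [])
    env['PATH'] = ';'.join(path_parts)
    env['INCLUDE'] = ';'.join(toolchain['include_dirs'])
    env['LIB'] = ';'.join(toolchain['lib_dirs'])
    return env


def main(argv):
    # Forward all arguments unchanged to the real compiler and
    # return its exit code to the caller.
    env = toolchain_env(TOOLCHAIN, os.environ)
    return subprocess.call([TOOLCHAIN['compiler']] + argv[1:], env=env)
```

The wrapper would end with sys.exit(main(sys.argv)). Because the environment is changed only for the child process, several such wrappers for different versions and architectures can be used from the same shell at the same time.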

Have you reported these as bugs upstream?

No. Nothing in this blog post is new; these are all issues that have been known for 20+ years and have most likely been reported to Microsoft dozens, if not hundreds, of times. The fact that these things have not been fixed is a question of corporate priorities. As a random-non-Windows-using-dude-on-the-internet I don't really have any influence on those.
