Thursday, January 27, 2022

Building a part of LibreOffice on Windows using only Meson and WrapDB

In earlier posts (starting from this one) I ported LibreOffice's build system to Meson. The aim has not been to be complete, but to compile and link the main executables. On Linux this is fairly easy as you can use the package manager to install all dependencies (and there are quite a few of them).

One of the goals of WrapDB has been to provide external dependencies automatically on platforms lacking system dependencies (and even on Linux if you need newer dependency versions than your distro provides). It is already being used by many people and from what I've been told it works fairly well. When it comes to bigger projects like LO, there have been two major opposing views:

  1. That just can't work.
  2. Even if it did work, converting all dependencies would be way too much work so it could never be done.

There is only one way to counter opinions such as these and that is to do the actual work. So I set out to build LO Writer and all its dependencies using nothing but Visual Studio, Meson and WrapDB. There is to be no MinGW, Msys, Cygwin or any other unix compatibility layer.

The porting work was straightforward. Start compilation and wait for an error, which is typically caused by a missing dependency. Then port that dependency to Meson, submit it to WrapDB and continue.

Did it succeed?

Yes. In a way. Here's a screenshot of the extracted subprojects that were downloaded via WrapDB.

Does it run?

Lol no. It did not even run on Linux properly, because LO requires a ton of configuration files to be installed "just right" in order to start and that part had never been compiled.

Does it compile?

It does on my machine. It probably won't do so on yours. Some of the deps I used could not be added to WrapDB yet or are missing review. If you want to try, the code is here.

The problematic (from a build system point of view) part of compiling an executable and then running it to generate source code for a different target works without problems. In theory you should be able to generate VS project files and build it with those, but I only used Ninja because it is much faster.

What was hard?

The nemesis of any porting effort of LO is the i18npool subdirectory. It builds programs that convert hyphenation rules from XML files to code. It uses the ICU library for that. The basic problem of Windows is that there is no concept of RPATH (unless you fake it in) so if your binaries use shared libraries then you can't just run them. Fortunately Meson handles this transparently by wrapping binary invocations and doing all the PATH mangling needed to make things work.
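To make the PATH trick concrete, here is a minimal sketch of the general idea in Python. This is not Meson's actual implementation; the function names and the directory arguments are made up for illustration. It only shows the principle: before running a freshly built tool, prepend the directories holding its shared libraries to PATH so that the Windows loader can find the DLLs.

```python
import os
import subprocess
import sys


def env_with_library_dirs(library_dirs, base_env=None):
    """Return a copy of the environment with library_dirs prepended to PATH.

    Hypothetical helper for illustration, not part of Meson.
    """
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    parts = [str(d) for d in library_dirs]
    if old_path:
        parts.append(old_path)
    # os.pathsep is ';' on Windows and ':' on unix-like systems.
    env["PATH"] = os.pathsep.join(parts)
    return env


def run_build_tool(command, library_dirs):
    """Run a tool from the build tree so that it can locate its DLLs."""
    env = env_with_library_dirs(library_dirs)
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Stand-in for a generator executable: just run the Python interpreter.
    run_build_tool([sys.executable, "-c", "print('tool ran')"], ["builddir/icu"])
```

On Linux the same problem is usually solved with RPATH entries in the binary, so a wrapper like this matters mostly on Windows.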

However ICU's hyphenation programs are special. They also need to access some data files. On a system-wide install they are read from the common directories, but they are not available when building yourself. There are command line options to point the programs to the proper place but at the time I got frustrated and just copied the pregenerated source file from a Linux build and called it a day.

I had to do the same thing for the outputs of Flex, Bison and gperf for similar reasons. These are all fixable, but some of the generator bits also use cthulhuan shell pipelines to do "stuff". These would need to be converted to Python for portability (and also readability).
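As an illustration of what such a conversion could look like, here is a small sketch. The pipeline it replaces is hypothetical and not taken from the LO build scripts: it does roughly what `grep -v -e '^#' -e '^$' input | sort -u > output` would do, meaning it drops comments and blank lines, de-duplicates and sorts. The function names are made up as well.

```python
import sys


def unique_sorted_entries(lines):
    """Drop comment and blank lines, then return the unique entries sorted."""
    entries = set()
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        entries.add(stripped)
    return sorted(entries)


def convert(input_path, output_path):
    """Read input_path, filter it and write the result to output_path."""
    with open(input_path, encoding="utf-8") as infile:
        entries = unique_sorted_entries(infile)
    with open(output_path, "w", encoding="utf-8") as outfile:
        for entry in entries:
            outfile.write(entry + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

A script like this runs the same way on every platform that has Python, which Meson already requires, so no unix tools are needed at build time.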

Boost

LO uses a lot of Boost. I suspected this to be a problem but fortunately it was not. Most code uses the header-only parts so those are all handled by a single declare_dependency. There were a couple of uses of libraries that require actual code. One of them was for Boost Filesystem. Assuming the code does not do anything weird, that could probably be fixed to use std::filesystem instead.

The Boost code is copied in the LO repo for now. It is not added to WrapDB yet as it is quite incomplete and only builds for this use case. Still, Boost is a popular dependency so maybe having it in WrapDB would be useful, even in an incomplete state.

Could a full port be made?

Let's say that thus far there has been nothing to indicate that it would not work. The downside is that it would be a fair amount of work and it is not the cool kind where you get to write new features but instead it is the equivalent of ditch digging. Even more problematically it probably could not be done by "one person on their own" but would instead require buy-in and cooperation from a large group of developers. As people are perennially busy, getting the necessary resources would probably be challenging.

All of that being said, there is a GSoC project for doing a porting experiment. So if you are the sort of person who won't shy away from a challenge, you might consider applying.

Bonus question

How many XML parsers does LO have?

The first one is libxml. The second one is Expat. The third one is Boost's Property tree, which has its own parser (according to the docs at least, dunno if it is used in this code). The fourth one is the bunch of Awk regexps that are used in the build scripts buried inside Makefiles.

There may be more.

2 comments:

  1. Over two years later, I still think LibreOffice should start working on a new build system. Autotools is in maintenance mode and lacks the manpower to follow the development of C and C++ compilers in recent years. For the users (here: people interested in building and developing LibreOffice) the current build system is overly complicated, difficult to debug, slow in building, and uses obscure languages like m4.
    I am a big fan of CMake. Nevertheless, thanks for your effort to try to convince the LibreOffice guys by providing some facts they could base a decision on. Unfortunately, they did not check your insights but consider Autotools good enough. *sigh*
    https://bugs.documentfoundation.org/show_bug.cgi?id=113121#c4
