Saturday, January 29, 2022

Porting the LO Windows build to macOS

In the last blog post we looked at compiling a subsection of LibreOffice on Windows using nothing but Meson and WrapDB. Now that it is working (somewhat) the obvious followup question is how much work would it be to make that build on macOS as well.

Not all that much, it turns out. The work consisted mostly of adding some defines, adding platform-specific source files and the like. It took me less than a day. A sizable fraction of that was spent waiting for my old 4-core laptop to finish compiling the code.

Thursday, January 27, 2022

Building a part of LibreOffice on Windows using only Meson and WrapDB

In earlier posts (starting from this one) I ported LibreOffice's build system to Meson. The aim has not been to be complete, but to compile and link the main executables. On Linux this is fairly easy as you can use the package manager to install all dependencies (and there are quite a few of them).

One of the goals of WrapDB has been to provide external dependencies automatically on platforms lacking system dependencies (and even on Linux if you need newer dependency versions than your distro provides). It is already being used by many people and from what I've been told it works fairly well. When it comes to bigger projects like LO, there have been two major opposing view:

  1. That just can't work.
  2. Even if it did work, converting all dependencies would be way too much work so it could never be done.
There is only one way to counter opinions such as these and that is to do the actual work. So I set out to build LO Writer and all its dependencies using nothing but Visual Studio, Meson and WrapDB. There is to be no MinGW, Msys, Cygwin or any other unix compatibility layer.

The porting work was straightforward. Start compilation, wait for an error, typically due to missing dependencies. Then port that to Meson, submit it to WrapDB and continue.

Did it succeed?

Yes. In a way. Here's a screen shot of the extracted subprojects that were downloaded via WrapDB.

Does it run?

Lol no. It did not even run on Linux properly, because LO requires a ton of configuration files to be installed "just right" in order to start and that part had never been compiled.

Does it compile?

It does on my machine. It probably won't do so on yours. Some of the deps I used could not be added to WrapDB yet or are missing review. If you want to try, the code is here.

The problematic (from a build system point of view) part of compiling an executable and then running it to generate source code for a different target works without problems. In theory you should be able to generate VS project files and build it with those, but I only used Ninja because it is much faster.

What was hard?

The nemesis of any porting effort of LO is the i18npool subdirectory. It builds programs that convert hyphenation rules from XML files to code. It uses the ICU library for that. The basic problem of Windows is that there is no concept of RPATH (unless you fake it in) so if your binaries use shared libraries then you can't just run them. Fortunately Meson handles this transparently by wrapping binary invocations and does all the needed PATH magling needed to make things work.

However ICU's hyphenation programs are special. They also need to access some data files. On a system-wide install they are read from the common directories, but they are not available when building yourself. There are command line options to point the programs to the proper place but at the time I got frustrated and just copied the pregenerated source file from a Linux build and called it a day.

I had to do the same thing for the outputs of Flex, Bison and gperf for similar reasons. These are all fixable, but some of the generator bits also use cthulhuan shell pipelines to do "stuff". These would need to be converted to Python for portability (and also readability).

Boost

LO uses a lot of Boost. I suspected this to be a problem but fortunately it did not. Most code uses the header-only parts so those all get set by a single declare_dependency. There were a couple of uses of libraries that require actual code. One of them was for Boost Filesystem. Assuming the code does not do anything weird, that could probably be fixed to use std::filesystem instead.

The Boost code is copied in the LO repo for now. It is not added to WrapDB yet as it is quite incomplete and only builds for this use case. Still, Boost is a popular dependency so maybe having it in WrapDB would be useful, even in an incomplete state.

Could a full port be made?

Let's say that thus far there has been nothing to indicate that it would not work. The downside is that it would be a fair amount of work and it is not the cool kind where you get to write new features but instead it is the equivalent of ditch digging. Even more problematically it probably could not be done by "one person on their own" but would instead require buy-in and cooperation from a large group of developers. As people are perennially busy, getting the necessary resources would probably be challenging.

All of that being said, there is a GSoC project for doing a porting experiment. So if you are the sort of person who won't shy away from a challenge, you might consider applying.

Bonus question

How many XML parsers does LO have?

The first one is libxml. The second one is Expat. The third one is Boost's Property tree, which has its own parser (according to the docs at least, dunno if it is used in this code). The fourth one is the bunch of Awk regexps that are used in the build scripts buried inside Makefiles.

There may be more.

Saturday, January 8, 2022

Portability is not sufficient for portability


A forenote

This blog post has some examples of questionable quality. This should not be meant as an attack on those projects. The issues listed here are fairly widespread, these are just the examples I ran into while doing other work.

What is meant by portability?

Before looking into portable software, let's first examine portability from a hardware perspective. When you ask most people what they consider a "portable computer", they'll probably think of laptops or possibly even a modern smartphone. But what about this:

This is most definitely a computer (the one I'm using to write this blog post, in fact), but not portable. It weighs something on the order of 10 kilos and it is too big to comfortably wrap your hands around.

And yet, I have carried this computer from one end of the Helsinki metropolitan region to another. It took over an hour on a train and a subway. When I finally got it home my arms were so exhausted that for a while I could not even lift them up and all muscles in them were sore for several days. I did use a helper carry strap, but it did not help much.

So in a way, yes, this computer is portable. It's not really designed for it, the actual transport process is a long painful slog and if you make an accidental misstep and bump it against a wall you run the risk of breaking everything inside. But it is a "portable computer" as a single person can carry it from one place to another using nothing but their own muscles.

Tying this to software portability

There is a lot of software out there that claims to be "portable" but can only be said to be that in the same way as the computer shown above is "portable". For the rest of the post we're only going to focus on portability to Windows.

Let's say a project has a parser that is built with Lex and Bison. Thus you need to have those programs during compilation. Building them from source is problematic on Windows (because of, among other things, Autotools) so it would be nice to get some prebuilt binaries for them. After a bit of googling you might find this page which provides Windows binaries. That has last been updated in 2004. So no.

You could also install Msys2 and get the binaries with Pacman. If you are using Visual Studio and just want to build the thing, installing a whole separate userspace system and package manager just to get two executables seems like bit of an overkill. Thinking about it further you might realize that you could install Msys on some other machine, get the executables, copy them and their direct dependency DLLs to your machine and put them in PATH. If you try this, the binaries segfault on run, probably because they can't access their localisation files that are "somewhere". 

Is this piece of software portable to Windows? Yes it is, in the "desktop PC is portable" sense, but definitely not in the "a laptop is portable" sense.

As an another example let's look at the ICU project. It claims to be highly portable, and it kind of is, here is a random snippet from their highly portable Makefile.

I don't know about you, but just looking at that … thing gives me a headache. If you ever need to do something the existing functionality does not provide, then trying to decipher what that thing is doing is an exercise in masochism. For Windows this is relevant, because Visual Studio only ships with nmake, which is not at all compatible with Make so you need to decrypt absolutely everything.

Again, this is portable to Windows, you just need to prebuild it with MinGW or using the provided VS solution files, copying the libraries from one place to another and using them. This is very much "desktop PC portable" again.

Sometimes you don't get even that. Take for example the liblangtag project. It is a fairly typical dependency library that provides a single shared library. It even hides its symbols and only exports those belonging to the public API. Sadly it does this using Libtool magic postprocessing. On Windows you have to annotate exported symbols with magic markers. Thus it is actually impossible to build a shared library properly on VS without making source code changes[1]. Thus you have to go the Mingw build route here. But that is again "portable" as in if you spend a ton of time and effort then you can sorta kinda make it work in a rubegoldbergesque way.

Being more specific

Due to various reason I have had to deal with the innards of libraries of different vintage. Fairly often the experience has been similar to dragging my desktop computer across town: arduous, miserable and exhausting. It is something I would wish upon my worst enemy and also upon most of my lesser enemies. It would serve them right.

In my personal opinion saying that some piece of code is portable should imply some basic ease of use. If you need to spend time fighting with it to make it work on an uncommon platform, toolchain or configuration then the thing is not really portable. It also blocks adoption, because if some library is a massive pain to use, people will prefer to reimplement the functionality or use some other library, just to get away from the pain.

Since changing the generally accepted meanings of words is unlikely to work, this won't happen. So in the mean time when you are talking about portability with someone else, do be sure to specify whether you mean "portable as in a desktop gaming PC" or "portable as in a laptop".

How this blog post came about

Some years ago I ported a sizable fraction of LibreOffice to build with Meson. It worked only on Linux as it used system dependencies. I rebased it to current trunk and tried to see if it could be built using nothing but Visual Studio by getting dependencies via the WrapDB. This repo contains the code, which now actually does build some code including dependencies like libxml, zlib and icu.

The code that is there is portable in the laptop sense. You only need to do a git checkout and start the build in a VS x64 dev tools prompt. It does cheat in some points, such as using pregenerated flex + bison sources, but it's not meant to be production quality, just an experiment.

[1] The project in question seems to have more preprocessor macro magic definitions than actual code, so it is possible there is some combination of defines that makes this work. If so, I did not manage to find it. This is typical in many old school C projects.