Sunday, April 25, 2021

The joys of creating Xcode project files

Meson's Xcode backend was originally written in 2014 or so and was never complete or even sufficiently good for daily use. The main reason was that at the time I got extremely frustrated with Xcode's file format and had to stop, because continuing with it would have led to burnout. It's just that unpleasant. In fact, when I was working on it originally I spontaneously said the following sentence out loud:

You know, I really wish this file was XML instead.

I did not think these words could be spoken with a straight face. But they were.

The Xcode project file is not really even a "build file format" in the sense that it would be a high level description of the build setup that a human could read, understand and modify. Instead it seems that Xcode has an internal enterprise-quality object model and the project file is just this data structure serialised out to disk in a sort-of-like-JSON-designed-in-1990 syntax. This format is called a plist or a property list. Apparently there is even an XML version of the format, but fortunately I've never seen one of those. Plists are at least plain text so you can read and write them, but they are sufficiently like a binary that you can't meaningfully diff them, which must make revision control conflict resolution a joy.

The semantics of Xcode project files are not documented. The only way to really work with them is to define simple projects either with Xcode itself or with CMake, read the generated project file and try to reverse-engineer its contents from those. If you get it wrong Xcode prints a useless error message. The best you can hope for is that it prints the line number where the error occurred. Often it does not.

The essentials

An Xcode project file is essentially a single object dictionary inside a wrapper dictionary. The keys are unique IDs and the values are dictionaries, meaning the whole thing is a string–variant wrapper dictionary containing a string–variant dictionary of string–variant dictionaries. There is no static typing; instead each object contains an isa key whose string value tells what kind of object it is. Everything in Xcode is defined by building these objects and then referring to other objects via their unique IDs (except when it isn't, more about this later).
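As a rough illustration of that shape, here is a minimal Python sketch. The helper names gen_uid and add_object are made up for this example, the IDs are just random 24-character hex strings of the kind that show up in the snippets in this post, and only the nesting described above is modelled, not everything a real project file contains.

```python
import uuid


def gen_uid() -> str:
    """Hypothetical helper: a 24-character uppercase hex string."""
    return uuid.uuid4().hex[:24].upper()


# The object dictionary: unique ID -> dictionary tagged with an "isa" string.
objects = {}


def add_object(isa: str, **fields) -> str:
    """Hypothetical helper: store a new object and return its unique ID."""
    uid = gen_uid()
    objects[uid] = {"isa": isa, **fields}
    return uid


# Objects refer to each other only through their IDs.
file_ref = add_object("PBXFileReference", path="main.c")
build_file = add_object("PBXBuildFile", fileRef=file_ref)

# The wrapper dictionary that holds the object dictionary.
project = {"objects": objects}
```

Nothing here checks that an ID actually points to an object of the right kind, which is more or less the point: the "type system" is a string in each dictionary.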

Since everything has a unique ID, a reasonable expectation would be that a) it would be unique per object and b) you would use that ID to refer to the target. Neither of these is true. For example let's define a single build target, which in Xcode/CMake parlance is a PBXNativeTarget. Since plist has a native array type, the target's sources could be listed in an array of unique IDs of the files. Instead the array contains a reference to a PBXSourcesBuildPhase object that has a few keys which are useless and always the same, plus an array of unique IDs. Those do not point to file objects, as you might expect, but to PBXBuildFile objects, whose contents look like this:

24AA497CCE8B491FB93D4C76 = {
  isa = PBXBuildFile;
  fileRef = 58CFC111B9B64310B946BCE7 /* /path/to/file */;
};

There is no other data in this object. Its only reason for existing, as far as I can tell, is to point to the actual PBXFileReference object which contains the filesystem path. Thus each file actually gets two unique ids which can't be used interchangeably. But why stop at two?
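To make the chain of indirections concrete, here is a small Python sketch that walks from a target to its source paths. The short IDs and file paths are invented for readability, and the key names buildPhases, files and path are my assumption of what generated project files contain (only fileRef appears in the snippet above).

```python
# A hand-written object dictionary with made-up short IDs.
objects = {
    "T1": {"isa": "PBXNativeTarget", "buildPhases": ["P1"]},
    "P1": {"isa": "PBXSourcesBuildPhase", "files": ["B1", "B2"]},
    "B1": {"isa": "PBXBuildFile", "fileRef": "F1"},
    "B2": {"isa": "PBXBuildFile", "fileRef": "F2"},
    "F1": {"isa": "PBXFileReference", "path": "src/main.c"},
    "F2": {"isa": "PBXFileReference", "path": "src/util.c"},
}


def source_paths(objects, target_id):
    """Follow target -> build phase -> build file -> file reference -> path."""
    paths = []
    for phase_id in objects[target_id]["buildPhases"]:
        phase = objects[phase_id]
        if phase["isa"] != "PBXSourcesBuildPhase":
            continue  # other kinds of build phases are ignored in this sketch
        for build_file_id in phase["files"]:
            file_ref_id = objects[build_file_id]["fileRef"]
            paths.append(objects[file_ref_id]["path"])
    return paths


print(source_paths(objects, "T1"))  # ['src/main.c', 'src/util.c']
```

That is three lookups to get from a target to a file name, with the PBXBuildFile hop contributing nothing but a second ID.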

In order to make the file appear in Xcode's GUI it needs more unique ids. One for the tree widget and another one it points to. Even this is not enough, though, because if the file is used in two different targets, you cannot reuse the same unique ids (you can in the build definition, but not in the GUI definition, just to make things more confusing). The end result is that if you have one source file that is used in two different build targets, then it gets at least four different unique id numbers.

Quoting

In some contexts Xcode does not use uids but filenames. In addition to build targets Xcode also provides PBXAggregateTargets, which are used to run custom build steps like code generation. The steps are defined in a PBXShellScriptBuildPhase object whose output array definition looks like this:

outputPaths = (
    /path/to/output/file,
);

Yep, those are good old filesystem paths. Even better, they are defined as an actual honest to $deity array rather than a space separated string. This is awesome! Surely it means that Xcode will properly quote all these file names when it invokes external commands.

Lol, no!

If your file names have special characters in them (like, say, all of Meson's test suite does by design) then you get to quote them manually. Simply adding double quotes is not enough, since they are swallowed by the plist parser. Instead you need to add additional quoted quote characters like this: "\"foo bar\"". Seems simple, but what if you need to pass a backslash through the system, like if you want to define some string as "foo\bar"? The common wisdom is "don't do that" but this is a luxury we don't have, because people will expect it to work, do it anyway and report bugs when it fails.

To cut a long and frustrating debugging session short, the solution is that you need to replace every backslash with eight backslashes and then it will work. This implies that the string is interpreted by a shell-like thing three times. I could decipher where two of them occur but the third one remains a mystery. Any other number of backslashes does not work and only leads to incomprehensible error messages.
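The arithmetic behind the number eight can be sketched in Python. This is a toy model, not Xcode's actual parsing: it assumes each shell-like pass simply collapses every pair of backslashes into one, so three passes turn eight backslashes into one. The function names are invented for this example.

```python
def quote_for_plist(text: str) -> str:
    """Wrap text in the quoted quote characters described above."""
    return '"\\"' + text + '\\""'


def escape_backslashes(text: str) -> str:
    """Replace every backslash with eight backslashes."""
    return text.replace("\\", "\\" * 8)


def unescape_once(text: str) -> str:
    """Toy model of one shell-like pass: each pair of backslashes becomes one."""
    return text.replace("\\\\", "\\")


escaped = escape_backslashes(r"foo\bar")

# Simulate the three rounds of interpretation: 8 -> 4 -> 2 -> 1 backslashes.
result = escaped
for _ in range(3):
    result = unescape_once(result)

print(quote_for_plist("foo bar"))
print(escaped)
print(result)
```

Under this model, four backslashes would survive only two passes and sixteen would leave two behind, which at least fits the observation that no other count works.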

Quadratic slowdowns for great enjoyment

Fast iterations are one of the main ingredients of an enjoyable development experience. Unfortunately Xcode does not provide that for this use case. It is actually agonizingly slow. The basic Meson test suite consists of around 240 sample projects. When using the Ninja backend it takes 10 minutes to configure, build and run the tests for all of them. The Xcode backend takes 24 minutes to do the same. Looking at the console when xcodebuild starts it first reads its input files, then prints "planning build", pauses for a while and then starts working. This freeze seems to take longer than Meson took configuring and generating the project file. Xcodebuild lags even for projects that have only one build target and one source file. It makes you wonder what it is doing and how it is even possible to spend almost a full second planning the build of one file. It also makes you wonder how a pure Python program written by random people in their spare time outperforms the flagship development application created by the richest corporation in the world.

Granted, this is a very uncommon case as very few people need to regenerate their project files all the time. Still, this slowness makes developing an Xcode backend a tedious job. Every time you add functionality to fix one test case, you have to rerun the entire test suite. This means that the further along you get, the slower development becomes. By the end I was spending more time on YouTube waiting for tests to finish than I did writing code.

Final wild speculation

Xcode has a version compatibility selection box for its project files.

[Screenshot: Xcode's version compatibility selection box]

This is extremely un-Apple-like. Backwards compatibility is not a thing Apple has ever really done. Typically you get support for the current version and, if you are really lucky, the previous one. Yet Xcode has support for versions going all the way back to 2008. In fact, this might be the product with the longest backwards compatibility story ever provided by Apple. Why is this?

We don't really know. However, being the maintainer of a build system means that sometimes people tell you things. Those things may be false or fabricated, of course, so the following is just speculation. I certainly have no proof for any of it. Anyhow, it seems that a lot of the fundamental code that is used to build macOS exists only in Xcode projects created ages ago. The authors of said projects have since left the company and nobody touches those projects for fear of breaking things. If true, this would indicate that the Xcode development team has to keep those old versions working. No matter what.

3 comments:

  1. “xml” plist is worse than “ascii” plist imo

  2. Apple's file formats and directory structures are amateur-hour. What do you expect from a company whose every application package is a directory that contains a sole subdirectory called "Contents"? Yes, Apple thinks you need a subdirectory called "Contents" to hold the contents of the parent.

    And let's not get started on FCP's brain-dead XML format (although, to be fair, it came from another Mac-centric house of doofuses, Macromedia).

  3. plutil lets you convert between xml, binary and other formats.
