Wednesday, March 20, 2024

Color management and API design

API design is hard. This is not a smashingly new revelation, but let's look at a sample issue I have been working on for CapyPDF. The main problem we are trying to solve is creating "print quality" PDFs. That is, ones that can be used to print things like books, magazines, posters and other high quality materials. A core component of this is color management, specifically the handling of ICC profiles for raster images.

There are at least four slightly conflicting design goals.

Fine-grained control

An advanced user knows and understands the PDF spec and knows exactly how they want the output to come out. The library should provide for this and not do, for example, unexpected color conversions behind the user's back.

Easy to use for basic cases

OTOH if your needs are simple, such as just loading images from files on disk and converting them to the output colorspace (almost certainly CMYK), that should be possible with minimal fuss.


Simple and understandable

The API should be simple and readable. Even more importantly it should be understandable in the sense that when the user calls certain functions, they should be able to "know" what is going to happen and the behaviour should be the same over multiple invocations.


Hard to misuse

The API should prevent you from doing invalid things, such as using an uncalibrated RGB image in a CMYK document.
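As a rough illustration of that last goal, a check of this kind could look something like the sketch below. This is not CapyPDF's actual API. The names `ColorSpace`, `RasterImage` and `validate_image_for_output` are made up for this example, and the rule set is a deliberately minimal assumption rather than anything taken from the PDF spec.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ColorSpace(Enum):
    GRAY = "gray"
    RGB = "rgb"
    CMYK = "cmyk"


@dataclass
class RasterImage:
    # Hypothetical image record, not a CapyPDF type.
    colorspace: ColorSpace
    icc_profile: Optional[bytes] = None  # None means "uncalibrated"


def validate_image_for_output(image: RasterImage, output: ColorSpace) -> None:
    """Raise ValueError if the image should not be embedded as-is.

    Assumed rules for this sketch only:
    - an image already in the output color space is accepted,
    - gray images are accepted and left alone,
    - anything else needs an ICC profile.
    """
    if image.colorspace == output:
        return
    if image.colorspace == ColorSpace.GRAY:
        return
    if image.icc_profile is None:
        raise ValueError(
            f"uncalibrated {image.colorspace.value} image "
            f"in a {output.value} document"
        )
```

With these assumed rules an uncalibrated RGB image in a CMYK document raises immediately, while the same image with a profile attached passes. The point is only that the error shows up at the API call rather than at the print shop.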

A wild real world appears!

Thus far things seem simple, but they get awfully complex. PDF is used in many different ways and all of those have their own requirements. For high quality printing specifically there is a specification called PDF/X that many printing shops use. Some might not even accept material that is not in this format. One of the requirements of PDF/X is that all raster images must be color managed. It would seem that a simple approach would be to convert all images to the output color space on load. And this is where things break down.

For you see, PDF does not have a single color managed pipeline, logically it has two. Grayscale images are "different" from full color images. A PDF generator must never convert grayscale raster images (or colors in general, but we'll focus on images now) to "color" images. Not even if the end result were "mathematically equivalent". In high quality printing that is not enough. Suppose you have a pixel whose gray value is 10. Converting that to CMYK can lead to (at least) two different values, (10, 10, 10, 0) and (0, 0, 0, 10). You'd think that the latter would always happen, but in testing LittleCMS produced the former (it also has custom gray-preserving transforms, but I did not try those). Even though these values are mathematically equivalent they (may) produce different output when printed. The latter is pure gray, while the former can look muddled, and if there are any registration problems the constituent colors might be visible. The RIP cannot know whether the "grayscale looking color" was intentional or not. Under some circumstances it might be exactly what the creator intended, thus it can't really be post processed away. The only correct way is to keep the image in the gray color space so the RIP has maximal information to do its thing.
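To make the "mathematically equivalent" part concrete, here is a small sketch. It assumes the gray value is an ink coverage in percent and uses the naive textbook CMYK to RGB formula with no ICC profile involved, which is not what LittleCMS or a RIP actually does. All function names are made up for this example.

```python
def gray_to_k_only(ink_percent):
    """Put all of the gray on the black plate."""
    return (0, 0, 0, ink_percent)


def gray_to_composite(ink_percent):
    """Build the same gray out of equal parts C, M and Y."""
    return (ink_percent, ink_percent, ink_percent, 0)


def naive_cmyk_to_rgb(c, m, y, k):
    """Naive device conversion, channel = (1 - ink) * (1 - black).

    This ignores ICC profiles entirely and is only here to show
    that the two CMYK values look identical "on paper".
    """
    def remaining(ink):
        return 1.0 - ink / 100.0

    return tuple(remaining(ink) * remaining(k) for ink in (c, m, y))


def plates_used(cmyk):
    """Number of printing plates that receive any ink at all."""
    return sum(1 for ink in cmyk if ink > 0)


k_only = gray_to_k_only(10)
composite = gray_to_composite(10)

# Same result under the naive formula...
print(naive_cmyk_to_rgb(*k_only) == naive_cmyk_to_rgb(*composite))
# ...but a different number of plates, and thus different print behaviour.
print(plates_used(k_only), plates_used(composite))
```

Under this simple model both tuples map to the same color, yet one touches a single plate and the other three. That difference is invisible to the math but very visible to a misregistered press.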

But this causes its own problem, because most grayscale images are not color managed. What should you do with those? Requiring color profiles would not be a nice UI, because then most images would break. For 1-bit grayscale images a color profile would not even make any sense. Not to mention that the grayscale image might not be printed at all but instead be used as an image mask for graphics composition operations (basically it would be used as the alpha channel). In that case you definitely want to use raw pixel values to obtain linear mixing. Doing gamma correction on your transparency channel could lead to some funky effects.
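A tiny sketch of why the mask case wants raw values. The blend below is plain linear alpha mixing on 8-bit samples, and the gamma of 2.2 is only an assumed stand-in for "some tone curve a color management step might apply". None of this is CapyPDF code.

```python
def blend(foreground, background, mask_value):
    """Mix two 8-bit samples, using an 8-bit mask value as alpha."""
    alpha = mask_value / 255.0
    return round(foreground * alpha + background * (1.0 - alpha))


def apply_gamma(value, gamma=2.2):
    """Run an 8-bit value through a power curve (assumed gamma)."""
    return round(255 * (value / 255.0) ** gamma)


white, black = 255, 0
mask = 128  # "about half transparent"

raw_result = blend(white, black, mask)
corrected_result = blend(white, black, apply_gamma(mask))

print(raw_result, corrected_result)
```

With the raw mask the mix lands in the middle as expected. Once the mask has gone through the curve, the same "half transparent" pixel lets through noticeably less of the foreground, which is the kind of funky effect you do not want from an alpha channel.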

Things get more complicated once you realize that there are 7 variations of PDF/X that permit and prohibit different things. I tried to work out the workflow by writing a full table of color modes and output spaces and what should happen with every combination. Halfway through I got a headache and had to stop.

Current status

The original plan was to make things happen automatically and try to validate the semantics of the output document as much as possible. That got simplified a whole lot. Because the state space is just so massive it might turn out that eventually CapyPDF only provides you the tools to do color conversions yourself and then writes out the result without trying to do anything fancy to it. It would then be the responsibility of the user to validate all semantic requirements.

All of this is to say that if you are currently using CapyPDF, just be aware that in the next version all APIs dealing with raster images will change completely.

Monday, March 4, 2024

CapyPDF 0.9.0 released

I have just released CapyPDF 0.9.0. It can be obtained either via GitHub or PyPI.

There is no single big feature in this release. The most notable is probably the ability to create structured (or "tagged") PDF files. The code supports using both the built-in tags as well as defining your own. Feel free to try it, just know that the API is guaranteed to change.

As a visual example, here is the full Python source code for one of the unit tests.

When run it creates a tagged PDF file. Adobe Acrobat reports that the document has the following logical structure.

As you can (hopefully) tell, structure and content are the same in both of them.