Friday, July 21, 2023

Creating a PDF/X-3 document with CapyPDF

The original motivation for creating CapyPDF was understanding how fully color-managed, print-ready PDF files are actually put together. A reasonable measure of this would be the ability to generate fully spec-conforming PDF/X-3 files. Most of the building blocks were already there, so this was mostly a question of exposing all that in Python and adding the missing bits.

I have now done this and it seems to work. The script itself can be found in CapyPDF's Git repo and the generated PDF document can be downloaded using this link. It looks like this in Acrobat Pro.

It is not graphically the most flashy, but it requires quite a lot of functionality:

  • The document is A4 size but it is defined on a larger canvas. After printing it needs to be cut into the final shape. This is needed as the document has a bleed.
  • To achieve this, the PDF has a bleed box (the blue box) as well as a trim box (the green box). These are not painted in the document itself; Acrobat Pro just displays them for visualisation purposes.
  • All colors in graphical operations are specified in CMYK.
  • The image is a color managed CMYK TIFF. It has an embedded ICC profile that is different from the one used in final printing. This is due to laziness as I had this image lying around already. You don't want to do this in a real print job.
  • The heading has both a stroke and a fill.
  • The printer marks are just something I threw on the page at random.
Here is a screenshot of the document passing PDF/X-3 validation in Acrobat Pro.
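
To make the relationship between the canvas, the bleed box and the trim box concrete, here is a minimal sketch of the arithmetic in plain Python. It does not use CapyPDF at all; the helper name `page_boxes`, the 3 mm bleed and the 10 mm margin for printer marks are assumptions picked for illustration, not the values used in the actual script.

```python
# Hypothetical helper, not part of CapyPDF: page box arithmetic only.
MM_TO_PT = 72.0 / 25.4  # PDF default user space units (points) per millimetre

A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0


def page_boxes(bleed_mm=3.0, margin_mm=10.0):
    """Return (media, bleed, trim) boxes as (x0, y0, x1, y1) tuples in points.

    margin_mm is the extra canvas around the trimmed A4 page (room for
    printer marks). Both default values are assumptions for this sketch.
    """
    if margin_mm < bleed_mm:
        raise ValueError("margin must be at least as large as the bleed")
    trim_w = A4_WIDTH_MM * MM_TO_PT
    trim_h = A4_HEIGHT_MM * MM_TO_PT
    margin = margin_mm * MM_TO_PT
    bleed = bleed_mm * MM_TO_PT
    # The canvas (media box) is the A4 page plus a margin on every side.
    media = (0.0, 0.0, trim_w + 2 * margin, trim_h + 2 * margin)
    # The trim box is the final A4 page, centred on the canvas.
    trim = (margin, margin, margin + trim_w, margin + trim_h)
    # The bleed box extends the trim box outwards by the bleed amount.
    bleed_box = (trim[0] - bleed, trim[1] - bleed, trim[2] + bleed, trim[3] + bleed)
    return media, bleed_box, trim
```

The three boxes nest: the trim box sits inside the bleed box, which sits inside the canvas. Anything that should reach the paper edge has to be painted all the way out to the bleed box.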

Sunday, July 16, 2023

PDF transparency groups and composition

The PDF specification has the following image as an example of how to do transparent graphics composition.

This seems simple but actually requires quite a lot of functionality:

  • Specifying CMYK gradients
  • Setting the blend mode for paint operations
  • Specifying transparency group xobjects
  • Specifying layer composition parameters
I had to implement a bunch of new functionality to get this working, but here is the same image reproduced with CapyPDF. (output PDF here)

The only difference here is that out of laziness I used a simple two color linear gradient rather than a rainbow one.
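
For readers unfamiliar with blend modes, here is a rough sketch of what they compute, using two of the separable blend modes defined in the PDF specification (Multiply and Screen). This is plain Python for a single color component and has nothing to do with CapyPDF's API. The function names are made up for this example, and the compositing step is simplified by assuming a fully opaque backdrop.

```python
# Illustration only: per-component blend functions from the PDF spec.
# Values are additive color components in the range 0.0 to 1.0.

def multiply(backdrop, source):
    """Multiply blend mode: the result is never lighter than either input."""
    return backdrop * source


def screen(backdrop, source):
    """Screen blend mode: the result is never darker than either input."""
    return backdrop + source - backdrop * source


def composite(backdrop, source, alpha, blend):
    """Composite one component over a fully opaque backdrop (an assumption).

    With an opaque backdrop the general formula reduces to a linear mix
    between the backdrop and the blended color, weighted by source alpha.
    """
    return (1.0 - alpha) * backdrop + alpha * blend(backdrop, source)
```

For example, `composite(0.5, 0.5, 1.0, multiply)` darkens the component to 0.25, while the same call with `screen` lightens it to 0.75. As far as I understand the specification, components of subtractive spaces such as CMYK are complemented before and after blending, which this sketch leaves out.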


This is again one of those features that only Acrobat Reader can handle. I tried Okular, Ghostscript, Firefox, Chromium and Apple Preview and they all rendered the file incorrectly. There was no consistency; each one was broken in a different way.

Wednesday, July 5, 2023

CapyPDF 0.4 release and presenter tool

I have just released version 0.4 of CapyPDF. You can get it either via GitHub or PyPI. The target of this release was to be able to create a pure Python script that can be used to generate PDF slides to be used in presentations. It does not read any input and always produces the same hardcoded output, but since the point of this was to create a working backend, that does not really matter.

  • The source code for the script is here. It is roughly 200 lines long.
  • The PDF it creates can be accessed here.
  • A screenshot showing all pages looks like the following.

What the screenshot does not show, however, is that the file uses PDF transition effects. They are used both for between-page transitions and for within-page transitions, specifically for the bullet points on page 2. This is, as far as I can tell, the first time within-page navigation functionality has been implemented in an open source package, possibly even the first time ever outside of Adobe (dunno if MS Office exports transitions, I don't have it so can't test it). As you can probably tell, this took a fair bit of black box debugging because all PDF tools and validators simply ignore transition dictionaries. For example, transcoding PDFs with Ghostscript scrubs all transitions from the output.
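
For reference, a transition dictionary is a small object in the PDF file that a page dictionary points to through its /Trans entry. The sketch below serializes one in raw PDF syntax using plain Python. It is not CapyPDF code: the function name `transition_dict` is invented for this example, and the keys and style names are as I understand them from the PDF specification.

```python
# Illustration only: what a transition dictionary looks like in PDF syntax.

# Transition style names listed in the PDF specification.
TRANSITION_STYLES = {
    "Split", "Blinds", "Box", "Wipe", "Dissolve", "Glitter",
    "R", "Fly", "Push", "Cover", "Uncover", "Fade",
}


def transition_dict(style="Dissolve", duration=1.0):
    """Serialize a transition dictionary with a style (/S) and duration (/D).

    The duration is the length of the effect in seconds.
    """
    if style not in TRANSITION_STYLES:
        raise ValueError(f"unknown transition style: {style}")
    if duration < 0:
        raise ValueError("duration must not be negative")
    return f"<< /Type /Trans /S /{style} /D {duration:g} >>"
```

Calling `transition_dict("Wipe", 0.5)` produces `<< /Type /Trans /S /Wipe /D 0.5 >>`. Within-page transitions need more machinery than this (navigation nodes that chain the steps together), which is not covered here.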

The only way to see the transitions correctly is to open the PDF in Acrobat Reader and then enable presenter mode. PDF viewer developers who want to add support for presentation effects might want to use that as an example document. I don't guarantee that it's correct, though, only that it makes Acrobat Reader display transition effects.