Sunday, January 21, 2024

CapyPDF 0.8.0 released

Version 0.8.0 of the CapyPDF library has been released. The main new feature is support for form XObjects and printer's mark annotations.

Printer's marks are things like color bars, crop marks and registration marks (also known as "bullseye marks") that high-end printers need for quality control. Here is a very simple document.

An experienced print operator would use the black lines to set up paper trimming. Traditionally these marks were drawn in the page's graphics stream. This is problematic because nowadays printers prefer to use their own custom marks instead of ones created by the document author. PDF solves this issue by moving these graphics operations to separate draw contexts (specifically "form XObjects", which are not actually forms, though they are XObjects) that can then be "glued" on top of the page as annotations. These annotations are shown in PDF viewer applications but they are not printed. I have no experience with high-end RIP software, but presumably the operator can choose to either print the document's annotations or replace them with their custom QA marks.

As usual, to find out what features CapyPDF has and how to use them, look up either the public C header or the Python code used in unit tests.

Tuesday, January 2, 2024

C++ module tooling emulator playground

Developing tooling for C++ modules is challenging, to say the least. Module implementation maturity varies between compilers, they all work slightly (well, massively) differently, there are bugs, and you also need a code base that uses modules. For these and other reasons there are maybe five people in the entire world who even think about this issue. This is bad, because modules are supposed to be a foundational technology of the future. The problem would benefit from more eyes.

Something ought to be done about this. So I did.

I created a fake "module only" C++ compiler, a fake linker, a fake module scanner and a fake project generator. The compiler does not produce any code. It only does three things:

  1. Reads export statements from source files and writes corresponding fake module files.
  2. Reads import statements from source files.
  3. Validates that all module files a source file imports exist on disk at the time of invocation.

This did not take much effort. In fact, the whole project is approximately 300 lines of Python and can be obtained from this repository.
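To give a feel for how little is needed, here is a minimal sketch of such a fake compiler. It is not the code from the repository: the regular expressions, the `.mod` file naming and the function names are all assumptions made for illustration, and only the simplest `export module foo;` and `import foo;` forms are recognized.

```python
import re
from pathlib import Path

# Assumption: only the simplest statement forms are handled, one per line.
EXPORT_RE = re.compile(r"^\s*export\s+module\s+([\w.:]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*(?:export\s+)?import\s+([\w.:]+)\s*;", re.MULTILINE)


def module_file(module_dir: Path, name: str) -> Path:
    # Hypothetical naming scheme: one ".mod" file per module name.
    return module_dir / (name.replace(":", "-") + ".mod")


def fake_compile(source: Path, module_dir: Path) -> list[str]:
    """Pretend to compile one source file; return the modules it exports."""
    text = source.read_text()

    # Read import statements and check that every imported module
    # file already exists on disk at the time of invocation.
    imports = IMPORT_RE.findall(text)
    missing = [m for m in imports if not module_file(module_dir, m).exists()]
    if missing:
        raise FileNotFoundError(f"{source}: missing module files for {missing}")

    # Read export statements and write a fake module file for each.
    exports = EXPORT_RE.findall(text)
    module_dir.mkdir(parents=True, exist_ok=True)
    for name in exports:
        module_file(module_dir, name).write_text(f"fake module file for {name}\n")
    return exports
```

Compiling a file before the modules it imports have been built fails, which is exactly the ordering problem a build system has to solve.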

With this it is possible for anyone interested to try to come up with a module building scheme that does not require generating O(N²) command line arguments on the fly. The current scanner implementation is taken almost directly from Meson and it works one build target at a time as opposed to a single source file at a time. With this approach the overhead of scanning is one process invocation per build target. A per-source approach would take two process invocations per source file: one for invoking the compiler to generate the standard JSON-format dependency file and another for converting that to the dyndep format that Ninja understands.
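As a rough illustration of per-target scanning, the sketch below takes all sources of one build target and produces a single Ninja dyndep file for them. It is not the Meson scanner: the function name, the `.o` and `.mod` naming and the `modules` directory are assumptions, and the sources are passed in as a name-to-text mapping to keep the example self-contained.

```python
import re

# Assumption: only the simplest statement forms are handled, one per line.
EXPORT_RE = re.compile(r"^\s*export\s+module\s+([\w.:]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*(?:export\s+)?import\s+([\w.:]+)\s*;", re.MULTILINE)


def mod_path(module_dir: str, name: str) -> str:
    # Hypothetical naming scheme; ":" is replaced because it would
    # need escaping in a Ninja file.
    return f"{module_dir}/{name.replace(':', '-')}.mod"


def dyndep_for_target(sources: dict[str, str], module_dir: str = "modules") -> str:
    """Build Ninja dyndep text for every source of one build target.

    `sources` maps a source file name to its text.
    """
    lines = ["ninja_dyndep_version = 1"]
    for src, text in sources.items():
        provides = [mod_path(module_dir, m) for m in EXPORT_RE.findall(text)]
        requires = [mod_path(module_dir, m) for m in IMPORT_RE.findall(text)]
        line = f"build {src}.o"  # hypothetical object file name
        if provides:
            line += " | " + " ".join(provides)  # implicit outputs
        line += ": dyndep"
        if requires:
            line += " | " + " ".join(requires)  # implicit inputs
        lines.append(line)
    return "\n".join(lines) + "\n"
```

One call handles the whole target, so the cost is a single process no matter how many source files the target contains.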

The setup assumes that the compiler supports a "write module files to this directory" command line argument. Such an argument is required in order to avoid generating compiler arguments dynamically.
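The difference between the two styles can be sketched as follows. The compiler name `fakecc` and both flags are made up for this example and do not correspond to any real compiler. With a module directory the command line depends only on the source file, so it can be written out when the build files are generated. With per-module mappings the command line depends on what the source imports, which is only known after scanning.

```python
def static_command(source: str, module_dir: str) -> list[str]:
    # "--module-dir" is a hypothetical flag: the compiler reads and
    # writes all module files in this one directory.
    return ["fakecc", "--module-dir", module_dir, "-c", source]


def dynamic_command(source: str, imports: list[str], module_dir: str) -> list[str]:
    # "--module-file" is a hypothetical flag: one mapping argument
    # is needed for every module the source imports.
    cmd = ["fakecc", "-c", source]
    for name in imports:
        cmd.append(f"--module-file={name}={module_dir}/{name}.mod")
    return cmd


def total_mapping_args(n: int) -> int:
    # Worst case: source i imports the modules of all i earlier sources.
    return sum(
        len(dynamic_command(f"s{i}.cpp", [f"m{j}" for j in range(i)], "modules")) - 3
        for i in range(n)
    )
```

In the worst case above the number of mapping arguments is N(N-1)/2 for N sources, which is where the O(N²) comes from.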

Or maybe it isn't and there is a way to make it work in some other way. At least now the tooling is available so anyone reading this can try to solve this problem.