Sunday, January 4, 2026

Converting Chapterizer from Cairo + Pango to CapyPDF

Chapterizer (not a great name, I know) is a tool I wrote to generate books. It originally used Cairo and Pango to generate PDF files. That works and was fairly easy to get started with, but it has its own set of downsides:

  • Cairo always produces RGB PDFs, which are not accepted by printing houses
  • Cairo does not handle advanced PDF features like trim boxes
  • Pango aligns text at the top of each line, but for high quality text output you have to do baseline alignment
  • Pango is designed to "always print something", which is to say it does transparent font substitution for example when the chosen font does not have some glyph
I have also created CapyPDF to generate "proper" PDF. Over the holidays I finalized porting Chapterizer to use CapyPDF. The pipeline is now surprisingly simple. First you read in the source text, then it is shaped with Harfbuzz and then written to a PDF file with CapyPDF.

It was grunt work. Nothing about it was particularly difficult, just dealing with the same old issues like the fact that in PDF the page's origin is at the bottom left, whereas in Cairo it is at the top left.
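
The origin difference boils down to flipping the y coordinate against the page height. Here is a minimal sketch, assuming coordinates in PDF points and an A4 page. The function name is made up for illustration and is not part of Cairo, CapyPDF or Chapterizer:

```python
A4_HEIGHT_PT = 842.0  # approximate A4 page height in PDF points (assumption)


def top_left_to_bottom_left(y_from_top: float, page_height: float,
                            box_height: float = 0.0) -> float:
    """Convert a y coordinate measured down from the top edge (Cairo style)
    into one measured up from the bottom edge (PDF style).

    box_height is the height of the object being placed. When given, the
    result is the y of the object's bottom edge rather than its top edge.
    """
    return page_height - y_from_top - box_height


# A point 100 pt below the top edge of the page.
print(top_left_to_bottom_left(100.0, A4_HEIGHT_PT))
# A 50 pt tall image whose top edge is 100 pt below the top of the page.
print(top_left_to_bottom_left(100.0, A4_HEIGHT_PT, box_height=50.0))
```

The x coordinate needs no conversion, as both systems measure it from the left edge.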

Anyhow, now that it is done we can actually test the performance of CapyPDF with a somewhat realistic setup. Currently creating a 40 page document takes 0.4 seconds which comes down to 0.01 seconds per page. Which is fast enough for me.

Friday, January 2, 2026

New year, new Pystd epoch, or evolving an API without breaking it

One of the core design points of Pystd has been that it maintains perfect API and ABI stability while also making it possible to improve the code in arbitrary ways. To see how that can be achieved, let's look at what creating a new "year epoch" looks like. It's quite simple. First you run this script

Then you add the new files to Meson build targets (I was too lazy to implement that in the script). Done. For extra points there is also a new test that mixes types of pystd2025 and pystd2026 just to verify that things work.

As everything is inside a yearly namespace (and macros have the corresponding prefix) the symbols do not clash with each other.
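
To give a rough idea of why the yearly names do not clash, here is a hypothetical sketch of the renaming part of an epoch bump. It is not the actual script. It assumes the namespace is spelled like pystd2025 and the macro prefix like PYSTD2025, which may not match the real code:

```python
import re


def bump_epoch(source: str, old_year: int, new_year: int) -> str:
    """Rewrite the yearly names in the text of one source file.

    Assumes names look like 'pystd2025' (namespace) and 'PYSTD2025'
    (macro prefix). The real naming scheme may differ.
    """
    pattern = re.compile(rf"(pystd){old_year}", re.IGNORECASE)
    # Keep whatever casing the original prefix used and only swap the year.
    return pattern.sub(lambda m: f"{m.group(1)}{new_year}", source)


# Made-up input, only here to show the effect of the renaming.
old_source = """#define PYSTD2025_ASSERT(x) do_assert(x)
namespace pystd2025 { class Vector; }
"""
print(bump_epoch(old_source, 2025, 2026))
```

After the rename every symbol and macro carries the new year, so code from both epochs can be compiled into the same program.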

At this point in time pystd2025 is frozen so old apps (of which there are, to be honest, approximately zero) keep working forever. It won't get any new features, only bug fixes. Pystd2026, on the other hand, is free to make any changes it pleases as it has zero backwards compatibility guarantees.

Isn't code duplication terribly slow and inefficient?

It can be. Rather than handwaving about it, let's measure. I used my desktop computer which has an AMD Ryzen 7 3700X.

Compiling Pystd from scratch and running the test suite (with code for both 2025 and 2026) in both debug and optimized modes takes 3 seconds in total (1s for debug, 2s for optimized). This amounts to 2*13 compiler invocations, 2 static linker invocations and 2*5 dynamic linker invocations, 38 in total.

Compiling a helloworld with standard C++ using -O2 -g also takes 3 seconds. This amounts to a single compiler invocation.

Monday, December 22, 2025

An uncomfortable but necessary discussion about the Debian bug tracker

Note: this post represents my personal opinions as a Debian maintainer of a single package (Meson). It is not my intention to throw anyone involved in the service under a bus, but some things about it are not good and need to be spoken aloud (in my opinion anyway, other people may disagree and that is fine).

There was a post called Configuring a mail transfert [sic] agent to interact with the Debian bug tracker on Planet Debian. It contained the following statement:

using an email client to create or modify bug reports is not a bad idea per se

Indeed it is not. However, using an email client as the only way of modifying bugs (which is how the Debian bug tracker works) is not only a bad idea, it is a terrible idea. To me managing bugs is so awful that it is actively pushing me away from contributing to Debian. The bug statuses on Meson are not kept up to date because I prefer that to having to deal with the bug tracker. I suspect I am not alone in this. In any case it is a major hurdle for new developers and might even cause some people to drop out entirely [1].

Why is it like this?

The Debian bug tracker was originally implemented in 1993 or thereabouts. Pretty much everything IT related was different back then. Manipulating things via email actually made sense at the time. Sadly, the world changed completely but volunteers working on the bug tracker did not have the resources to update it [2]. The end result is a classical legacy system: one that works and does the thing it needs to do but which no new developer wants to touch.

Notable updates to the system would require major resources, which the project does not have.

FWICT there have been attempts to migrate the tracker to e.g. Bugzilla or Gitlab, but none of those has come even close to succeeding.

Why is the UX bad?

There is no web UI for manipulating bug data. Instead you write an email in a custom format, send that to a specific email address and wait for things to happen.

The main problem here is not the format as such, it is the fact that the user has to do the work of the computer:

  • Every time you need to manipulate bugs, you need to open the documentation page to remind yourself what the actual syntax is. A program would get it right automatically every time.
  • Get it wrong? Sucks to be you. A program would get it right automatically every time.
  • Send it to the wrong email address? Sucks to be you. And the person whose bug you just altered. A computer program would get it right automatically every time.
  • And so on.

I suspect most Debian developers who spend a lot of time on this have written their own custom scripts for their use cases. But having hundreds of ersatz tools for common tasks seems suboptimal.
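
For illustration, one such ersatz tool might look like the sketch below. It only builds the message and does not send anything. The severity and retitle commands, the terminating thanks line and the control address are part of the tracker's email interface as far as I know. The helper names, the sender address and the bug number 123456 are made up:

```python
from email.message import EmailMessage

CONTROL_ADDRESS = "control@bugs.debian.org"
SEVERITIES = {"critical", "grave", "serious", "important",
              "normal", "minor", "wishlist"}


def severity_command(bug: int, severity: str) -> str:
    """Build a 'severity' command, rejecting unknown severity names."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return f"severity {bug} {severity}"


def retitle_command(bug: int, title: str) -> str:
    """Build a 'retitle' command. Each command must fit on one line."""
    if "\n" in title:
        raise ValueError("title must be a single line")
    return f"retitle {bug} {title}"


def control_message(sender: str, commands: list[str]) -> EmailMessage:
    """Wrap commands in an email addressed to the control server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = CONTROL_ADDRESS
    msg["Subject"] = "bug control commands"
    # 'thanks' tells the server to stop parsing the rest of the mail.
    msg.set_content("\n".join(commands + ["thanks"]) + "\n")
    return msg


# 123456 is a placeholder bug number, not a real report.
msg = control_message(
    "maintainer@example.org",
    [severity_command(123456, "minor"),
     retitle_command(123456, "Crash when the build directory has spaces")],
)
print(msg)
```

The point is that the validation lives in code, so a typo in a severity name fails on your machine instead of producing an error email (or a wrongly edited bug) some minutes later.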

As an extra cherry on the cake, the bug tracker will send you an email every! single! time! you edit or comment on any bug. Not only does the service waste your time by forcing you to write syntactically correct batch processing commands by hand, it also wastes it by forcing you to keep deleting spam [3].

How does security on the system work?

It doesn't. The email interface is 100% open. Anyone can edit any bug in any way just by sending a suitably crafted email to the control address [4]. If a 4chan script kiddie wanted to screw up the entire Debian bug repository, they could do so fairly easily.

There is an actual term for this approach: security through obscurity. The fact that the main bug tracker of the OS that runs the world does not have strict authentication in place does not fill me with warm fuzzies.

What would be a way forward?

A one-shot conversion to a different bug tracker is out of the question. Instead the situation could be improved incrementally, for example:

  1. Create a new web service that parses the existing bug data and displays it in a "rich" format.
  2. Update the UI so that registered users can change the state of the bug (close, duplicate, etc).
  3. Make the UI send a suitably formatted control email to the backend.
  4. Bless the new web service as an official way of editing bugs (hosted under debian.org and all that)
  5. Edit the backend service so it only accepts emails from the web UI and registered Debian developers and maintainers
  6. Change the backend to something else or improve it for the new UI [optional].
Steps 1 to 3 can be done by a single person or a small team. Since you (probably) don't have access to the backend service, you need to parse the bug state from the current tracker's status page (example here). That is a bit gnarly, but should be doable. If someone is looking for a personal project for the holiday season, this is something to consider.

Steps 4 to 6 would take months or years of full time work. Given that this approach was first suggested almost exactly 25 years ago, it is unlikely that resources for it would materialize out of thin air any time soon.

[1] My speculation.

[2] Also my speculation.

[3] You could use email filtering rules but that is again extra work for every user. A better option is not spamming (one thank-you email for new contributors is fine, more than that is not).

[4] Last time I checked at least. I have personally manipulated all sorts of bugs via email even before I was a Debian maintainer. No registration. No checks. Not even GPG signing.

Tuesday, December 16, 2025

An Appeal from the United Federation of Dictators, Despots, Evil Emperors and Tyrants

Truly, we are living in a new Golden Age for all those sharing our passion in subjugating all of the human race under a single iron fist.

And to think that a mere few decades ago we thought that our way of life was heading to the dung heap of humanity. Education, international cooperation and other such scourges of democracy and civilization seemed to have taken an inescapable stranglehold on our core values and, by extension, our future. But then, our savior appeared from nowhere in the form of His Holiness Steve Jobs. The vision and tireless unpaid overworking of his minions gave birth to the Squircle of Self-Subjugation and the world has never been the same since. May the glory of your achievements, oh Steve the Seer, shine forevermore throughout the four rounded corners of our planet!

Now, to be sure, many among our ranks were very sceptical of the product at first glance. Giving people the power to access all the information in the world wherever they may be seemed like the final blow to our cause. Yet, it became our greatest triumph. It did not take long to see that people who had formerly acted on injustices they perceived in the world switched to clicking on the thumbs up button on a Facebook post advocating some change and then promptly forgetting about it. Even this would have been master level brainwashing, but nowadays the unwashed masses do not feel the need to do even that. Instead they are watching an unending stream of five second nonsense videos that leave them incapable of thinking rationally. This is just like ye good olde Roman times of Panem et circenses, except that bread does not need to be given out. People will buy it themselves at outrageous prices and then claim that giving out free bread to the starving supports terrorism. These spontaneous acts of hatred are what makes a tyrant's heart run aflutter with joy.

But that was only the beginning. Just as Steve gave us legs so we could run, Sam Altman gave us wings to take flight. His unsurpassed vision can not be praised highly enough. While the rest of us were trying to destroy democracy using the tried tools of our trade: war, forgery, drug trafficking, genocide, abolishing free speech (up to and including parody and satire), he went further than any of us could even dream of. He sought to destroy reality itself. Thus far he has been succeeding on every front and we salute him, for reality is highly problematic for us in the despot business. Reality affects people in weird ways. It warps them. It makes them want to do things they think are correct and to fix things they deem unjust instead of obeying our orders without questioning them. Because of this, reality must go.

And indeed it has. An ever growing number of people are now incapable of making any decisions on their own. Instead they ask a machine, a higher authority, what to do and follow the given advice to the letter. Even if it conflicts with the reality they see with their own eyes. The thing we had already deemed impossible is now routine. The knock-on effects of this generative AI technology should not be dismissed either. There are currently tens of millions of people working as our allies across all layers of western society to bring it crashing down. Rather than doing work, they just use generative AI to pretend to work. This gives them two unbeatable advantages. A) They don't have to do anything, but instead can spend more time getting their ego stroked in social bubble media. B) They keep getting paid as if they were still working. True, these kinds of people have always existed, but, with a single swift stroke of Sam's silk-clad sophisticatorix, the do-not-give-a-shitters are now the majority. Those who care will either collapse under load or quit out of frustration. The collapse of democratic institutions is guaranteed. The only negative side is a lack of accomplishment: it was all too easy.

Unfortunately institutions can be rebuilt. Fortunately that has already been prevented. The use of AI to cheat has spread through the entire schooling system like wildfire. Teaching the next generation to avoid doing any work, taking responsibility or questioning authority is the most important thing we, the destroyers of free thinking, individuality and people's sense of self-worth, can do. For the first time since the invention of basic education, this issue is now in good hands. There are even people who seriously, and loudly, hold the opinion that school should be fundamentally altered. Instead of propagating knowledge or thinking, it should only train children to use ChatGPT. To this we say an unequivocal yes! The future shall have no revolutions, only blind obedience.

Alas, all is not well in this, our new utopia. In recent times many of us have noticed a marked decrease in our servants' work ethics. From incorrectly cooked eggs at breakfast to poorly cleaned offices, the lack of quality and finesse is palpable. What is even the point of manufacturing a massive statue of yourself on the capital's main square if the gilding is flaking off it during the inauguration ceremony? Or to build the world's biggest eight lane highway bridge only to have it collapse before a single car has crossed it as nobody actually cared to do the load bearing computations properly? The only way to get people to do anything nowadays is to threaten them with a firing squad. This toxic work environment brings about severe psychological stress. Contrary to common belief, tyrants are human too. Having to threaten ten different people with execution before lunch is exhausting and may even cause mental health issues for the caring Great Leader. This is, of course, as unacceptable as it is inhumane.

Thus we come to the core of this declaration. While we feel that social media, AI, cell phones and other technological tools of totalitarianism are useful and mandatory for the modern dictator, their use has spread too far. They destroy too many of the things we hold dear. There are things too important to be left to the whims of plebs who neither care nor understand what they are supposed to be doing. Thus we will be specifying a list of occupations and tasks where the use of these tools shall be prohibited and we encourage all totalitarian countries to do the same. As dictatorships are not, by definition, under free market pressure to maximize profit regardless of consequences, we have the luxury of being able to invest sensibly. That is how we win. Of course we would prefer not to have to resort to these measures, but given the current geopolitical situation, it is necessary.

Finally, to not end on a downer, we would like to extend our thanks to our business partners. As you know, bending the entire humankind to the ground for an iron boot stomp on their face forever is not something you can do on your own. It is a team effort. We extend our most heartfelt gratitude to all our allies. Thank you Tim Cook, Mark Zuckerberg, Peter Thiel, Sundar Pichai, Jeff Bezos, Satya Nadella and all the rest. You are our Most Valuable Players. Without you, this would not have been possible. None of you were asked to do this. You stood up voluntarily and chose to take control of the ignorant masses, as any good despot should. Generations upon generations of children shall be marched to pay tribute to your portraits every morning of every month of every year. Which probably is, if we may be so bold as to speculate, the thing you wanted all along.

Yours sincerely in enslavement,

  • Idi Amin
  • Nicolae Ceaușescu
  • Kim Il-Sung and family
  • Augusto Pinochet
  • Pol Pot
  • Ranavalona I
  • Josef Stalin
  • Mao Zedong

Monday, November 24, 2025

3D models in PDF documents

PDF can do a lot of things. One of them is embedding 3D models in the file and displaying them. The user can orient them freely in 3D space and even choose how they should be rendered (wireframe, solid, etc). The main use case for this is engineering applications.

Supporting 3D annotations is, as expected, unexpectedly difficult because:

  1. No open source PDF viewer seems to support 3D models.
  2. Even though the format specification is available, no open source software seems to support generating files in this format (by which I mean Blender does not do it by default). [1]
But, again, given sufficient effort and submitting data to not-at-all-sketchy-looking 3D model conversion web sites, you can get 3D annotations to work. Almost.

As you can probably tell, the picture above is not a screenshot. I had to take it with a cell phone camera, because while Acrobat Reader can open the file and display the result, it hard crashes before you can open the Windows screenshot tool. 

[1] Update: apparently KiCad nightly can export U3D files that can be used in PDFs.

Thursday, November 13, 2025

Creating valid PDF/A-4 with CapyPDF

PDF/A is a specific version of PDF designed for long term archival of electronic data. The idea being that PDF/A files are both self contained and fully specified, so they can be opened in the future without any loss of fidelity.

Implementing PDF/A export is complicated by the fact that the specification is an ISO standard, which is not publicly available. Fortunately, there are PDF/A validators that will tell you if (and sometimes how) your generated PDF/A is invalid. So, given sufficient patience, you can keep throwing PDF files at the validator, fixing the issues reported and repeating this loop over and over until validation passes. Like this:

This will be available in the next release of CapyPDF.

Tuesday, October 21, 2025

CapyPDF 1.8.0 released

I have just released CapyPDF 1.8. It's mostly minor fixes and tweaks but there are two notable things. The first one is that CapyPDF now supports variable axis fonts. The other one is that CapyPDF will now produce PDF version 2.0 files instead of 1.7 by default. This might seem like a big leap but really isn't. PDF 2.0 is pretty much the same as 1.7, just with documentation updates and deprecating (but not removing) a bunch of things. People using PDF have a tendency to be quite conservative in their versions, but PDF 2.0 has been out since 2017 with most of it being PDF 1.7 from 2008.

It is still possible to create files that use older PDF versions. If you specify, say, PDF/X3, CapyPDF will output PDF 1.3, as the spec requires that version and no other, even though, for example, Adobe's PDF tools accept PDF/X3 files whose version is later than 1.3.
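
The selection logic can be sketched roughly like this. This is not CapyPDF's API or implementation, just an illustration of the rule. The only pairing taken from the text is PDF/X3 requiring 1.3, and the function and table names are made up:

```python
from typing import Optional

# Only the PDF/X3 -> 1.3 pairing comes from the text above. Other
# subtypes would need their own entries based on their specs.
REQUIRED_VERSION = {"PDF/X3": "1.3"}
DEFAULT_VERSION = "2.0"


def header_line(subtype: Optional[str] = None) -> str:
    """Return the first line of the PDF file for the requested subtype.

    A subtype that mandates a specific PDF version overrides the default.
    """
    version = REQUIRED_VERSION.get(subtype, DEFAULT_VERSION)
    return f"%PDF-{version}"


print(header_line())          # no subtype requested
print(header_line("PDF/X3"))  # subtype that pins the version
```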

The PDF specification is currently undergoing major changes and future versions are expected to have backwards incompatible features such as HDR imaging. But 2.0 does not have those yet.

Things CapyPDF supports

CapyPDF has implemented a fair chunk of the various PDF specs:

  • All paint and text operations
  • Color management
  • Optional content groups
  • PDF/X and PDF/A support
  • Tagged PDF (i.e. document structure and semantic information)
  • TTF, OTF, TTC and CFF fonts
  • Forms (preliminary)
  • Annotations
  • File attachments
  • Outlines
  • Page naming
In theory this should be enough to support things like XRechnung and documents with full accessibility information as per PDF/UA. These have not been actually tested as I don't have personal experience in German electronic invoicing or document accessibility.