Saturday, February 14, 2026

What's cooking with Pystd, the experimental C++ standard library?

Pystd is an experiment in what a C++ standard library without any backwards compatibility requirements could look like. Its design goals are, in order of decreasing priority:

  • Fast build times
  • Simplicity of implementation
  • Good performance

It also has some design anti-goals:

  • Not compatible with the ISO C++ standard library
  • No support for weird corner cases like linked lists or types that can't be noexcept-moved
  • Do not reinvent things that are already in the C standard library (though you might provide a nicer UI to them)

Current status

A fair amount of stuff is already implemented: vector, several string types, a hashmap, a B-tree based ordered map, regular expressions, Unix path manipulation operations and so on. The latest addition is sorting algorithms, including merge sort, heap sort and introsort.
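
To give a feel for what using it looks like, here is a minimal sketch. The header, type and function names below are my guesses for illustration only, so the real Pystd API may well spell them differently; the yearly pystd2025 namespace is discussed further down.

#include <pystd2025.hpp>  // header name assumed for illustration

int main() {
    // Hypothetical container and algorithm names.
    pystd2025::Vector<int> numbers;
    numbers.push_back(3);
    numbers.push_back(1);
    numbers.push_back(2);
    pystd2025::sort(numbers.begin(), numbers.end()); // e.g. introsort under the hood
    return 0;
}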

None of these is "production quality". They will almost certainly have bugs. Don't rely on them for "real work". 

The actual library consists of approximately 4800 lines of headers and 4700 lines of source. Building the library and all test code on a Raspberry Pi using a single core takes 13 seconds. With 30 process invocations this means approximately 0.4 seconds per compilation.

For real-world testing there is really only one data point, but in it the build time was reduced by three quarters, the binary became smaller and the end result ran faster.

Portability

The code has been tested on Linux x86_64 and aarch64 as well as on macOS. It currently does not work with Visual Studio, which has not yet implemented support for pack indexing.
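
Pack indexing is a C++26 feature. As a minimal illustration of the construct (not taken from Pystd, just showing what kind of code trips MSVC up):

// C++26 pack indexing: pick an element of a parameter pack directly by index.
template<typename... Ts>
using First = Ts...[0];

template<typename... Args>
auto first_arg(Args&&... args) {
    return args...[0];
}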

Why should you consider using it?

Back in the 90s and 00s (I think) it was fashionable to write your own C++ standard library implementation. Eventually they all died out and people moved to the one that comes with their compiler, which is entirely reasonable. So why would you now switch to something else?

For existing C++ applications you probably don't want to. The amount of work needed for a port is too much to be justified in most cases.

For green field projects things are more interesting. Maybe you want to try something new just for the fun of it? That is the main reason Pystd even exists: I wanted to try implementing the core building blocks of a standard library from scratch.

Maybe you want to provide "Go style" binaries that build fast and have no external deps? The size overhead of Pystd is only a few hundred kilobytes, and the resulting binaries depend only on libc (unless you use regexes, in which case they also depend on libpcre, which you can link statically if you prefer).

Resource-constrained or embedded systems might also be an option. Libstdc++ takes a few megabytes. Pystd does require malloc, though (more specifically it requires aligned allocation), so for the smallest embedded targets you'd need to use something like the freestanding library instead. As an additional feature Pystd lets you disable parts of the library that you do not use (currently only regexes, but this could be extended to things like threading and the file system).
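
As a sketch of what such a toggle typically boils down to: a build option becomes a preprocessor definition and the relevant code is compiled out. The macro and class names here are invented for illustration, the real Pystd option may be spelled differently.

#ifdef PYSTD_NO_REGEX
// The regex type and its libpcre dependency are simply not compiled in.
#else
class Regex {
    // ... implementation calling into libpcre ...
};
#endif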

Compiler implementers might want to test their compilers' performance on an unusual code base. For example, GCC compiles most Pystd files in a flash, but for some reason the B-tree implementation takes several seconds to build. I don't really know why, because it does not do any heavy-duty metaprogramming or the like.

It might also be usable in teaching as a fairly small implementation of the core algorithms used today. Assuming anyone does education any more as opposed to relying on LLMs for everything.


Saturday, February 7, 2026

C and C++ dependencies, don't dream it, be it!

Bill Hoffman, the original creator of the CMake language, gave a presentation at CppCon. At approximately 49 minutes in he starts talking about future plans for dependency management. He says, and I quote him directly, that "in this future I envision" one should be able to do something like the following (paraphrasing).

Your project has dependencies A, B and C. Typically you get them from "the system" or a package manager. But sometimes you need to develop one of the deps as well. So it would be nice if you could somehow download, say, A, build it as part of your own project and, once you are done, switch back to the system version.

Well, Mr. Hoffman, do I have wonderful news for you! You don't need to treasure these sensual daydreams any more. This so-called "future" you "envision" is not only the present, it is in fact ancient past. This method of dependency management has existed in Meson for so long I don't even remember when it got added. At least five years, probably more.

How would you use such a wild and an untamed thing?

Let's assume you have a Meson project that uses some dependency called bob. The current build gets it from the system (typically via pkg-config, but the exact method is irrelevant). In order to build the dependency from source yourself, first you need to obtain that source. Assuming it is available in WrapDB, all you need to do is run this command:

meson wrap install bob

If it is not, you need to do some more work. You can even tell Meson to check out the project's Git repo and build against current trunk if you so prefer. See the documentation for details.

Then you need to tell Meson to use the internal copy. There is a global option to switch all dependencies to local builds, but in this case we want only this one dependency to be built locally while the remaining ones come from the system. Meson has a built-in option for exactly this:

meson configure builddir -Dforce_fallback_for=bob

Starting a build now reconfigures the project to use the locally built version. Once you are done and want to go back to using the system deps, run this command:

meson configure builddir -Dforce_fallback_for=

This is all you need to do. That is the main advantage of competently designed tools. They rose tint your world and keep you safe from trouble and pain. Sometimes you can see the blue sky through the tears in your eyes.

Oh, just one more deadly sting

If you keep watching, the presenter first asks the audience if this is something they would like. Upon receiving a positive answer he then follows up with this [again quoting directly]:

So you should all complain to the DOE [presumably US Department of Energy] for not funding the SBIR [presumably some sort of grant or tender] for this.

Shaming your end users into lobbying an authoritarian/fascist government to put large sums of money into a tender that only one for-profit corporation can reasonably win is certainly a plan.

Instead of working on this kind of a muscle man you can alternatively do what we in the Meson project did: JFDI. The entire functionality was implemented by maybe 3 to 5 people, some working part time but most being volunteers. The total amount of work it took is probably a fraction of the clerical work needed to deal with all the red tape that comes with a DoE tender process.

In the interest of full disclosure

While writing this blog post I discovered a corner case bug in our current implementation. At the time of writing it is only seven hours old, and not particularly beautiful to behold as it has not been fixed yet. And, unfortunately, the only thing I've come to trust is that bugfixes take longer than you would want them to.

Tuesday, January 13, 2026

How to get banned from Facebook in one simple step

I, too, have (or as you can probably guess from the title of this post, had) a Facebook account. I only ever used it for two purposes.

  1. Finding out what friends I rarely see are doing
  2. Getting invites to events

Facebook has over the years made usage #1 pretty much impossible. My feed contains approximately 1% posts by my friends and 99% ads for image meme "humor" groups whose expected amusement value seems to be approximately the same as punching yourself in the groin.

Still, every now and then I get a glimpse of a post by the people I actively chose to follow. Specifically, a friend was pondering the behaviour of people who post happy birthday wishes on the profiles of deceased people. Like, if you have not kept up with someone enough to know that they are dead, why would you feel the need to post congratulations on their profile page?

I wrote a reply, which is replicated below. It is not word-for-word accurate, as it is a translation and I no longer have access to the original post.

Some of these might come via recommendations by AI assistants. Maybe in the future AI bots belonging to people who are themselves dead will carry on posting birthday congratulations on the profiles of other dead people. A sort of social media for the deceased, if you will.

Roughly one minute later my account was suspended. Let that be a lesson to you all. Do not mention the Dead Internet Theory, for doing so threatens Facebook's ad revenue and is thus taboo. (A more probable explanation is that using the word "death" is prohibited by itself regardless of context, leading to idiotic phrasing in the style of "Person X was born on [date] and d!ed [other date]" that you see all over IG, FB and YT nowadays.)

Apparently to reactivate the account I would need to prove that "[I am] a human being". That might be a tall order given that there are days when I doubt that myself.

The reactivation service is designed in the usual deceptive way where it does not tell you up front all the things you will need to do. Instead it bounces you from one task to another in the hope that the sunk cost fallacy makes you submit to ever more egregious demands. I got out when they demanded a full video selfie where I look around in different directions. You can make up your own theories as to why Meta, a known advocate for generative AI and all that garbage, would want high-resolution scans of people's faces. I mean, surely they would not use them for AI training without paying a single cent for usage rights to the original model. Right? Right?

The suspension email ends with this ultimatum.

If you think we suspended your account by mistake, you have 180 days to appeal our decision. If you miss this deadline your account will be permanently disabled.

Well, Mr. Zuckerberg, my response is the following:

Close it! Delete it! Burn it down to the ground! I'd do it myself this very moment, but I can't delete the account without reactivating it first.

Let it also be noted that this post is a much better way of proving that I am a human being than some video selfie thing that could be trivially faked with genAI.

Friday, January 9, 2026

AI and money

If you ask people why they are using AI (or why they want other people to use it) you get a ton of different answers. Typically none of them contains the real reason, which is that using AI is dirt cheap. Between paying a fair amount to get something done and paying very little to give the impression that the work has been done, the latter tends to win.

The reason AI is so cheap is that it is being paid for by investors. And the one thing we know for certain about those kinds of people is that they expect to get their money back. Multiple times over. This might get done by selling the system to a bigger fool before it collapses, but eventually someone will have to earn that money back from actual customers (or from government bailouts, i.e. taxpayers).

I'm not an economist; I took a grand total of one economics class at university, most of which I have forgotten. Still, using just that knowledge we can get a rough estimate of the money flows involved. For simplicity let's bundle all AI companies into a single entity and assume a business model based on flat monthly fees.

The total investment

A number that has been floated around is that AI companies have invested approximately one trillion (one thousand billion or 1e12) dollars. Let's use that as the base investment we want to recover.

Number of customers

Sticking with round figures, let's assume that AI usage becomes ubiquitous and that there are one billion monthly subscribers. For comparison the estimated number of current Netflix subscribers is 300 million.

Income and expenses

This one is really hard to estimate. What seems to be the case is that current monthly fees are not enough to cover even the electricity costs of providing the service. But let's again be generous and assume that some sort of efficiency breakthrough happens in the future and that the monthly fee is $20 with expenses being $10. This means a $10 profit per user per month.

We ignore one-off costs such as buying several data centers' worth of GPUs every few years to replace the old ones.

The simple computation

With these figures you get $10 billion per month or $120 billion per year. Thus paying off the investment would take a bit more than 8 years. I don't personally know any venture capitalists, but based on random guessing this might fall in the "takes too long, but just about tolerable" level of delay.

So all good then?

Not so fast!

One thing to keep in mind when doing investment payback calculations is the time value of money. Money you get in "the future" is not as valuable as money you have right now, so future income needs to be discounted to present value.

Interest rate

I have no idea what a reasonable discount rate for this would be, so let's pick a round number: 5% per year.

The "real-er" numbers

At this point the computations become complex enough that you need to break out the big guns. Yes, spreadsheets.

Here we see that it actually takes 12 years to earn back the investment. Doubling the investment to two trillion would take 36 years. That is a fair bit of time, long enough for someone else to create a different system that performs maybe 70% as well but costs a fraction as much to get running and operate. At that point they can drive the price so low that the established players can't even cover their operating expenses, let alone pay back the original investment.
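
If you want to check the numbers without firing up a spreadsheet, here is a small program that does the same discounted cash flow calculation. The end-of-year discounting convention is my assumption, so the result can differ from the spreadsheet by a year or so.

#include <cmath>
#include <cstdio>

int main() {
    const double investment = 1e12;      // $1 trillion up front
    const double yearly_profit = 120e9;  // $10/user/month * 1e9 users * 12 months
    const double discount_rate = 0.05;   // 5% per year

    double recovered = 0;
    int year = 0;
    while(recovered < investment) {
        ++year;
        // Profit from this year, discounted back to present-day value.
        recovered += yearly_profit / std::pow(1.0 + discount_rate, year);
    }
    std::printf("Investment paid back after %d years.\n", year);
    return 0;
}

With these inputs it lands within a year of the spreadsheet figures above; setting discount_rate to zero gives back the undiscounted result from the previous section.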

Exercises for the reader

  • This computation assumes the system has one billion subscribers from day one. How much longer does it take to recoup the investment if it takes 5 years to reach that many subscribers? What about 10 years?
  • How long is the payback period if you have a mere 500 million paid subscribers?
  • Your boss is concerned about the long payback period and wants to shorten it by increasing the monthly fee. Estimate how many people would stop using the service and its effect on the payback time if the fee is raised from $20 to $50. How about $100? Or $1000?
  • What happens when the ad revenue you can obtain by dumping tons of AI slop on the Internet falls below the cost of producing said slop?

Sunday, January 4, 2026

Converting Chapterizer from Cairo + Pango to CapyPDF

Chapterizer (not a great name, I know) is a tool I wrote to generate books. It originally used Cairo and Pango to generate PDF files. That works and was fairly easy to get started with, but it has its own set of downsides:

  • Cairo always produces RGB PDFs, which are not accepted by printing houses
  • Cairo does not handle advanced PDF features like trim boxes
  • Pango aligns text at the top of each line, but for high quality text output you have to do baseline alignment
  • Pango is designed to "always print something", which is to say it does transparent font substitution for example when the chosen font does not have some glyph

I have also created CapyPDF to generate "proper" PDFs. Over the holidays I finished porting Chapterizer to use CapyPDF. The pipeline is now surprisingly simple. First you read in the source text, then it is shaped with Harfbuzz and finally written to a PDF file with CapyPDF.
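
The shaping step is the interesting part. Below is a stripped-down sketch of what that stage looks like with the HarfBuzz C API; error handling is omitted and the CapyPDF output end is only hinted at in a comment, since this is an illustration rather than the actual Chapterizer code.

#include <hb.h>
#include <cstdio>

int main() {
    // Load the font file and wrap it in a HarfBuzz font object.
    hb_blob_t *blob = hb_blob_create_from_file("font.ttf"); // file name is just an example
    hb_face_t *face = hb_face_create(blob, 0);
    hb_font_t *font = hb_font_create(face);

    // Put the source text in a buffer and shape it.
    hb_buffer_t *buf = hb_buffer_create();
    hb_buffer_add_utf8(buf, "Some chapter text.", -1, 0, -1);
    hb_buffer_guess_segment_properties(buf);
    hb_shape(font, buf, nullptr, 0);

    // The result is a list of glyph ids and advances in font units. In the real
    // program these are what get written into the PDF content stream via CapyPDF.
    unsigned int count;
    hb_glyph_info_t *info = hb_buffer_get_glyph_infos(buf, &count);
    hb_glyph_position_t *pos = hb_buffer_get_glyph_positions(buf, &count);
    for(unsigned int i = 0; i < count; ++i)
        std::printf("glyph %u advance %d\n", (unsigned)info[i].codepoint, (int)pos[i].x_advance);

    hb_buffer_destroy(buf);
    hb_font_destroy(font);
    hb_face_destroy(face);
    hb_blob_destroy(blob);
    return 0;
}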

It was grunt work. Nothing about it was particularly difficult, just dealing with the same old issues, like the fact that in PDF the page's origin is at the bottom left, whereas in Cairo it is at the top left.

Anyhow, now that it is done we can actually test the performance of CapyPDF with a somewhat realistic setup. Currently creating a 40-page document takes 0.4 seconds, which comes to 0.01 seconds per page. That is fast enough for me.

Friday, January 2, 2026

New year, new Pystd epoch, or evolving an API without breaking it

One of the core design points of Pystd has been that it maintains perfect API and ABI stability while also making it possible to improve the code in arbitrary ways. To see how that can be achieved, let's look at what creating a new "year epoch" looks like. It's quite simple. First you run this script.

Then you add the new files to Meson build targets (I was too lazy to implement that in the script). Done. For extra points there is also a new test that mixes types of pystd2025 and pystd2026 just to verify that things work.

As everything is inside a yearly namespace (and macros have the corresponding prefix) the symbols do not clash with each other.
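
Conceptually the mixing looks like this. The header and container names here are placeholders for illustration; the real ones may differ.

#include <pystd2025.hpp>  // header names assumed for illustration
#include <pystd2026.hpp>

int main() {
    // Both epochs can be linked into the same binary because every symbol
    // lives in its own yearly namespace.
    pystd2025::Vector<int> old_style;
    pystd2026::Vector<int> new_style;
    old_style.push_back(1);
    new_style.push_back(2);
    // The two Vector types are unrelated; converting between them has to be
    // done explicitly, which is the price of being free to change the API.
    return 0;
}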

At this point in time pystd2025 is frozen so old apps (of which there are, to be honest, approximately zero) keep working forever. It won't get any new features, only bug fixes. Pystd2026, on the other hand, is free to make any changes it pleases as it has zero backwards compatibility guarantees.

Isn't code duplication terribly slow and inefficient?

It can be. Rather than handwaving about it, let's measure. I used my desktop computer, which has an AMD Ryzen 7 3700X.

Compiling Pystd from scratch and running the test suite (with code for both 2025 and 2026) in both debug and optimized modes takes 3 seconds in total (1s for debug, 2s for optimized). This amounts to 2*13 compiler invocations, 2 static linker invocations and 2*5 dynamic linker invocations.

Compiling a helloworld with standard C++ using -O2 -g also takes 3 seconds. This amounts to a single compiler invocation.

Monday, December 22, 2025

An uncomfortable but necessary discussion about the Debian bug tracker

Note: this post represents my personal opinions as a Debian maintainer of a single package (Meson). It is not my intention to throw anyone involved in the service under a bus, but some things about it are not good and need to be spoken aloud (in my opinion anyway, other people may disagree and that is fine).

There was a post called Configuring a mail transfert [sic] agent to interact with the Debian bug tracker on Planet Debian. It contained the following statement:

using an email client to create or modify bug reports is not a bad idea per se

Indeed it is not. However, using an email client as the only way of modifying bugs (which is how the Debian bug tracker works) is not just a bad idea, it is a terrible idea. To me managing bugs this way is so awful that it actively pushes me away from contributing to Debian. The bug statuses on Meson are not kept up to date because I prefer that to having to deal with the bug tracker. I suspect I am not alone in this. In any case it is a major hurdle for new developers and might even cause some people to drop out entirely [1].

Why is it like this?

The Debian bug tracker was originally implemented in 1993 or thereabouts. Pretty much everything IT related was different back then, and manipulating things via email actually made sense at the time. Sadly, the world changed completely, but the volunteers working on the bug tracker did not have the resources to update it [2]. The end result is a classical legacy system: one that works and does the thing it needs to do, but which no new developer wants to touch.

Notable updates to the system would require major resources, which the project does not have.

FWICT there have been attempts to migrate the tracker to e.g. Bugzilla or Gitlab, but none of those has come even close to succeeding.

Why is the UX bad?

There is no web UI for manipulating bug data. Instead you write an email in a custom format, send that to a specific email address and wait for things to happen.
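
For the curious, a control message is an ordinary mail sent to control@bugs.debian.org whose body is a list of commands, roughly along these lines (the bug number is made up):

severity 123456 wishlist
retitle 123456 A more accurate description of the problem
tags 123456 + patch
thanks

The trailing thanks tells the server to stop processing commands.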

The main problem here is not the format as such; it is the fact that the user has to do the work of the computer:

  • Every time you need to manipulate bugs, you need to open the documentation page to remind yourself what the actual syntax is. A program would get it right automatically every time.
  • Get it wrong? Sucks to be you. A program would get it right automatically every time.
  • Send it to the wrong email address? Sucks to be you. And the person whose bug you just altered. A computer program would get it right automatically every time.
  • And so on.

I suspect most Debian developers who spend a lot of time on this have written their own custom scripts for their use cases. But having hundreds of ersatz tools for common tasks seems suboptimal.

As an extra cherry on the cake, the bug tracker will send you an email every! single! time! you edit or comment on any bug. Not only does the service waste your time by forcing you to write syntactically correct batch processing commands by hand, it also wastes it by forcing you to keep deleting spam [3].

How does security on the system work?

It doesn't. The email interface is 100% open. Anyone can edit any bug in any way just by sending a suitably crafted email to the control address [4]. If a 4chan script kiddie wanted to screw up the entire Debian bug repository, they could do so fairly easily.

There is an actual term for this approach: security through obscurity. The fact that the main bug tracker of the OS that runs the world does not have strict authentication in place does not fill me with warm fuzzies.

What would be a way forward?

A one-shot conversion to a different bug tracker is out of the question. Instead the situation could be improved incrementally, for example:

  1. Create a new web service that parses the existing bug data and displays it in a "rich" format.
  2. Update the UI so that registered users can change the state of the bug (close, duplicate, etc).
  3. Make the UI send a suitably formatted control email to the backend.
  4. Bless the new web service as an official way of editing bugs (hosted under debian.org and all that).
  5. Edit the backend service so it only accepts emails from the web UI and registered Debian developers and maintainers.
  6. Change the backend to something else or improve it for the new UI [optional].

Steps 1 to 3 can be done by a single person or a small team. Since you (probably) don't have access to the backend service, you need to parse the bug state from the current tracker's status page (example here). That is a bit gnarly, but should be doable. If someone is looking for a personal project for the holiday season, this is something to consider.

Steps 4 to 6 would take months or years of full time work. Given that this approach was first suggested almost exactly 25 years ago, it is unlikely that resources for it would materialize out of thin air any time soon.

[1] My speculation.

[2] Also my speculation.

[3] You could use email filtering rules but that is again extra work for every user. A better option is not spamming (one thank-you email for new contributors is fine, more than that is not).

[4] Last time I checked at least. I have personally manipulated all sorts of bugs via email even before I was a Debian maintainer. No registration. No checks. Not even GPG signing.