Friday, February 27, 2026

Discovering a new class of primes for the fun of it

There are many classes of primes, such as left-truncatable primes, twin primes, Mersenne primes, palindromic primes, emirps and so on. The Wikipedia page on primes lists many more. Recently I got to thinking (as one is wont to do): how difficult would it be to come up with a brand new one? The only reliable way to find out is to try it yourself.

The basic loop

The method I used was fairly straightforward:

  1. Download a list of the first one million primes
  2. Look at it
  3. Try to come up with a pattern
  4. Check if numbers from your pattern show up on OEIS
  5. Find out they are not there
  6. Rejoice
  7. Check again more rigorously
  8. Realize they are in fact there in a slightly different form
  9. Go to 2
Eventually I managed to come up with a prime category that is not in OEIS. Python code that generates them can be found in this repo. It may have bugs (I discovered several in the course of writing this post), and the data below has not been independently validated.

Faro primes

In magic terminology, a Faro shuffle is one that cuts a deck of cards in half and then interleaves the results. It is also known as a perfect shuffle. There are two different types of Faro shuffle, an in shuffle and an out shuffle. They have the peculiar property that if you keep repeating the same operation, eventually the deck returns to the original order.

A prime p is a Faro prime if all numbers obtained by repeatedly applying Faro shuffles (either in or out shuffles, but only one type) to its decimal representation are also prime. A Faro prime can be a Faro in prime, a Faro out prime or both. As an example, 19 is a Faro in prime, because a single in shuffle returns it to its original form. It is not a Faro out prime, because out shuffling it produces 91, which is not prime (91 = 7*13).
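The shuffles themselves are simple to express on the decimal digit string. Below is a minimal Python sketch of my own (the repo's actual implementation may differ); it assumes an even number of digits, and the in/out naming follows the examples in this post rather than the card-magic convention:

```python
def faro_in(digits: str) -> str:
    # Cut the digit string in half and interleave, starting with
    # the first digit of the first half (19 -> 19, 1103 -> 1013).
    half = len(digits) // 2
    return "".join(a + b for a, b in zip(digits[:half], digits[half:]))


def faro_out(digits: str) -> str:
    # Same cut, but the interleave starts with the first digit of
    # the second half (19 -> 91).
    half = len(digits) // 2
    return "".join(b + a for a, b in zip(digits[:half], digits[half:]))


print(faro_in("1103"))   # 1013
print(faro_out("19"))    # 91
```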

My testing for this was not rigorous, but at least OEIS does not recognize the sequence.

Statistics

I only used primes with an even number of digits. For an odd number of digits you'd first need to decide how in and out shuffles should work. This is left as an exercise for the reader.

Within the first one million primes, there are 7492 Faro in primes, 775 Faro out primes and 38 that are both.

The numbers with one or two digits are not particularly interesting. The first "actual" Faro in prime is 1103: a single in shuffle yields 1013.

For the first Faro out prime you need to go to 111533, which shuffles to 513131 and then to 153113.

The first prime longer than two digits that qualifies as both a Faro in and a Faro out prime is 151673. Its in shuffle primes are 165713, 176153 and 117563. The corresponding out shuffle primes are 151673, 617531 and 563117.
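Checking a candidate amounts to shuffling until the digit string comes back to where it started and testing every number along the way for primality. Here is a rough sketch of my own (not the repo's code) for the in shuffle case; it assumes an even digit count and does not worry about shuffles that would produce a leading zero:

```python
def faro_in(digits):
    # In shuffle as used in this post: interleave the two halves,
    # starting from the first half.
    half = len(digits) // 2
    return "".join(a + b for a, b in zip(digits[:half], digits[half:]))


def is_prime(n):
    # Trial division; plenty fast for numbers this small.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_faro_in_prime(p):
    # Walk the shuffle cycle; every number on the way must be prime.
    # The shuffle only permutes digit positions, so the cycle is
    # guaranteed to return to the starting string eventually.
    start = cur = str(p)
    while True:
        if not is_prime(int(cur)):
            return False
        cur = faro_in(cur)
        if cur == start:
            return True


print(is_faro_in_prime(1103))  # True: the cycle is 1103 -> 1013 -> 1103
```

The out shuffle variant is identical except for which half the interleave starts from.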

Within the first one million primes the largest Faro in prime is 15484627, the largest Faro out prime is 11911111 and the largest prime that is both is 987793.

Further questions

As is typical in maths, finding something out immediately raises more questions. For example:

Why are there so many fewer out primes than in primes?

How would this look for primes with an odd number of digits?

Is it possible to build primes by a mixture of in and out shuffles?

Most of these primes do not complete a "full shuffle", that is, they return to their original order faster than a deck of fully distinct playing cards would. For any number n, can you find a Faro prime that requires exactly n shuffles, or is there an upper limit on the number of shuffles?
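The last question can be probed numerically by counting how many shuffles a given digit string needs before it comes back to itself. A quick sketch of my own, assuming an even number of digits:

```python
def faro_in(digits):
    # Interleave the two halves, starting from the first half.
    half = len(digits) // 2
    return "".join(a + b for a, b in zip(digits[:half], digits[half:]))


def shuffle_order(digits):
    # Count in shuffles until the digit string returns to its
    # original order. Repeated digits can make this smaller than
    # the order for a deck of fully distinct cards.
    count, cur = 1, faro_in(digits)
    while cur != digits:
        cur = faro_in(cur)
        count += 1
    return count


print(shuffle_order("1103"))    # 2
print(shuffle_order("151673"))  # 4
```

If I remember the card-shuffling theory correctly, with the convention used here a string of 2k fully distinct symbols returns after a number of shuffles equal to the multiplicative order of 2 modulo 2k-1, so repeated digits can only make the cycle shorter.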

Saturday, February 14, 2026

What's cooking with Pystd, the experimental C++ standard library?

Pystd is an experiment in what a C++ standard library without any backwards compatibility requirements could look like. Its design goals are, in order of decreasing priority:

  • Fast build times
  • Simplicity of implementation
  • Good performance
It also has some design antigoals:

  • Not compatible with the ISO C++ standard library
  • No support for weird corner cases like linked lists or types that can't be noexcept-moved
  • Do not reinvent things that are already in the C standard library (though you might provide a nicer UI to them)

Current status

There is a bunch of stuff implemented, such as a vector, several string types, a hashmap, a B-tree based ordered map, regular expressions, Unix path manipulation operations and so on. The latest addition is sorting algorithms, including merge sort, heap sort and introsort.

None of these is "production quality". They will almost certainly have bugs. Don't rely on them for "real work". 

The actual library consists of approximately 4800 lines of headers and 4700 lines of source. Building the library and all test code on a Raspberry Pi using a single core takes 13 seconds. With 30 process invocations this means approximately 0.4 seconds per compilation.

For real-world testing we have only one data point so far, but in it the build time dropped by three quarters, the binary became smaller and the end result ran faster.

Portability

The code has been tested on Linux x86_64 and aarch64 as well as on macOS. It currently does not work with Visual Studio, which has not yet implemented support for pack indexing.

Why should you consider using it?

Back in the 90s and 00s (I think) it was fashionable to write your own C++ standard library implementation. Eventually they all died out and people moved to the one that comes with their compiler, which is entirely reasonable. So why would you now switch to something else?

For existing C++ applications you probably don't want to. The amount of work needed for a port is too much to be justified in most cases.

For green field projects things are more interesting. Maybe you just want to try something new just for the fun of it? That is the main reason why Pystd even exists, I wanted to try implementing the core building blocks of a standard library from scratch.

Maybe you want to provide "Go style" binaries that build fast and have no external deps? The size overhead of Pystd is only a few hundred kilobytes, and the executables it yields depend only on libc (unless you use regexes, in which case they also depend on libpcre, but you can statically link that if you prefer).

Resource constrained or embedded systems might also be an option. Libstdc++ takes a few megabytes. Pystd does require malloc, though (more specifically, it requires aligned allocation), so for the smallest embedded targets you'd need to use something like the freestanding library. As an additional feature, Pystd permits you to disable parts of the library that are not used (currently only regexes, but this could be extended to things like threading and the file system).

Compiler implementers might choose to test their performance with an unusual code base. For example GCC compiles most Pystd files in a flash but for some reason the B-tree implementation takes several seconds to build. I don't really know why because it does not do any heavy duty metaprogramming or such.

It might also be usable in teaching as a fairly small implementation of the core algorithms used today. Assuming anyone does education any more as opposed to relying on LLMs for everything.


Saturday, February 7, 2026

C and C++ dependencies, don't dream it, be it!

Bill Hoffman, the original creator of CMake, gave a presentation at CppCon. At approximately 49 minutes in he starts talking about future plans for dependency management. He says, and I quote him directly, that "in this future I envision" one should be able to do something like the following (paraphrasing):

Your project has dependencies A, B and C. Typically you get them from "the system" or a package manager. But suppose you then need to develop one of the deps as well. It would be nice if you could somehow download, say, A, build it as part of your own project and, once you are done, switch back to the system one.

Well, Mr. Hoffman, do I have wonderful news for you! You don't need to treasure these sensual daydreams any more. This so-called "future" you "envision" is not only the present, but in fact ancient past. This method of dependency management has existed in Meson for so long I don't even remember when it was added. At least five years.

How would you use such a wild and an untamed thing?

Let's assume you have a Meson project that uses some dependency called bob. The current build gets it from the system (typically via pkg-config, but the exact method is irrelevant). In order to build bob from source as part of your project, you first need to obtain that source. Assuming it is available in WrapDB, all you need to do is run this command:

meson wrap install bob

If it is not, then you need to do some more work. You can even tell Meson to check out the project's Git repo and build against current trunk if you so prefer. See documentation for details.

Then you need to tell Meson to use the internal copy. There is a global option to switch all dependencies to local builds, but in this case we want only this one dependency to be built locally while the remaining ones still come from the system. Meson has a builtin option for exactly this:

meson configure builddir -Dforce_fallback_for=bob

Starting a build will now reconfigure the project to use the local copy. Once you are done and want to go back to using system deps, run this command:

meson configure builddir -Dforce_fallback_for=

This is all you need to do. That is the main advantage of competently designed tools. They rose tint your world and keep you safe from trouble and pain. Sometimes you can see the blue sky through the tears in your eyes.

Oh, just one more deadly sting

If you keep watching, the presenter first asks the audience if this is something they would like. Upon receiving a positive answer, he follows up with this [again quoting directly]:

So you should all complain to the DOE [presumably US Department of Energy] for not funding the SBIR [presumably some sort of grant or tender] for this.

Shaming your end users into advocating an authoritarian/fascist government to give large sums of money in a tender that only one for-profit corporation can reasonably win is certainly a plan.

Instead of working on this kind of a muscle man you can alternatively do what we in the Meson project did: JFDI. The entire functionality was implemented by maybe 3 to 5 people, some working part time but most being volunteers. The total amount of work it took is probably a fraction of the clerical work needed to deal with all the red tape that comes with a DoE tender process.

In the interest of full disclosure

While writing this blog post I discovered a corner case bug in our current implementation. At the time of writing it is only seven hours old, and not particularly beautiful to behold as it has not been fixed yet. And, unfortunately, the only thing I've come to trust is that bugfixes take longer than you would want them to.