Tuesday, June 14, 2022

Attempting to create an aesthetic global line breaking algorithm

The Knuth-Plass line breaking algorithm is one of the cornerstones of TeX and why its output looks so pleasing to read (even to people who do not like the look of Computer Modern). While most text editors do line breaking with a quick & dirty algorithm that looks at each line in isolation, TeX does something fancier called minimum raggedness. The basic algorithm defines a global metric over the entire chapter and then chooses line breaks that minimize it. The basic function is the following:

For each line measure the difference between the desired width and the actual width and square the value. Then add these values together.

As you can easily tell, line breaks made at the beginning of the chapter affect the potential line breaks you can do later. Sometimes it is worth it to make a locally non-optimal choice at the beginning to get a better line break possibility much later. Evaluating a global metric like this can be potentially slow, which is why interactive programs like LibreOffice do not use this method.

The classical way of solving this problem is to use dynamic programming. It has the requirement that the problem must conform to a requirement called the Bellman optimality condition (or, if you are into rocketry, the Pontryagin maximum principle). This is perhaps best illustrated with an example: suppose you are in Paris and want to drive to Venice. This requires picking some path to drive that is "optimal" for your requirements. Now suppose we know that Zürich is along the path of this optimal route. The requirement basically says, then, that the optimal route you take from Paris to Zürich does not in any way affect the optimal route from Zürich to Venice. That is, the two paths can be routed independently of each other. This is true for the basic form of Knuth-Plass line breaking.

It is not true for line breaking in practice.

As an example there is an aesthetic requirement that there should not be three or more consecutive lines that end with a hyphen. Suppose you have split the problem in two and that in the top part the last two lines end with a dash and that the first line of the bottom part also ends with a dash. Each of the two parts is optimal in isolation but when combined they'd get the additional penalty of three consecutive hyphens and thus said solution might not be globally optimal.

So then what?

Computers today are a fair bit faster than in the late 70s/early 80s when TeX was developed. The problem size is also fairly small, the average text chapter only contains a few dozen lines (unless you are James Joyce). This leads to the obvious question of "couldn't you just work harder rather than smarter and try all the options?" Sadly the deities of combinatorics say you can't. There are just too many possibilities.

If you are a bit smarter about it, though, you can get most of the way there. For any given point in the raw text there are reasonably only a few places where you could place the optimal line break since every line must be "fairly smooth". The main split point is the one "closest" to the chapter width and then you can try one or two potential split points around it. These choices can be examined recursively fairly easily. So this is what I implemented as a test.

It even worked fairly well for a small sample text and created a good looking set of line breaks in a fraction of a second. Then I tried it with a different sample text that was about twice as long. The program then froze taking 100% CPU and producing no results. Foiled by algorithmic complexity once again!

After a bunch of optimizations what eventually ended up working was to store, for each split point, the N paths with the smallest penalties up to that point. Every time we enter that point the penalty of the current path is evaluated and compared to the list. If the penalty is larger than the worst option then search is abandoned. The resulting algorithm is surprisingly fast and could possibly even be used in real time.

The GUI app

Ideally you'd want to have tests for this functionality. This is tricky, since there is no golden correct answer, only what "looks good". Thus I wrote an application that can be used to examine the behaviour of the program with different texts, fonts and other parameters.

On the left you have the raw editable text, the middle shows how it would get rendered and on the right are the various statistics and parameters to twiddle. If we run the optimization on this piece of text the result looks like this:

For comparison here's what it looks like in LibreOffice:

And in Scribus:

No sample picture of TeX provided because I have neither the patience nor the skills to find out how to make it use Gentium.

While the parameters are not exactly the same in all three cases, we can clearly see that the new implementation produces more uniform results than the existing ones. One thing to note is that in some cases the new method creates lines that are a bit wider than the target box, where the other two never do. This causes the lines to be squished when justified and it looks really bad if done even a bit too much. The optimization function would probably need to be changed to penalize wide lines more than narrow ones.

The code

Get it here. It uses Gtk 4 and a bunch of related tech so getting it to work on anything else than Linux is probably a challenge.

There are a bunch of optimizations one could do, for example optical margin alignment or stretching individual letters on lines that fall far from the target width.

Thanks to our sponsor

This blog post was brought to you in part by two weeks of sick leave due to a dislocated shoulder. Special thanks to the paramedics on call and the fentanyl they administered to me.

Tuesday, June 7, 2022

Creating your own math-themed jigsaw puzzle from scratch

Don't you just hate it when you get nerd sniped?

I don't either. It is usually quite fun. Case in point, some time ago I came upon this YouTube video:

It is about how a "500 piece puzzle" usually does not have 500 pieces, but instead slightly more to make manufacturing easier (see the video for the actual details, they are actually quite interesting). As I was watching the video I came up with an idea for my own math-themed jigsaw puzzle.

You can probably guess where this is going.

The idea would not leave me alone so I had to yield to temptation and get the damn thing implemented. This is where problems started. The puzzle required special handling and tighter tolerances than the average jigsaw puzzle made from a custom photo. As a taste of things to come, the final puzzle will only have two kind of pieces, namely these:

For those who already deciphered what the final result will look like: good job.

As you can probably tell, the printed pattern must be aligned very tightly to the cut lines. If it shifts by even a couple of millimeters, which is common in printing, then the whole thing breaks apart. Another requirement is that I must know the exact piece count beforehand so that I can generate an output image that matches the puzzle cut.

I approached several custom jigsaw puzzle manufacturers and they all told me that what I wanted was impossible and that their manufacturing processes are not capable of such precision. One went so far as to tell me that their print tolerances are corporate trade secrets and so is the cut used. Yes, the cut. Meaning the shapes of the resulting pieces. The one feature that is the same on every custom jigsaw puzzle and thus is known by anyone who has ever bought one of them. That is a trade secret. No, it makes no sense to me either.

Regardless it seemed like the puzzle could not be created. But, as the old saying goes, all problems are solvable with a sufficient application of public libraries and lasers.

This is a 50 Watt laser cutter and engraver that is freely usable in my local library. This nicely steps around the registration issues because printing and cutting are done at the same time and the machine is actually incredibly accurate (sub-millimeter). The downside is that you can't use color in the image. Color is created by burning so you can only create grayscale images and the shade is not particularly precise, though the shapes are very accurate.

After figuring this out the procedure got simple. All that was needed was some Python, Cairo and 3mm plywood. Here is the machine doing the engraving.

After the image had been burned, it was time to turn the laser to FULL POWER and cut the pieces. First sideways

then lengthwise.

And here is the final result all assembled up.

This is a 256 piece puzzle showing a Hilbert Curve. It is a space filling curve, that is, it travels through each "pixel" in the image exactly once in a continuous fashion and never intersects itself. As you can (hopefully) tell, there is also a gradient so that the further along the curve you get the lighter the printing gets. So in theory you could assemble this jigsaw puzzle by first ordering the pieces from darkest to lightest and then just joining the pieces one after the other.

The piece cut in this puzzle is custom. The "knob" shape is parameterized by a bunch of variables and each cut between two pieces has been generated by picking random values for said parameters. So in theory you could generate an arbitrarily large jigsaw puzzle with this method (it does need to be a square with the side length being a power of two, though).