Wednesday, February 1, 2023

PDF with font subsetting and a look into the future

After several days of head scratching, debugging and despair, I finally got font subsetting working in PDF. The text renders correctly in Okular, goes through Ghostscript without errors and even passes an online PDF validator I found. But not Acrobat Reader, which chokes on it completely and refuses to show anything. Sigh.

The most likely cause is that the subset font that gets generated during this operation is not 100% valid. The approach I use is almost identical to what LO does, but for some reason their PDFs work. Opening both files in FontForge seems to indicate that the .notdef glyph definition is somehow not correct, but offers no help as to why.

In any case, it seems like there is a need for a library for PDF generation. Existing libs either do not handle non-RGB color spaces or are implemented in Java, Ruby or other languages that are hard to call from any other language. Many programs, like LO and Scribus, have their own libraries for generating PDF files. It would be nice if there were a single shared library for that.

Is this a reasonable idea that people would actually be interested in? I don't know, but let's try to find out. I'm going to spend the next weekend at FOSDEM. So if you are going too and are interested in PDF generation, write a comment below or send me an email, I guess? Maybe we can have a shadow meeting in the cafeteria.

1 comment:

  1. I'm not at FOSDEM. I think it would make sense to factor out LibreOffice PDF export code, so it is usable in other projects too. It's contained mainly in pdfwriter_impl.hxx and pdfwriter_impl(2).cxx.
    It should be possible to separate what is PDF writing from what is LO adaptation/type conversion code, so this would be a good first step. Then push that code out of LO and adapt LO to use it. Definitely not easy to do, but in the end the lib would have a very important user and contributor out of the box.