PDF zopfli recompressor

The PDF format is one that I have developed a deep dislike for. The container is poorly designed, and has been subject to an excessive number of cumbersome revisions. It has serious accessibility issues, and is barely usable on mobile devices because its pages cannot reflow. But it is still in widespread use, so I have created a program to make PDF files smaller. It achieves this by identifying objects within the PDF that are compressed using DEFLATE, then attempting to recompress them using the Zopfli compressor. It tends to work best on image-heavy PDFs - one of my more successful test files was an art textbook, which went from 18MB to 8MB. Most files achieve a much more modest saving.
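To give a feel for the per-object step, here is a minimal Python sketch, not the utility's actual code. It assumes the stream has already been located and extracted from the PDF, and it uses zlib at its highest level as a stand-in for a Zopfli encoder, since both produce output in the same format. The function name recompress_stream is made up for illustration.

```python
import zlib


def recompress_stream(deflated: bytes) -> bytes:
    """Return a smaller zlib stream holding the same data, or the original.

    Assumption: `deflated` is the body of one FlateDecode stream,
    already cut out of the PDF by some other code.
    """
    # FlateDecode data is a zlib-wrapped DEFLATE stream.
    raw = zlib.decompress(deflated)

    # Stand-in for Zopfli: zlib at maximum effort. A real Zopfli encoder
    # would slot in here and emit the same format, only smaller.
    candidate = zlib.compress(raw, 9)

    # Never trust the new stream blindly: it must round-trip exactly.
    if zlib.decompress(candidate) != raw:
        return deflated

    # Keep whichever version is shorter.
    return candidate if len(candidate) < len(deflated) else deflated


if __name__ == "__main__":
    sample = zlib.compress(b"some repetitive page content " * 2000, 1)
    result = recompress_stream(sample)
    print(len(sample), "->", len(result))
```

Swapping the stream body is the easy part. The stream's /Length entry and the file's cross-reference offsets have to be updated to match, and that bookkeeping is where most of the pain lives.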

There is one catch: because I needed to work with PDF files at a very low level, I couldn't use a library for this. I had to write the PDF code myself, and the PDF format is one of the worst that I have ever encountered. It was created in the 90s, and has since been hacked and rewritten six times to add additional functionality - some of those times by wholesale rewrite of critical structures. It's ugly - really, really ugly. Because of the difficulty in parsing such a complex and unpleasant format, the PDF Zopfli utility is not reliable. Sometimes it works. Sometimes it crashes. Sometimes it works, but the output files are corrupted.

The solution is to run the associated wrapper script, which uses qpdf to prepare the input before Zopfli compression and to validate the output file afterwards. Even then, it's still important to open the output file yourself to make sure it works. If the first page displays correctly, the rest of the file should be good too.
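The prepare, recompress, validate sequence could look roughly like the Python sketch below. This is not the real wrapper script. The recompressor's command name (pdfzopfli) and its argument order are placeholders, and the choice of --object-streams=disable as the preparation step is a guess at what "prepare" means here. The only parts taken from qpdf itself are the --object-streams option and the --check option.

```python
import subprocess
from pathlib import Path

# Hypothetical command name for the recompressor; substitute the real binary.
RECOMPRESSOR = "pdfzopfli"


def build_commands(src: Path, dst: Path, tmp: Path) -> list:
    """Return the three commands to run, in order."""
    return [
        # 1. Prepare: have qpdf rewrite the input in a plainer layout.
        #    (Assumed preparation step, not confirmed by the original script.)
        ["qpdf", "--object-streams=disable", str(src), str(tmp)],
        # 2. Recompress the prepared file (hypothetical argument order).
        [RECOMPRESSOR, str(tmp), str(dst)],
        # 3. Validate: qpdf --check exits non-zero on errors or warnings.
        ["qpdf", "--check", str(dst)],
    ]


def run_pipeline(src: Path, dst: Path) -> None:
    """Run every step, stopping at the first one that fails."""
    tmp = dst.with_suffix(".prepared.pdf")
    try:
        for cmd in build_commands(src, dst, tmp):
            # check=True raises CalledProcessError on a non-zero exit code,
            # so qpdf warnings are treated as failures too.
            subprocess.run(cmd, check=True)
    finally:
        tmp.unlink(missing_ok=True)  # needs Python 3.8 or later
```

A passing qpdf --check only says the file structure is sound. It does not replace opening the file and looking at it.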

Download and use at your own risk. If you want to make this work properly, find a programmer with the time to rewrite major parts of the PDF reading code. This programmer will hate you afterwards.