Replace pngout with another image optimizer by default

Open pts opened this issue 6 years ago • 3 comments

The benefit is that after doing so, pdfsizeopt would depend only on open source and free software. pngout is the last non-free component.

Requirements for the replacement of pngout:

  • Open source and free software.
  • Supports any PNG as input and output.
  • Command-line flag to force grayscale output. This is needed for images referred to by the /SMask of other images.
  • Compressed image output not much larger than pngout. (Preferably the optimizer should do colorspace optimization, but it's not a requirement, because sam2p already does this.)
  • Compression speed not much slower than pngout.
  • Easy static compilation to Linux, macOS and Windows 32-bit i386 executable.
  • Statically compiled executable isn't much larger than pngout.
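
When evaluating a candidate, the grayscale requirement can be verified on its output by reading the PNG header. The following is a minimal sketch, not part of pdfsizeopt: the function names are hypothetical, and it assumes that "grayscale output" means PNG color type 0 (grayscale without alpha) in the IHDR chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_color_type(data: bytes) -> int:
    """Return the color type field from the IHDR chunk of PNG data."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR, with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    # IHDR payload starts with: width, height, bit depth, color type.
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type


def is_grayscale_png(data: bytes) -> bool:
    """True if the PNG uses color type 0 (grayscale, no alpha).

    Assumption: this is the form needed for images referred to by /SMask.
    """
    return png_color_type(data) == 0
```

For example, `is_grayscale_png(open("out.png", "rb").read())` could be run on each file an optimizer produces (`out.png` is a placeholder name).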

Image optimizers which can work as a replacement for pngout:

  • optipng. TODO: Measure speed and compression ratio.
  • none (--use-image-optimizer=none). This still runs sam2p.
  • jbig2 (--use-image-optimizer=jbig2). This still runs sam2p for all images, and it also runs jbig2 on 2-color images.

Image optimizers ruled out as a replacement for pngout:

  • pngout. Not open source.
  • sam2p. It's already used, and it (or something like it) will continue to be used. It's fast, does colorspace optimization, and its output is passed to other image optimizers. (Thus it's not necessary that other image optimizers do colorspace optimization.)
  • imgdataopt. This will be the successor of sam2p. It is a rewrite of sam2p in C with the same features (as far as pdfsizeopt is concerned) but fewer dependencies.
  • jbig2. Doesn't support PNG as output.
  • cjpeg. Doesn't support PNG as output.
  • zopflipng. No command-line flag to force grayscale output.
  • ECT. No command-line flag to force grayscale output.
  • advpng. No command-line flag to force grayscale output.
  • pngwolf. No command-line flag to force grayscale output. (FYI Doesn't do colorspace optimization.)
  • pngwolf-zopfli. No command-line flag to force grayscale output. It's also much slower than pngout. (FYI Doesn't do colorspace optimization.) See also https://github.com/pts/pdfsizeopt/issues/87.

PDF input file with lots of images used for benchmarking: pngopttest.pdf 17705411 bytes, same as test.pdf in https://github.com/pts/pdfsizeopt/issues/87, contains 24 images of about the same byte size, dimensions: 768x512.

Benchmark results by pts:

  • Input PDF file: 17705411 bytes.
  • time pdfsizeopt --use-image-optimizer=pngout pngopttest.pdf pngopttest.pngout.pdf
    info: saved 2636000 bytes (15%) on optimizable images
    90.17s user 0.45s system 99% cpu 1:31.05 total
    Output PDF file is 15059514 bytes. https://github.com/pts/pdfsizeopt/issues/87 contains an output PDF file of 14997954 bytes. Maybe it was running a more recent version of pngout? This was running pngout version 2015-05-19.
  • time pdfsizeopt --use-image-optimizer=pngout pngopttest.pdf pngopttest.pngout0.pdf
    info: saved 2632072 bytes (15%) on optimizable images
    91.37s user 0.49s system 89% cpu 1:42.59 total
    Output PDF file is 15063446 bytes. This was running pngout version 2011-01-09.
  • time pdfsizeopt --use-image-optimizer=none pngopttest.pdf pngopttest.none.pdf
    info: saved 1935939 bytes (11%) on optimizable images
    12.56s user 0.38s system 97% cpu 13.284 total
    Output PDF file is 15759577 bytes.
  • time pdfsizeopt --use-image-optimizer=optipng4 pngopttest.pdf pngopttest.optipng4.pdf
    saved 2427301 bytes (14%) on optimizable images
    95.05s user 0.42s system 99% cpu 1:35.92 total
    Output PDF file is 15268217 bytes.
  • time pdfsizeopt --use-image-optimizer=optipng7 pngopttest.pdf pngopttest.optipng7.pdf
    info: saved 2438027 bytes (14%) on optimizable images
    538.18s user 0.91s system 99% cpu 9:03.60 total
    Output PDF file is 15257493 bytes.
  • time pdfsizeopt --use-image-optimizer=pngwolf pngopttest.pdf pngopttest.pngwolf-zopfli.pdf
    saved 3091612 bytes (...%) on optimizable images
    440s user 99% cpu 6:18.87 total
    Output PDF file is 14538414 bytes.
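
To compare the runs side by side, the output sizes reported above can be tabulated against the pngout baseline. This is only a sketch: the byte counts are copied from the results above, the labels and the `summarize` helper are made up for illustration, and the differences are computed on whole-file sizes, so they don't match the "saved ... on optimizable images" figures printed by pdfsizeopt.

```python
INPUT_SIZE = 17705411  # pngopttest.pdf

# Output PDF sizes in bytes, copied from the benchmark results above.
OUTPUT_SIZES = {
    "pngout (2015-05-19)": 15059514,
    "pngout (2011-01-09)": 15063446,
    "none": 15759577,
    "optipng4": 15268217,
    "optipng7": 15257493,
    "pngwolf": 14538414,
}
BASELINE = "pngout (2015-05-19)"


def summarize(input_size, output_sizes, baseline):
    """Return (name, size, bytes saved vs. input, bytes vs. baseline), smallest first."""
    base = output_sizes[baseline]
    rows = []
    for name, size in sorted(output_sizes.items(), key=lambda kv: kv[1]):
        rows.append((name, size, input_size - size, size - base))
    return rows


if __name__ == "__main__":
    for name, size, saved, delta in summarize(INPUT_SIZE, OUTPUT_SIZES, BASELINE):
        percent = 100.0 * saved / INPUT_SIZE
        print(f"{name:<22} {size:>10} {percent:6.2f}% {delta:+d}")
```

A positive last column means the output is larger than with the pngout baseline; a negative one means it is smaller.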

Maybe the input PDF is not representative; we need to run the benchmarks on more data.
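
Running the same comparison on more input files could be scripted along these lines. This is a rough sketch under several assumptions: pdfsizeopt is on the PATH, the optimizer names are the ones used in the commands above (with the corresponding tools installed), and the helper names and output file naming are invented here. It measures wall-clock time, not the user CPU time that `time` reports.

```python
import subprocess
import time
from pathlib import Path

# Optimizer names as they appear in the benchmark commands above.
OPTIMIZERS = ["pngout", "none", "optipng4", "optipng7", "pngwolf"]


def build_command(optimizer, input_pdf, output_pdf):
    """Build the pdfsizeopt command line for one benchmark run."""
    return [
        "pdfsizeopt",
        "--use-image-optimizer=" + optimizer,
        str(input_pdf),
        str(output_pdf),
    ]


def output_path(input_pdf, optimizer):
    """Derive an output name such as pngopttest.optipng4.pdf next to the input."""
    path = Path(input_pdf)
    return path.with_name(f"{path.stem}.{optimizer}.pdf")


def run_benchmarks(input_pdfs, optimizers=OPTIMIZERS):
    """Run every optimizer on every input; return (input, optimizer, seconds, bytes)."""
    results = []
    for input_pdf in input_pdfs:
        for optimizer in optimizers:
            out = output_path(input_pdf, optimizer)
            start = time.perf_counter()
            subprocess.run(build_command(optimizer, input_pdf, out), check=True)
            elapsed = time.perf_counter() - start
            results.append((str(input_pdf), optimizer, elapsed, out.stat().st_size))
    return results
```

For example, `run_benchmarks(sorted(Path("corpus").glob("*.pdf")))` would cover every PDF in a hypothetical `corpus` directory.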

It looks like optipng4 is the winner so far, but it still produces larger output than pngout (15268217 vs. 15059514 bytes), and it's a bit slower as well (95.05s vs. 90.17s user time).

pts, Apr 18 '18 14:04

What about pngquant, a lossy but high-quality PNG optimizer?

carlkl, Aug 13 '18 12:08

As a matter of principle, pdfsizeopt does not do any lossy optimization. Until this principle is relaxed, pngquant is not a good replacement for pngout. Even if the principle gets relaxed, pngquant is still not a good replacement, because we need a lossless PNG optimizer with speed and compression ratio similar to those of pngout.

pts, Aug 13 '18 13:08

Then maybe oxipng, with pngquant as an additional option?

carlkl, Aug 13 '18 14:08