TCPDF
Feature proposal: optimize PDFs by replacing %F coordinates
Inspecting the uncompressed PDF code reveals a great many coordinates specified with unnecessary precision (e.g. 17.5 written as 17.500000, or 0 written as 0.000000). This is perfectly normal for PDFs. Nonetheless, in very detailed PDFs with typeset tables and no images, writing these numbers more compactly, and even rounding floats to three decimals (e.g. 17.4996234 becoming 17.5), yields file size savings of up to 25% of the compressed file in my own tests, with no noticeable difference in the output.
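A rough way to see the effect without touching TCPDF is to compress two synthetic streams that differ only in how the numbers are written. This is a sketch under assumptions: `compact_number()` is a hypothetical helper, the coordinates are made up, and the ratio it prints will not match what a real document gives.

```php
<?php
// Hypothetical helper: fixed-point with 3 decimals, trailing zeros removed.
function compact_number(float $v): string
{
    $s = rtrim(rtrim(number_format($v, 3, '.', ''), '0'), '.');
    return ($s === '-0') ? '0' : $s;
}

$verbose = '';
$compact = '';
for ($i = 0; $i < 2000; $i++) {
    // Deterministic made-up coordinates with two decimals, roughly page-sized in points.
    $x = (($i * 7919) % 59500) / 100.0;
    $y = (($i * 104729) % 84200) / 100.0;
    $verbose .= sprintf("%F %F m\n", $x, $y);   // e.g. "79.190000 1047.290000 m"
    $compact .= sprintf("%s %s m\n", compact_number($x), compact_number($y));
}

// gzcompress() (zlib extension) produces the zlib/deflate format used by Flate streams.
$verboseZ = gzcompress($verbose);
$compactZ = gzcompress($compact);

printf("raw:        %d vs %d bytes\n", strlen($verbose), strlen($compact));
printf("compressed: %d vs %d bytes\n", strlen($verboseZ), strlen($compactZ));
```

Repeated zeros compress well, but each run still costs a back-reference in the deflate stream, which is why the saving survives compression.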
At the unavoidable cost of slower PDF generation, the reduction can be achieved by replacing the sprintf() calls that use a %F format specifier with calls to $this->ksprintf(), a function that recognizes and "shaves" floats. It relies on the fact that %F is the only float format specifier used in the TCPDF code.
It is in no way a general replacement or wrapper for the actual sprintf() function.
If anyone wants to experiment:
/**
 * In most cases, floating point arguments in PDF have higher precision than necessary, and are often
 * padded with trailing zeros - e.g. "Tw 17.500000" instead of "Tw 17.5". This function shortens them,
 * keeping things like "Text (Price: 17.500000)" untouched.
 * This usually decreases the size of compressed PDFs by 5-25% depending on the contents.
 *
 * Possibly, the precision could even be user-selectable ($pdf->setFloatPrecision(int $digits = 3))
 *
 * @param mixed ...$args Format string followed by its arguments, as for sprintf()
 *
 * @return string
 */
protected function ksprintf(...$args): string {
    $format = array_shift($args);
    if (preg_match_all('#%(.)#', $format, $gregs, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER)) {
        $i = 0;
        foreach ($gregs[1] as [$par, $off]) {
            if ('%' === $par) {
                // "%%" is a literal percent sign and consumes no argument
                continue;
            }
            if ('F' === $par) {
                $format[$off] = 's';
                // number_format() avoids locale-dependent separators and exponent notation
                $num = rtrim(rtrim(number_format((float) $args[$i], 3, '.', ''), '0'), '.');
                $args[$i] = ('-0' === $num) ? '0' : $num;
            }
            $i++;
        }
    }
    return sprintf($format, ...$args);
}
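The user-selectable precision mentioned in the docblock could look roughly like the sketch below. Everything here is an assumption for experimenting outside TCPDF: `FloatShaver` is a hypothetical standalone class, `setFloatPrecision()` does not exist in TCPDF, and clamping to 0..6 digits is just one possible choice. It only handles plain `%F`; positional arguments such as `%1$F` are not supported.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper, not part of TCPDF. */
final class FloatShaver
{
    private int $precision = 3;

    /** Assumed counterpart of the proposed $pdf->setFloatPrecision(). */
    public function setFloatPrecision(int $digits = 3): void
    {
        // Clamp to 0..6; a bare %F prints 6 decimals, so more would gain nothing.
        $this->precision = max(0, min(6, $digits));
    }

    /** Fixed-point, locale-independent rendering without trailing zeros. */
    public function shave(float $value): string
    {
        $s = number_format($value, $this->precision, '.', '');
        if (strpos($s, '.') !== false) {
            $s = rtrim(rtrim($s, '0'), '.');
        }
        return ($s === '-0') ? '0' : $s;
    }

    /** Like sprintf(), but every plain %F argument is shaved first. */
    public function ksprintf(string $format, ...$args): string
    {
        $i = 0;
        $format = preg_replace_callback(
            '#%(.)#',
            function (array $m) use (&$i, &$args): string {
                if ($m[1] === '%') {
                    return $m[0]; // literal percent sign, consumes no argument
                }
                if ($m[1] === 'F') {
                    $args[$i] = $this->shave((float) $args[$i]);
                    $i++;
                    return '%s';
                }
                $i++; // any other specifier is left alone but still uses an argument
                return $m[0];
            },
            $format
        );
        return sprintf($format, ...$args);
    }
}

$shaver = new FloatShaver();
echo $shaver->ksprintf('%F %F Td', 17.5, 0.0), "\n";   // 17.5 0 Td
echo $shaver->ksprintf('%F Tw', 17.4996234), "\n";     // 17.5 Tw
$shaver->setFloatPrecision(1);
echo $shaver->ksprintf('%F %F m', 2.26, 100.0), "\n";  // 2.3 100 m
```

A lower precision shrinks the output further but moves coordinates by up to half a unit of the last kept digit, so the default of 3 is a conservative starting point.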