phpunit pdf tofu characters detection

Open 8ctopus opened this issue 1 year ago • 0 comments

I'm trying to design a PHPUnit test that detects tofu characters in a generated PDF. (Tofu characters appear when none of the fonts included in the PDF supports the language used in the text.)

First, I tried to extract the PDF text, but the getText method always returns the correct Unicode text, even when tofu characters are visible in the rendered PDF.

Second, I've considered listing the fonts in the PDF and simply checking that all the required fonts are present.

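// PdfParser is assumed to be an alias for the library's parser class.
$parser = new PdfParser();
$document = $parser->parseFile($pdf);

// Collect the details of every font object found in the document.
$fonts = $document->getFonts();
$details = [];

foreach ($fonts as $font) {
    $details[] = $font->getDetails();
}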
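
To make the font-listing idea concrete, here is a minimal sketch of how the check could become an assertion. It assumes each getDetails() result is an array with a 'BaseFont' entry; the missingFonts helper and the font names are hypothetical and only serve as illustration. Subset fonts in a PDF carry a six-letter tag followed by "+" in front of the font name, so the sketch strips that prefix before comparing.

```php
<?php

declare(strict_types=1);

/**
 * Returns the required font names that are not present in the PDF.
 * Hypothetical helper, not part of the library.
 *
 * @param array<int, array<string, mixed>> $fontDetails one details array per font (assumed shape)
 * @param string[] $requiredFonts fonts the document is expected to contain
 * @return string[]
 */
function missingFonts(array $fontDetails, array $requiredFonts): array
{
    $present = [];

    foreach ($fontDetails as $details) {
        $baseFont = $details['BaseFont'] ?? null;

        // Skip fonts that do not report a base font name.
        if (!is_string($baseFont)) {
            continue;
        }

        // Strip the subset tag, e.g. "ABCDEF+NotoSans" becomes "NotoSans".
        $present[] = preg_replace('/^[A-Z]{6}\+/', '', $baseFont);
    }

    return array_values(array_diff($requiredFonts, $present));
}

// Example with hand-written details instead of a parsed PDF.
$exampleDetails = [
    ['Name' => 'F1', 'BaseFont' => 'ABCDEF+NotoSans'],
    ['Name' => 'F2', 'BaseFont' => 'Helvetica'],
];

var_dump(missingFonts($exampleDetails, ['NotoSans', 'NotoSansThai']));
```

In a PHPUnit test this could be used as self::assertSame([], missingFonts($details, $required)). It only confirms that a font is present, not that the font contains a glyph for every character in the text, so it may still miss some tofu.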

Would anyone have a better approach to suggest?

8ctopus, Oct 10 '24 12:10