PdfBox-Android

Missing letters/text when converting some pdf files to images

Open kmichalak23 opened this issue 7 years ago • 4 comments

I was testing the conversion of various PDF files to images and found that some of the text is occasionally missing, seemingly at random, from the rendered image. These are fairly complex PDF files, which may be the cause, but it is still odd that only some of the text fails to appear.

I found the sample PDF files I have been testing with online. Here are two that reproduce the issue:

Mississauga_Advantages.pdf

produces:

mississauga_advantages

meanwhile:

pdf.pdf

produces:

pdf

I understand that this might be a deeper issue caused by the complexity of the files themselves; I just wanted to make it known.

Thanks

kmichalak23, Dec 16 '16 19:12

The glyphs display fine with the 1.8.13 desktop version. I remember similar problems (trouble with Type 1 fonts) with JDK 1.7; those bugs were gone with JDK 1.8. So I suspect that whatever runs on Android has the bugs that were in JDK 1.7. The PDFBox 2.0.* versions use their own code to render Type 1 fonts.

THausherr, Jan 26 '17 19:01

Thanks for the observation @THausherr. At least I know the problem isn't isolated to my machine.

kmichalak23, Feb 01 '17 13:02

No, this is definitely a problem with the library.

TomRoush, Feb 17 '17 22:02

Here is a reduced version of that file: PDFBOXAndroid-92-reduced.pdf. It should display "995". If the bug still occurs, "9 5" should appear instead.

THausherr, Feb 24 '19 08:02