PdfBox-Android
Missing letters/text when converting some pdf files to images
I was testing the conversion of various PDF files to images and found that some of the text is occasionally missing, seemingly at random, from the resulting image. These are fairly complex PDF files, which may be the reason, but it is still strange that only part of the text fails to appear in the conversion.
The sample PDF files I used for testing were obtained online. Here are two of the files that reproduce the issue:
produces:
meanwhile:
produces:
I understand that this might be a deeper issue caused by the complexity of the files themselves; I just wanted to make it known.
Thanks
The glyphs display fine with the 1.8.13 desktop version. I remember similar problems (trouble with Type 1 fonts) with JDK 1.7; those bugs were gone with JDK 1.8. I therefore suspect that whatever runs on Android has the same bugs that were in JDK 1.7. The PDFBox 2.0.* versions use their own code to render Type 1 fonts.
Thanks for the observation @THausherr. At least I know the problem isn't isolated to my machine.
No, this is definitely a problem with the library.
Here is a reduced version of that file. It should display "995". If the bug still occurs, "9 5" will appear instead. PDFBOXAndroid-92-reduced.pdf
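For anyone who wants to check the reduced file without eyeballing the output, here is a rough sketch of one way to do it. It assumes the page has already been rendered to a java.awt.image.BufferedImage on the desktop (on Android the same loop would have to be adapted to a Bitmap). GlyphClusterCounter and countInkClusters are hypothetical names, not part of PdfBox-Android or PDFBox. The idea is simply to count horizontally separated groups of dark pixels: a correct render of "995" would be expected to give three groups, while "9 5" would give two. It will miscount if glyphs touch or if the page contains other marks.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for this thread only; not part of PdfBox-Android or PDFBox.
final class GlyphClusterCounter {

    private GlyphClusterCounter() {
    }

    // Returns the number of maximal runs of adjacent columns that contain
    // at least one dark pixel. Each run is treated as one glyph-like cluster.
    static int countInkClusters(BufferedImage img) {
        int clusters = 0;
        boolean inCluster = false;
        for (int x = 0; x < img.getWidth(); x++) {
            boolean columnHasInk = false;
            for (int y = 0; y < img.getHeight() && !columnHasInk; y++) {
                columnHasInk = isDark(img.getRGB(x, y));
            }
            if (columnHasInk && !inCluster) {
                clusters++; // a new run of inked columns starts here
            }
            inCluster = columnHasInk;
        }
        return clusters;
    }

    // Treats mostly transparent pixels as background and uses a simple
    // luminance threshold (an arbitrary choice) for everything else.
    private static boolean isDark(int argb) {
        int alpha = (argb >>> 24) & 0xFF;
        if (alpha < 128) {
            return false;
        }
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        int luminance = (r * 299 + g * 587 + b * 114) / 1000;
        return luminance < 128;
    }
}
```

With the reduced file, a result of 2 instead of 3 would suggest the middle glyph is still being dropped.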