Ignore Form as well as Image XObjects when assembling the text array for a PDFObject.
Fix for #782
Thank you for your PR.
Is it still a work in progress?
If not, there are a few tasks left to solve before I take a closer look. Please read https://github.com/smalot/pdfparser/blob/master/CONTRIBUTING.md for more information.
Thanks for the reminder @k00ni. I've added test coverage for the change.
That change from "Imo" to "Im0" was just correcting a typo in the existing test. I didn't spot that I got that wrong when I wrote it.
I could revert that line and submit it as a separate PR if you like? I think keeping the new test coverage in the same method as the existing coverage makes sense, as they're testing the same bit of code.
Also, to clarify: when the command in the test data is "/Imo Do", the test passes, but for the wrong reason. We're checking for no result for that XObject, and we get no result because it can't find an object called Imo.
When the command is "/Im0 Do", we still get no result, but we're getting it for the right reason. The code finds the XObject, sees that it's an image and then decides not to include it in the text array.
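To illustrate the difference, here is a minimal Python sketch of the two cases. It is not the library's PHP code: the `XOBJECTS` table, the `resolve_do` helper, the `Fm0` entry and the reason strings are all hypothetical, made up only to show why an assertion of "no result" cannot tell the two cases apart on its own.

```python
import re

# Hypothetical page resources: XObject name -> subtype and any text it carries.
XOBJECTS = {
    "Im0": {"subtype": "Image", "text": None},
    "Fm0": {"subtype": "Form", "text": "form text"},
}

# Subtypes that should not contribute to the text array (assumed from the PR title).
SKIPPED_SUBTYPES = {"Image", "Form"}


def resolve_do(command, xobjects=XOBJECTS):
    """Return (text, reason) for a single '/Name Do' command."""
    match = re.fullmatch(r"/(\S+)\s+Do", command.strip())
    if match is None:
        return None, "not a Do command"

    xobject = xobjects.get(match.group(1))
    if xobject is None:
        # The "/Imo Do" case: nothing is returned because the lookup fails.
        return None, "xobject not found"

    if xobject["subtype"] in SKIPPED_SUBTYPES:
        # The "/Im0 Do" case: the object is found, then deliberately skipped.
        return None, "skipped " + xobject["subtype"]

    return xobject["text"], "included"


for cmd in ("/Imo Do", "/Im0 Do", "/Fm0 Do"):
    print(cmd, resolve_do(cmd))
```

Both "/Imo Do" and "/Im0 Do" give no text, but only the second one reaches the subtype check, which is the branch the test is meant to cover.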
Sorry for the delayed response.
I follow your arguments, and it looks good to me. The documentation provided in #782 was very helpful.
Thank you!