combine_pdf
PDF 1.5 Object streams found - they are not fully supported! attempting to extract objects.
I'm using gem 'combine_pdf', :git => 'https://github.com/boazsegev/combine_pdf.git'. When I call CombinePDF.load('file').pages[0], I get this warning: "PDF 1.5 Object streams found - they are not fully supported! attempting to extract objects." Is this caused by the new version (March 27)?
This shouldn't be a version related warning... it should come up only for certain PDF files where parsing might be only partially supported (due to compression and encryption concerns), warning about the possibility of parsing errors.
Did you test the same file with previous versions?
Do you experience an actual issue with the result?
I have come across PDFs that have object streams. PDFs with object streams are definitely PDF 1.5 or later, but not all PDF 1.5 documents have object streams in them.
I believe the object stream problem is an encryption problem. There are tools which could inflate a PDF with object streams into a PDF which doesn't use object streams. But this is not in CombinePDF yet.
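To make the point about versions versus object streams concrete, here is a rough sketch for checking a file before handing it to CombinePDF. The helper name pdf_object_stream_info is made up for this example and is not part of CombinePDF. It is only a heuristic: it reads the version from the header and scans the raw bytes for an object-stream dictionary (/Type /ObjStm), so it can miss cases where the header does not sit at the very start of the file or the dictionary is written in an unusual way.

```ruby
# Hypothetical helper (not part of CombinePDF): a heuristic, not a PDF parser.
# Reports the version declared in the header and whether the raw bytes
# appear to contain at least one object-stream dictionary.
def pdf_object_stream_info(data)
  bytes = data.b # binary copy, so matching never trips over text encodings

  # The header normally looks like "%PDF-1.5" at the very start of the file.
  version = bytes[/\A%PDF-(\d+\.\d+)/, 1]

  # Object streams are stream objects whose dictionary has /Type /ObjStm.
  has_object_streams = bytes.match?(%r{/Type\s*/ObjStm})

  { version: version, object_streams: has_object_streams }
end

# Example usage ('file.pdf' is a placeholder path):
#   info = pdf_object_stream_info(File.binread('file.pdf'))
#   warn 'object streams present' if info[:object_streams]
```

A PDF 1.5 file can come back with object_streams set to false, which matches the observation above that the warning depends on the file's contents and not on its declared version.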
@igbanam , thanks for adding your knowledge to the discussion.
Could you send me an example PDF that doesn't work? One where you experience data loss when opening it with CombinePDF?
I'll be happy to try and track down the issue and see what I can do about it.
I cannot release the PDF I found this out with (proprietary issues and all). But if I find another one that I can share, I will attach it here somehow.
@igbanam thank you.
When you find one you can send, feel free to email me instead of posting on this issue; that way it won't be posted publicly (if that matters).
B.
@igbanam , I have no idea if this might solve the issue, but I just released a version that includes improved support for Object Streams.
B.
Hi, I don't know if this helps, but I was seeing this warning with combine_pdf v1.03, and when I upgraded to v1.0.16 the warning went away. This was when parsing this UK government document: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/512167/LIT_6872.pdf In both cases the warning had no visible impact, i.e. the resulting PDF seemed okay.
Hello, today I stumbled on this issue as well, but with:
pdf = CombinePDF.parse Net::HTTP.get_response(URI.parse(url)).body
I solved it by adding the "allow_optional_content: true" argument to the parse call, like this:
pdf = CombinePDF.parse( Net::HTTP.get_response(URI.parse(url)).body, allow_optional_content: true)
I am now able to combine PDFs that use PDF 1.5 object streams.
Hope this helps!
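For anyone copying the one-liner above, here is a slightly more defensive sketch of the same idea. The method names fetch_pdf_body and parse_pdf are made up for this example; the only CombinePDF call used is CombinePDF.parse with the allow_optional_content option quoted above. The download step is split out so that a failed request raises an error instead of passing an error page to the parser. Note that Net::HTTP.get_response does not follow redirects, so a redirecting URL will raise here.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: download a PDF and return its raw bytes.
# Raises unless the server answers with a 2xx status.
def fetch_pdf_body(url)
  response = Net::HTTP.get_response(URI.parse(url))
  raise "HTTP #{response.code} for #{url}" unless response.is_a?(Net::HTTPSuccess)

  response.body
end

# Hypothetical helper: parse raw PDF bytes with the option from the comment above.
# `parser` defaults to CombinePDF, which is loaded only when actually needed.
def parse_pdf(body, parser: nil)
  unless parser
    require 'combine_pdf'
    parser = CombinePDF
  end
  parser.parse(body, allow_optional_content: true)
end

# Example usage (url is a placeholder for your own document):
#   pdf = parse_pdf(fetch_pdf_body(url))
```

Whether allow_optional_content also silences the object stream warning for other files is not something this sketch can promise; it only reproduces the call that worked in the comment above.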