
Did not find PDF object (1, 0)

Open manishhub9 opened this issue 7 years ago • 6 comments

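from pdfrw import PdfReader, PdfWriter
# `data` holds the PDF bytes; how it was obtained is not shown in this report.
with open('raw.pdf', 'wb') as pdf_file:
    pdf_file.write(data)
# Assumed: the report does not show how `writer` was created.
writer = PdfWriter()
writer.addpages(PdfReader('new.pdf').pages)
writer.write("_signed_manifest.pdf")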

The error I get:

[WARNING] tokens.py:221 Indirect object 5 0 obj found at incorrect offset 113236 (expected offset 113178) (line=810, col=1, token='4')
[WARNING] tokens.py:221 Indirect object 4 0 obj found at incorrect offset 113178 (expected offset 112991) (line=795, col=1, token='3')
[WARNING] tokens.py:221 Indirect object 3 0 obj found at incorrect offset 112991 (expected offset 112894) (line=788, col=1, token='2')
[WARNING] tokens.py:221 Indirect object 2 0 obj found at incorrect offset 112894 (expected offset 9) (line=2, col=1, token='1')
[WARNING] tokens.py:221 stream keyword terminated by \r without \n (line=791, col=1, token='stream')
[WARNING] tokens.py:221 Did not find PDF object (1, 0) (line=794, col=1, token='endobj')

manishhub9 avatar Oct 13 '17 11:10 manishhub9
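One pattern is visible in the warnings above: for objects 3 to 5, the "expected" offset equals the offset at which the previous object was actually found, and object 2 is expected at offset 9, where the parser saw the token '1'. That would be consistent with a cross-reference table shifted by one entry, though the log alone does not prove it. A minimal sketch that checks this from the log text follows; the regular expression is fitted to the messages quoted in this thread, and `parse_offsets` and `shifted_by_one` are hypothetical helper names, not part of pdfrw.

```python
import re

# First four warnings from the report above, copied verbatim.
LOG = """\
[WARNING] tokens.py:221 Indirect object 5 0 obj found at incorrect offset 113236 (expected offset 113178) (line=810, col=1, token='4')
[WARNING] tokens.py:221 Indirect object 4 0 obj found at incorrect offset 113178 (expected offset 112991) (line=795, col=1, token='3')
[WARNING] tokens.py:221 Indirect object 3 0 obj found at incorrect offset 112991 (expected offset 112894) (line=788, col=1, token='2')
[WARNING] tokens.py:221 Indirect object 2 0 obj found at incorrect offset 112894 (expected offset 9) (line=2, col=1, token='1')
"""

# Pattern fitted to the log text in this thread (an assumption, not a pdfrw API).
PATTERN = re.compile(
    r"Indirect object (\d+) (\d+) obj found at incorrect offset (\d+) "
    r"\(expected offset (\d+)\)"
)


def parse_offsets(log):
    """Return ({obj_num: found_offset}, {obj_num: expected_offset})."""
    found, expected = {}, {}
    for m in PATTERN.finditer(log):
        num = int(m.group(1))
        found[num] = int(m.group(3))
        expected[num] = int(m.group(4))
    return found, expected


def shifted_by_one(found, expected):
    """Object numbers whose expected offset is where the previous object was found."""
    return sorted(n for n in expected if found.get(n - 1) == expected[n])


found, expected = parse_offsets(LOG)
print(shifted_by_one(found, expected))  # [3, 4, 5]
```

Object 2 is not in the result only because the log contains no "found" offset for object 1.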

Probably a broken file.

pmaupin avatar Oct 27 '17 22:10 pmaupin

A workaround on Windows is to open the "broken" PDF and print it "as PDF" via Windows. The new, re-printed PDF works correctly.

I have the same pattern of error messages, and the fork from https://github.com/sarnold/pdfrw doesn't resolve it.

[WARNING] tokens.py:221 Indirect object 5 0 obj found at incorrect offset 430213 (expected offset 430155) (line=2918, col=1, token='4')
[WARNING] tokens.py:221 Indirect object 6 0 obj found at incorrect offset 430264 (expected offset 430213) (line=2925, col=1, token='5')
[WARNING] tokens.py:221 Indirect object 4 0 obj found at incorrect offset 430155 (expected offset 429968) (line=2903, col=1, token='3')
[WARNING] tokens.py:221 Indirect object 3 0 obj found at incorrect offset 429968 (expected offset 429871) (line=2896, col=1, token='2')
[WARNING] tokens.py:221 Indirect object 2 0 obj found at incorrect offset 429871 (expected offset 9) (line=2, col=1, token='1')
[WARNING] tokens.py:221 stream keyword terminated by \r without \n (line=2899, col=1, token='stream')
[WARNING] tokens.py:221 Did not find PDF object (1, 0) (line=2902, col=1, token='endobj')

stonebig avatar Nov 06 '21 18:11 stonebig

Anyway, maybe the PDF was generated in a "strange" way, but it's not broken, since Windows can open it. Better handling should be possible.

stonebig avatar Nov 06 '21 18:11 stonebig

If I open the "bad original" and the version "reprinted via Windows" in Notepad++ on Windows, the beginning of each file is interesting:

old bad:

%PDF-1.3
1 0 obj
<</Type /XObject /Subtype /Image /Name /Im1 /Width 1654 /Height 2338 /Length 429678/ColorSpace /DeviceRGB /BitsPerComponent 8 /Filter [ /DCTDecode ] >> stream
ÿØÿà

new good:

%PDF-1.7

4 0 obj
<<
/BitsPerComponent 8
/ColorSpace /DeviceRGB
/Filter /DCTDecode
/Height 52
/Length 6405
/Subtype /Image
/Type /XObject
/Width 1654
>>
stream
ÿØÿà

stonebig avatar Nov 06 '21 18:11 stonebig
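The `stream keyword terminated by \r without \n` warning in both logs can be checked without a text editor. As far as I know, ISO 32000-1 requires the `stream` keyword to be followed by CRLF or by a lone LF, not by a lone CR, so a file that does this is technically non-conforming even if viewers tolerate it. Below is a rough sketch that classifies what follows each `stream` keyword; `stream_terminators` is a hypothetical helper, and the sample bytes are made up for illustration, not taken from the real files.

```python
import re
from collections import Counter


def stream_terminators(data: bytes) -> Counter:
    """Count which end-of-line bytes follow each `stream` keyword."""
    counts = Counter()
    # (?<!end) skips `endstream`. Binary stream data that happens to contain
    # the bytes "stream" would give false hits, so treat this as a heuristic.
    for m in re.finditer(rb"(?<!end)stream(\r\n|\r|\n)?", data):
        eol = m.group(1)
        if eol == b"\r\n":
            counts["CRLF"] += 1
        elif eol == b"\n":
            counts["LF"] += 1
        elif eol == b"\r":
            counts["CR only"] += 1
        else:
            counts["none"] += 1
    return counts


# Synthetic samples that imitate the two layouts shown above.
bad = b"1 0 obj\r<< /Length 4 >> stream\r\xff\xd8\xff\xe0\rendstream\rendobj\r"
good = b"4 0 obj\n<<\n/Length 4\n>>\nstream\n\xff\xd8\xff\xe0\nendstream\nendobj\n"

print(dict(stream_terminators(bad)))   # {'CR only': 1}
print(dict(stream_terminators(good)))  # {'LF': 1}
```

To run it on a real file, pass `open('raw.pdf', 'rb').read()` instead of the samples.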

The "bad" version seems to use <CR> as the line terminator in its main parts, while a "normal" PDF apparently uses <LF>. A Mac thing?

old bad: [image]

new reprinted: [image]

stonebig avatar Nov 06 '21 18:11 stonebig
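The CR versus LF observation can also be confirmed in a few lines of Python instead of by eye. This is only a sketch: `eol_styles_before_first_stream` is a hypothetical helper, and it looks only at the bytes before the first `stream` keyword, because compressed image data contains incidental 0x0D and 0x0A bytes that would skew the count. As far as I know, PDF accepts CR, LF, or CRLF as end-of-line markers in general, so CR-only lines by themselves are unusual rather than invalid. The sample bytes are synthetic.

```python
import re
from collections import Counter

EOL_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def eol_styles_before_first_stream(data: bytes) -> Counter:
    """Count end-of-line styles in the bytes preceding the first `stream`."""
    idx = data.find(b"stream")
    head = data if idx == -1 else data[:idx]
    counts = Counter()
    # Alternation order matters: try CRLF first so it is not counted as CR + LF.
    for m in re.finditer(rb"\r\n|\r|\n", head):
        counts[EOL_NAMES[m.group()]] += 1
    return counts


# Synthetic headers imitating the "old bad" and "new good" files.
bad = b"%PDF-1.3\r1 0 obj\r<</Type /XObject>> stream\r\xff\xd8\r\n"
good = b"%PDF-1.7\n\n4 0 obj\n<<\n/Type /XObject\n>>\nstream\n\xff\xd8"

print(dict(eol_styles_before_first_stream(bad)))   # {'CR': 2}
print(dict(eol_styles_before_first_stream(good)))  # {'LF': 6}
```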