Circular references in the Page Tree cause PDF::Reader to crash with `SystemStackError`
Attachment: Pages-tree-refs.pdf (source)
Running the following script against the attached PDF raises the error shown in the trailing comment:
require "bundler/inline"
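gemfile do
  # An explicit source lets Bundler resolve the gem when it is not already installed locally.
  source "https://rubygems.org"
  gem "pdf-reader"
end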
PDF::Reader.new("Pages-tree-refs.pdf").pages
# /usr/local/bundle/gems/pdf-reader-2.12.0/lib/pdf/reader/reference.rb:65:in `hash': stack level too deep (SystemStackError)
This is caused by a circular reference among the Page Tree objects:
% ...
1 0 obj
<< /Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<< /Type /Pages
/Kids [6 0 R 3 0 R]
/Count 2
/MediaBox [0 0 595 842]
>>
endobj
3 0 obj
<< /Type /Pages
/Kids [4 0 R]
/Count 1
/MediaBox [0 0 595 842]
>>
endobj
4 0 obj
<< /Type /Pages
/Kids [5 0 R]
/Count 1
/MediaBox [0 0 595 842]
>>
endobj
5 0 obj
<< /Type /Pages
/Kids [3 0 R]
/Count 1
/MediaBox [0 0 595 842]
>>
endobj
% ...
Here, 2 0 R is the root of the Page Tree. It has two children: 6 0 R and the problematic 3 0 R:
3 0 R --> 4 0 R --> 5 0 R --> 3 0 R <-- the cycle restarts here.
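To illustrate one possible way to guard against this, here is a minimal sketch of a page tree walk that tracks the /Pages nodes on the current path and raises as soon as one of them is reached again. It is not PDF::Reader's actual implementation: `OBJECTS`, `collect_pages`, and `CircularPageTreeError` are hypothetical names, the objects are modeled as a plain hash keyed by object number, and object 6 (not shown in the excerpt above) is assumed to be a leaf /Page.

```ruby
require "set"

# Hypothetical in-memory model of the objects above: object number => dictionary.
# Object 6 is not part of the excerpt; it is assumed to be a leaf /Page.
OBJECTS = {
  2 => { Type: :Pages, Kids: [6, 3] },
  3 => { Type: :Pages, Kids: [4] },
  4 => { Type: :Pages, Kids: [5] },
  5 => { Type: :Pages, Kids: [3] },
  6 => { Type: :Page }
}.freeze

# Hypothetical error class used only by this sketch.
class CircularPageTreeError < StandardError; end

# Depth-first walk that returns the leaf page object numbers in document order.
# `ancestors` holds the /Pages nodes on the current path, so meeting one of
# them again means the tree loops back on itself.
def collect_pages(objects, ref, ancestors = Set.new)
  node = objects.fetch(ref)
  return [ref] if node[:Type] == :Page

  if ancestors.include?(ref)
    raise CircularPageTreeError, "page tree cycle detected at object #{ref}"
  end

  path = ancestors + [ref]
  node.fetch(:Kids, []).flat_map { |kid| collect_pages(objects, kid, path) }
end
```

With the objects above, `collect_pages(OBJECTS, 2)` raises `CircularPageTreeError` when the walk reaches 3 0 R for the second time, instead of recursing until the stack overflows. Whether a real fix should raise, or skip the offending child and continue, is a design decision left open here.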
I would like to take a shot at solving this. May I?
Context: I've been using PDF::Reader as a dependency of a gem created for my undergraduate thesis (https://github.com/tomascco/rubrik). As part of my research, I've tested PDF::Reader against some of the PDFs in the pdf.js repository (https://github.com/mozilla/pdf.js/tree/master/test/pdfs) and found some cases like this one.
I'd also like to give some feedback as someone who has used PDF::Reader as a dependency for a higher-level PDF interface.
Would these patches and suggestions be welcome? @yob
> Would these patches and suggestions be welcome?
absolutely!