pdf2json Possible to parse a subset of pdf?

Open zhirzh opened this issue 7 years ago • 1 comment

For huge PDFs with ~1000 pages, is it possible to begin parsing from a particular page/index? This would reduce parsing time significantly.

May 06 '17 12:05 zhirzh

I was going through the code and I think it might be possible. In https://github.com/modesty/pdf2json/blob/0433368198a2faa8600a854d96724719b4ba5ce0/lib/pdf.js#L334, a for loop iterates over the range [1, pagesCount]. What if a list of page indexes were passed in instead?

May 06 '17 12:05 zhirzh
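
To make the idea concrete, here is a rough sketch of what "passing some indexes" could look like. It is not pdf2json code: `selectPages`, `parseSelectedPages`, and the `parsePage` callback are hypothetical names made up for illustration, and the sketch ignores whatever asynchronous page loading the real loop does. It only shows how the fixed range [1, pagesCount] could be swapped for a validated list of page numbers.

```javascript
// Hypothetical helper, not part of pdf2json: turn a requested page list
// into the valid, sorted, de-duplicated page numbers to visit.
function selectPages(pagesCount, requested) {
  // No selection given: fall back to the full range [1, pagesCount].
  if (!Array.isArray(requested) || requested.length === 0) {
    return Array.from({ length: pagesCount }, (_, i) => i + 1);
  }
  // Drop anything that is not an integer page number inside the document.
  const valid = requested.filter(
    (n) => Number.isInteger(n) && n >= 1 && n <= pagesCount
  );
  return [...new Set(valid)].sort((a, b) => a - b);
}

// Stand-in for the per-page work done inside the existing loop;
// parsePage is an assumed callback, not a pdf2json function.
function parseSelectedPages(pagesCount, requested, parsePage) {
  const results = [];
  for (const pageNumber of selectPages(pagesCount, requested)) {
    results.push(parsePage(pageNumber));
  }
  return results;
}

// Example: only pages 998-1000 of a 1000-page document are visited.
const visited = parseSelectedPages(1000, [1000, 998, 999, 998, 2000], (n) => n);
console.log(visited); // [ 998, 999, 1000 ]
```

Falling back to the full range when no list is given would keep the current behaviour for existing callers, so the page list could be an optional argument.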
