Possible to parse a subset of pdf?

Open zhirzh opened this issue 7 years ago • 1 comments

For huge PDFs with ~1000 pages, is it possible to start parsing from a particular page/index, or to parse only a range of pages? This would reduce parsing time significantly.

zhirzh avatar May 06 '17 12:05 zhirzh

I was going through the code and I think it might be possible. In https://github.com/modesty/pdf2json/blob/0433368198a2faa8600a854d96724719b4ba5ce0/lib/pdf.js#L334, a for loop iterates over the range [1, pagesCount]. What if a list of page indexes were passed in instead, so that only those pages get parsed?
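To make the idea concrete, here is a rough sketch of how the page selection could work. It is not pdf2json code: `resolvePageNumbers`, `parsePages`, and the `parsePage` callback are hypothetical names I made up for illustration, and I am assuming pages are numbered from 1 as in the existing loop.

```javascript
// Hypothetical helper, not part of pdf2json: turn an optional list of
// requested 1-based page numbers into the pages the loop should visit.
function resolvePageNumbers(pagesCount, requested) {
  if (!Array.isArray(requested) || requested.length === 0) {
    // No subset given: fall back to the full range [1, pagesCount].
    return Array.from({ length: pagesCount }, (_, i) => i + 1);
  }
  // Keep only integers that fall inside the document.
  const valid = requested.filter(
    (n) => Number.isInteger(n) && n >= 1 && n <= pagesCount
  );
  // Drop duplicates and visit pages in ascending order.
  return [...new Set(valid)].sort((a, b) => a - b);
}

// Hypothetical stand-in for the existing loop: `parsePage` represents
// whatever per-page work is currently done for each page number.
function parsePages(pagesCount, requested, parsePage) {
  return resolvePageNumbers(pagesCount, requested).map((n) => parsePage(n));
}
```

With something like this, calling `parsePages(1000, [3, 998], parsePage)` would only touch two pages, while omitting the list keeps today's behaviour of visiting every page.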

zhirzh avatar May 06 '17 12:05 zhirzh