pdfjs
Cannot read property 'compressed' of undefined
We ran into a pdfjs (v2.4.5) problem: https://github.com/nbesli/pdf-merger-js/issues/42
The following code snippet...
const doc = new pdf.Document()
const src = await fs.readFile(path.join(FIXTURES_DIR, 'issue-42.pdf'))
const ext = new pdf.ExternalDocument(src)
doc.addPagesOf(ext)
const fileBuffer = await doc.asBuffer()
await fs.writeFile(path.join(TMP_DIR, 'Testfile_issue-42.pdf'), fileBuffer)
...results in this error:
TypeError: Cannot read property 'compressed' of undefined
at parseObject (node_modules/pdfjs/lib/object/reference.js:81:15)
at PDFReference.get [as object] (node_modules/pdfjs/lib/object/reference.js:15:17)
at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:68:35)
at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:84:18)
at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:75:16)
at ExternalDocument.write (node_modules/pdfjs/lib/external.js:62:14)
Please find the problematic PDF file attached: issue-42.pdf
Thanks for the report! I looked into it, and the cause seems to be that pdfjs
does not support hybrid-reference files. More specifically, support for the XRefStm
entry of the trailer is not yet implemented. While it successfully falls back to the normal xref table (instead of the xref stream), that table is missing the object with ID 46, which is therefore unknown and causes the error you've posted.
Possible solutions:
- Implement support for XRefStm
- Silently ignore missing objects (I am not sure if I'd like this solution though)
I don't have the time right now to implement it, but I'll keep it in the back of my mind.
Any suggested temporary fixes that we might be able to use to circumvent this error while the issue is waiting to be resolved?
Can you check if the PDF is hybrid reference or not? I'm currently having this problem, and I want to prevent the pdf merge if there's a way to check for that.
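One rough way to check, based on the explanation above that hybrid-reference files carry an XRefStm entry in their trailer: scan the raw bytes for that key before handing the file to pdfjs. This is only a heuristic sketch, not something pdfjs provides. `hasXRefStm` is a hypothetical helper name, and a match inside unrelated binary data could in principle give a false positive.

```javascript
// Hypothetical helper (not part of pdfjs): heuristically detect a
// hybrid-reference PDF by looking for an /XRefStm entry in the raw bytes.
function hasXRefStm(buffer) {
  // latin1 maps every byte to exactly one character, so binary stream
  // data in the file cannot break the decoding.
  const text = buffer.toString('latin1')
  // The key is followed by the byte offset of the xref stream,
  // e.g. "/XRefStm 12345".
  return /\/XRefStm\s+\d+/.test(text)
}
```

Calling it on the buffer returned by fs.readFile, before creating the pdf.ExternalDocument, would let you skip the merge (or route the file through another tool first) when it returns true.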
Hi everyone! I have a small workaround, but it will not suit everyone, since it requires node-pdftk:
import pdf from 'pdfjs';
import fs from 'fs';
import pdftk from 'node-pdftk';
const src = await pdftk.input('issue-42.pdf').output(); // rewrite the file with pdftk; resolves to a Buffer
const doc = new pdf.Document();
const ext = new pdf.ExternalDocument(src);
doc.addPagesOf(ext);
const fileBuffer = await doc.asBuffer();
fs.writeFileSync('Testfile_issue-42.pdf', fileBuffer);
It looks like node-pdftk extracts the xref table from the xref stream (which means the file will be larger), so pdfjs can work with it.
Testfile_issue-42.pdf looks the same after running the code above, but the links are now just plain text.
Running into this now.. Any actual fixes?