pdf2json
Unexpected error "stream must have data"
From time to time I get this error message: "An error occurred while parsing the PDF: stream must have data". Sometimes all the PDFs parse without error, and sometimes the error appears.
I have 2000 PDFs stored on disk. I loop over them one by one and want to extract the text.
import PDFParser from "pdf2json";

async function parsePdf(file: any): Promise<string> {
  return new Promise((resolve, reject) => {
    const pdfParser = new PDFParser(this, 1);
    // Register both handlers before starting the load so no event is missed.
    pdfParser.on("pdfParser_dataError", errData => {
      console.error(errData);
      reject(errData.parserError);
    });
    pdfParser.on("pdfParser_dataReady", () => resolve(pdfParser.getRawTextContent()));
    pdfParser.loadPDF(file.path);
  });
}
Can someone help me with this issue? What am I doing wrong? Thank you.
EDIT: If I pass a buffer to the parser directly, everything works as expected.
Well, for me it is the other way around: pdfParser.loadPDF works and using a buffer doesn't.
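For anyone who wants to try the buffer route, here is a minimal sketch of one way to do it. It reads the whole file first and refuses empty input before anything reaches the parser. Whether an empty or partial read is what actually triggers "stream must have data" is only an assumption on my part. The names `parsePdfFromBuffer` and `BufferParser` are made up for illustration and are not part of pdf2json.

```typescript
import { promises as fs } from "node:fs";

// Hypothetical parser signature: takes the PDF bytes, resolves with the text.
type BufferParser = (data: Buffer) => Promise<string>;

// Read the whole file first, fail fast on empty input, then parse the buffer.
async function parsePdfFromBuffer(path: string, parse: BufferParser): Promise<string> {
  const data = await fs.readFile(path);
  if (data.length === 0) {
    // Report an empty read explicitly instead of handing it to the parser.
    throw new Error(`Empty file, nothing to parse: ${path}`);
  }
  return parse(data);
}
```

With pdf2json, `parse` could be a small wrapper that calls `pdfParser.parseBuffer(data)` and resolves on `pdfParser_dataReady` or rejects on `pdfParser_dataError`, the same way the snippet above does for `loadPDF`.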
I get the same problem.
Any update on this issue?
@lourencomcviana same here.
Same here. If you pipe a stream into the parser and the connection suddenly breaks, the parser gets stuck and doesn't throw or exit.
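As a workaround for the stuck case, a generic timeout around the parse promise at least lets the calling loop move on. This is a rough sketch, not a fix in the library: `withTimeout` is a hypothetical helper name, and it does not cancel the underlying parse, it only rejects the promise the caller is waiting on.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time, drop the pending timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

Usage would look something like `await withTimeout(parsePdf(file), 30000)` inside the loop, with the timeout value picked to suit the size of your files.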