npm-pdfreader

Cannot read text-based PDF file content

Open bahadirarslan opened this issue 1 year ago • 4 comments

I am trying to read the contents of a PDF file. It contains tabular text data, and I tried the most basic example code to see how this package reads the contents.

But there is no error, and item.text never has a value.

const { PdfReader } = await import("pdfreader"); // I am using CommonJS, so I import the library dynamically.
// filePath was relative, so I tried to resolve it; maybe this was the cause.
new PdfReader().parseFileItems(path.resolve(__dirname, `../../${filePath}`), (err, item) => {
    if (err) console.error("error:", err);
    else if (!item) console.warn("end of buffer");
    else if (item.text) console.log(item.text);
});

bahadirarslan avatar Feb 23 '24 09:02 bahadirarslan

It's hard to help you without being able to reproduce the issue.

In order to help us do that, please share the PDF file.

adrienjoly avatar Feb 23 '24 10:02 adrienjoly

I am sorry, but I cannot share the PDF file because it contains personal information. If you can suggest a private way to send it to you, I will do so.

bahadirarslan avatar Feb 23 '24 20:02 bahadirarslan

Any update on this? I have the same problem.

murillocandioto avatar Apr 24 '24 12:04 murillocandioto

I have the same answer: please share a PDF file that we can use to reproduce the problem, so we can help.

adrienjoly avatar Apr 24 '24 14:04 adrienjoly
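
When the PDF itself cannot be shared, it may still help to report what the callback receives. The following is a minimal sketch, not part of pdfreader: `createItemTally` and the fields of its summary object are hypothetical names. It assumes the callback shape used in the snippet above (`err`, then `item`, with a falsy `item` at the end of the file) and assumes that page markers arrive as items with a `page` property, as the package README describes.

```javascript
// Hypothetical helper: builds a callback with the same (err, item) shape as
// the one passed to parseFileItems in the report, and tallies what it receives.
function createItemTally(onDone) {
  const tally = { pages: 0, textItems: 0, otherItems: 0, errors: 0 };
  return function onItem(err, item) {
    if (err) {
      tally.errors += 1;          // count errors instead of stopping
    } else if (!item) {
      onDone(tally);              // falsy item: end of file, report the summary
    } else if (item.page) {
      tally.pages += 1;           // assumption: page markers carry `page`
    } else if (item.text) {
      tally.textItems += 1;       // items with a non-empty `text` value
    } else {
      tally.otherItems += 1;      // anything else (metadata, empty text, ...)
    }
  };
}

// Usage with the snippet from the report (requires the pdfreader package;
// `pdfPath` is a placeholder for the resolved file path):
//   new PdfReader().parseFileItems(pdfPath, createItemTally(console.log));
```

A summary with one or more pages but zero text items would be consistent with a file that has no extractable text layer (for example, scanned pages), although that is only one possible explanation. Posting the summary does not expose the document's contents.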