
Converts binary PDF to JSON and text, for server-side PDF processing and command-line use.

Results 108 pdf2json issues

I am wondering why `interface Text` has no height value? How can I calculate it?

My example: { "x": 13.391, "y": 2.677, "text": "H" }, { "x": 13.787, "y": 2.677, "text": "E" }, { "x": 14.22, "y": 2.677, "text": "L" }, { "x": 14.58, "y":...
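The excerpt above shows one item per character, all sharing the same `y`. As a minimal sketch (not part of pdf2json), items of that shape can be grouped into lines by `y` and ordered by `x`. The function name `groupIntoLines` and the `yTolerance` parameter are hypothetical, and the `{ x, y, text }` shape is taken from the poster's sample rather than from the library's documented output.

```javascript
// Hypothetical helper: groups { x, y, text } items (shape taken from the
// excerpt above) into lines. Items whose y values differ by at most
// yTolerance are treated as belonging to the same line.
function groupIntoLines(items, yTolerance = 0.01) {
  // Sort top to bottom, then left to right, without mutating the input.
  const sorted = [...items].sort((a, b) => a.y - b.y || a.x - b.x);
  const lines = [];
  for (const item of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last.y - item.y) <= yTolerance) {
      last.items.push(item);
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }
  // Re-sort each line by x, since small y differences can disturb the order.
  return lines.map((line) => ({
    y: line.y,
    text: line.items
      .sort((a, b) => a.x - b.x)
      .map((i) => i.text)
      .join(""),
  }));
}
```

The difference between the `y` values of two consecutive lines gives the line pitch in the same units as the input. That is only a spacing estimate, not a glyph height.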

Hi, I am using pdf2json 2.0.0 to parse PDF contents into JSON. My sample PDF has a radio button with 2 options. As per the pdf2json documentation...

TypeError: infoDict.has is not a function
    at PDFDocument.get documentInfo [as documentInfo] (eval at (/home/usr/repos/Work1/node_modules/pdf2json/lib/pdf.js:64:1), :4638:24)
    at LocalPdfManager_ensure [as ensure] (eval at (/home/usr/repos/Work1/node_modules/pdf2json/lib/pdf.js:64:1), :32503:22)
    at LocalPdfManager.BasePdfManager_ensureModel [as ensureModel] (eval at (/home/usr/repos/Work1/node_modules/pdf2json/lib/pdf.js:64:1),...

This code parses two PDF files, converts them to raw text, removes the line starting with 'Generated', and then compares the two text files. This method is called more than once...
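The poster's code is not shown, so the following is only a sketch of the comparison step described above, working on raw text that has already been extracted. The function names `stripGeneratedLines` and `sameIgnoringGenerated` are hypothetical, and the sketch assumes every line beginning with 'Generated' should be dropped, not just the first one.

```javascript
// Hypothetical helper: removes every line that starts with "Generated".
// Assumes the raw text uses \n or \r\n line endings.
function stripGeneratedLines(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => !line.startsWith("Generated"))
    .join("\n");
}

// Hypothetical helper: compares two raw-text strings after stripping
// the "Generated" lines from both.
function sameIgnoringGenerated(textA, textB) {
  return stripGeneratedLines(textA) === stripGeneratedLines(textB);
}
```

Because both functions are pure and hold no state, calling them more than once is safe. Any problem with repeated calls would have to come from the parsing step, which this sketch does not cover.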

I have a performance question. If I have an array of buffers and want to use .parseBuffer() inside a for loop, is it OK to instantiate `pdfParser = new PDFParser();`...

Hi, I am using the getRawTextContent() method.

    let pdfParser = new PDFParser(this, 1);
    pdfParser.setPassword(`mypassword`);
    pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => {
        fs.writeFile("parsed.txt", pdfParser.getRawTextContent(), () => { });
    });

Now, if...

I try to parse a 300 page pdf and get the following content: XXXXXX ----------------Page (0) Break---------------- ----------------Page (1) Break---------------- ----------------Page (2) Break---------------- ----------------Page (3) Break---------------- This is my code:...
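To check whether output like the above holds any real text, the raw string can be split on the page-break markers. This is a minimal sketch, assuming the marker format matches what the excerpt shows (runs of dashes around `Page (N) Break`). The function name `summarizeRawText` is hypothetical and not part of pdf2json.

```javascript
// Hypothetical helper: counts page-break markers and the chunks between
// them that contain any non-whitespace text. The marker pattern is inferred
// from the excerpt above, not from the library's documentation.
function summarizeRawText(rawText) {
  const markers = rawText.match(/-{4,}Page \(\d+\) Break-{4,}/g) || [];
  const chunks = rawText.split(/-{4,}Page \(\d+\) Break-{4,}/);
  const nonEmptyChunks = chunks.filter((c) => c.trim().length > 0).length;
  return { pageBreaks: markers.length, nonEmptyChunks };
}
```

A result where `pageBreaks` is large but `nonEmptyChunks` is close to zero means the pages were visited but almost no text was extracted from them.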

I want to get picture data from a PDF. Right now I can only get text data with pdf2json. Is it possible to get picture data? Any ideas? Thank you!

For some reason, when I run pdf2json in my Electron app I get "Uncaught Error: No PDFJS.workerSrc specified". FYI: I tried setting workerSrc to pdf.worker.js but that won't solve it....