pdf2json

extracted data in the form of encoding format

Open duvemula opened this issue 7 years ago • 3 comments

Hi @modesty,

When I read data from a PDF file, the extracted output comes out in an encoded form instead of readable text. Can you please take a look? Attached: Periods.pdf

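Code snippet:

    var fs = require("fs");
    var PDFParser = require("pdf2json");

    // The second argument (1) asks the parser to keep raw text for getRawTextContent()
    var pdfParser = new PDFParser(this, 1);

    pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => {
        console.log(pdfParser.getRawTextContent());
        //console.log(pdfParser.getAllFieldsTypes());
        // fs.writeFile is asynchronous and needs a callback to report write errors
        fs.writeFile('' + filePath + '/Axioma Optimization 101.txt', pdfParser.getRawTextContent(), err => {
            if (err) console.error(err);
        });
    });

    pdfParser.loadPDF('' + filePath + '/Periods.pdf');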

Regards, Durga Prasad

duvemula, Jun 22 '17 17:06

What do you mean by "in the form of encoding format"? What is the expected output?

wanghaisheng, Jul 06 '17 12:07

Sorry for the delay, @wanghaisheng. My PDF file is an image PDF (the pages are images rather than text). Can we read data from an image PDF file?

Regards, Durga Prasad

duvemula, Oct 16 '17 17:10

@duvemula You can try ocrmypdf, a wonderful library.
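
As a rough sketch of how that suggestion could be combined with the snippet above: run the `ocrmypdf` command-line tool first so the image PDF gains a text layer, then pass the result to pdf2json. This assumes `ocrmypdf` is installed and on the PATH. The helper names `buildOcrArgs` and `ocrThenParse` and the output file name are hypothetical, made up for illustration, and this is not necessarily how the pdf2json author would approach it.

```javascript
const { execFile } = require("child_process");

// Build the argument list for the ocrmypdf CLI (hypothetical helper).
// Basic usage is: ocrmypdf [-l LANGUAGE] input.pdf output.pdf
function buildOcrArgs(inputPdf, outputPdf, language) {
    const args = [];
    if (language) {
        args.push("-l", language); // OCR language, e.g. "eng"
    }
    args.push(inputPdf, outputPdf);
    return args;
}

// Run OCR on the image PDF, then extract text from the OCR'd copy
// with pdf2json (hypothetical helper). callback(err, text) is called once.
function ocrThenParse(inputPdf, outputPdf, callback) {
    execFile("ocrmypdf", buildOcrArgs(inputPdf, outputPdf), err => {
        if (err) return callback(err);

        // Loaded here so pdf2json is only required once OCR has succeeded.
        const PDFParser = require("pdf2json");
        // The context argument is only used by the web service project, so null is passed.
        const pdfParser = new PDFParser(null, 1);

        pdfParser.on("pdfParser_dataError", errData => callback(errData.parserError));
        pdfParser.on("pdfParser_dataReady", () => callback(null, pdfParser.getRawTextContent()));
        pdfParser.loadPDF(outputPdf);
    });
}
```

It might be called as `ocrThenParse(filePath + '/Periods.pdf', filePath + '/Periods-ocr.pdf', (err, text) => { ... })`. How readable the extracted text is depends on the scan and on the OCR result.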

wanghaisheng, Oct 17 '17 14:10