pdf2json
How to overwrite library method
Hi,
I am using the getRawTextContent() method.
let pdfParser = new PDFParser(this, 1);
pdfParser.setPassword(mypassword);
pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
pdfParser.on("pdfParser_dataReady", pdfData => {
    // write the extracted raw text once parsing has finished
    fs.writeFile("parsed.txt", pdfParser.getRawTextContent(), () => {});
});
With the method as provided by the library, the words in each line come out without proper spacing. However, if I alter the method, it works fine.
E.g., in pdf.js, if I just change
prevText.str += textObj.str;
to
prevText.str += textObj.str + " ";
all my code works fine.
But I want to know the best way to override this function from my own code. This is the library method, with the line I change marked:
cls.prototype.getRawTextContent = function() {
    let retVal = "";
    if (!this.needRawText)
        return retVal;
    _.each(this.rawTextContents, function(textContent, index) {
        let prevText = null;
        _.each(textContent.bidiTexts, function(textObj, idx) {
            if (prevText) {
                if (Math.abs(textObj.y - prevText.y) <= 9) {
                    prevText.str += textObj.str;    // <-- the line I change
                }
                else {
                    retVal += prevText.str + "\r\n";
                    prevText = textObj;
                }
            }
            else {
                prevText = textObj;
            }
        });
        if (prevText) {
            retVal += prevText.str;
        }
        retVal += "\r\n----------------Page (" + index + ") Break----------------\r\n";
    });
    return retVal;
};
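One possible way to do this without editing the library files is to replace the method at runtime on the object that owns it. The sketch below is only an illustration, not pdf2json's own API. `joinRawText` and `overrideRawText` are hypothetical helper names. It assumes the target object exposes `needRawText` and `rawTextContents` exactly as in the method quoted above, which may differ between pdf2json versions. It also puts the separator between fragments rather than after each appended fragment, so it is a variation on the one-line change, not a copy of it.

```javascript
// Hypothetical helper: rebuilds the raw text the same way as the quoted
// library loop, but joins fragments on the same line with `sep`.
function joinRawText(rawTextContents, sep = " ", lineTolerance = 9) {
    let retVal = "";
    rawTextContents.forEach((textContent, index) => {
        let line = null;   // text accumulated for the current line
        let lineY = null;  // y of the fragment that started the current line
        textContent.bidiTexts.forEach(textObj => {
            if (line !== null && Math.abs(textObj.y - lineY) <= lineTolerance) {
                line += sep + textObj.str;          // same line: add separator
            } else {
                if (line !== null) retVal += line + "\r\n";
                line = textObj.str;                 // start a new line
                lineY = textObj.y;
            }
        });
        if (line !== null) retVal += line;
        retVal += "\r\n----------------Page (" + index + ") Break----------------\r\n";
    });
    return retVal;
}

// Hypothetical helper: overrides getRawTextContent on `target` only.
// Assumption: `target` has `needRawText` and `rawTextContents` properties.
function overrideRawText(target, sep = " ") {
    target.getRawTextContent = function () {
        return this.needRawText ? joinRawText(this.rawTextContents, sep) : "";
    };
    return target;
}

// Usage with a stand-in object shaped like the data the quoted method reads.
const stub = {
    needRawText: true,
    rawTextContents: [
        { bidiTexts: [{ str: "Hello", y: 10 }, { str: "World", y: 10 }, { str: "Next", y: 40 }] }
    ]
};
overrideRawText(stub);
console.log(JSON.stringify(stub.getRawTextContent()));
```

With pdf2json itself, you would pass whichever object actually holds `rawTextContents` (or its prototype). In the source quoted above that is an instance of `cls`, which may not be the `PDFParser` object you construct, so it is worth checking your installed version before relying on this. Unlike the library loop, this version does not mutate the `str` of the text objects.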
I have a very similar problem. Did you manage to solve yours, @mandys?