pdf2json

How to overwrite library method

Open · mandys opened this issue 3 years ago · 1 comment

Hi,

I am using the getRawTextContent() method.

let pdfParser = new PDFParser(this, 1);
pdfParser.setPassword(mypassword);
pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
pdfParser.on("pdfParser_dataReady", pdfData => {
    fs.writeFile("parsed.txt", pdfParser.getRawTextContent(), () => {
    });
});

With the implementation provided by the library, the words on each line are not separated by proper spacing. If I alter the method, the output is fine.

For example, in the library's pdf.js, if I just change

prevText.str += textObj.str;

to

prevText.str += textObj.str + " ";

then all my code works fine.

However, I would like to know the best way to override this function from my own code. For reference, this is the library method:

cls.prototype.getRawTextContent = function() {
    let retVal = "";
    if (!this.needRawText)
        return retVal;

    _.each(this.rawTextContents, function(textContent, index) {
        let prevText = null;
        _.each(textContent.bidiTexts, function(textObj, idx) {
            if (prevText) {
                if (Math.abs(textObj.y - prevText.y) <= 9) {
                    prevText.str += textObj.str; // <-- the line I change
                }
                else {
                    retVal += prevText.str + "\r\n";
                    prevText = textObj;
                }
            }
            else {
                prevText = textObj;
            }

        });
        if (prevText) {
	        retVal += prevText.str;
        }
        retVal += "\r\n----------------Page (" + index + ") Break----------------\r\n";
    });

    return retVal;
};
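
One possible way to do this without editing the library is to assign a replacement function onto the object at runtime. The following is only a minimal sketch: `getRawTextContentWithSpaces` and `fakeTarget` are hypothetical names, and it assumes the object that owns `needRawText` and `rawTextContents` is reachable from user code. In some versions of the library that object may be an internal one rather than the `PDFParser` instance, so check that before relying on it.

```javascript
// Sketch of a replacement for getRawTextContent that separates fragments
// with a space. It reads the same fields as the library method quoted above
// (needRawText, rawTextContents, bidiTexts, str, y) and does not need lodash.
function getRawTextContentWithSpaces() {
    let retVal = "";
    if (!this.needRawText)
        return retVal;

    this.rawTextContents.forEach((textContent, index) => {
        let lineStr = null; // text collected so far for the current line
        let lineY = 0;      // y of the first fragment on the current line
        textContent.bidiTexts.forEach((textObj) => {
            if (lineStr !== null && Math.abs(textObj.y - lineY) <= 9) {
                // Same line: append with a separating space.
                lineStr += " " + textObj.str;
            } else {
                // New line: flush the previous one, then start again.
                if (lineStr !== null)
                    retVal += lineStr + "\r\n";
                lineStr = textObj.str;
                lineY = textObj.y;
            }
        });
        if (lineStr !== null)
            retVal += lineStr;
        retVal += "\r\n----------------Page (" + index + ") Break----------------\r\n";
    });

    return retVal;
}

// Hypothetical stand-in for the library object, used only to show the override.
const fakeTarget = {
    needRawText: true,
    rawTextContents: [
        { bidiTexts: [{ str: "Hello", y: 10 }, { str: "world", y: 10 }, { str: "Next", y: 30 }] },
    ],
};

// Override on a single object; assigning to the prototype would affect all instances.
fakeTarget.getRawTextContent = getRawTextContentWithSpaces;
const demoOutput = fakeTarget.getRawTextContent();
console.log(demoOutput);
```

Unlike the one-line change above, this sketch puts the space before each appended fragment, so the first two fragments on a line are also separated. It also builds each line in a local variable instead of mutating `textObj.str`, so calling it twice gives the same result.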

mandys · Nov 12 '20 11:11