Iconv-lite warning: decode()-ing strings is deprecated.

Open arlogilbert opened this issue 6 years ago • 2 comments

Installed mercury-parser from NPM v2.0.0

Works great, thanks!

Every execution, however, emits the warning "Iconv-lite warning: decode()-ing strings is deprecated."

We are passing in strings per the documentation. This is fixable by converting the string to a buffer before passing it to Mercury using something along the lines of Buffer.from(htmlSource, 'utf8');

Expected behavior would be to either disallow sending strings to Mercury, convert strings automagically to buffers, or suppress the iconv-lite warning with iconv.skipDecodeWarning = true; per the iconv-lite author's recommendation.
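A minimal sketch of the "convert strings to buffers" option, done on the caller's side. The helper name toHtmlBuffer is hypothetical and not part of mercury-parser; it only assumes Node's built-in Buffer. Strings are encoded as UTF-8 by default, and values that are already Buffers pass through unchanged.

```javascript
// Hypothetical helper (not part of mercury-parser): normalize HTML input
// to a Buffer so that iconv-lite never receives a plain string to decode.
function toHtmlBuffer(html, encoding = 'utf8') {
  if (Buffer.isBuffer(html)) {
    return html; // already a Buffer, nothing to convert
  }
  if (typeof html === 'string') {
    return Buffer.from(html, encoding); // same conversion as suggested above
  }
  throw new TypeError('html must be a string or a Buffer');
}

// Usage, assuming Mercury is the imported mercury-parser module:
//   Mercury.parse(url, { html: toHtmlBuffer(htmlSource) });
//
// Alternative mentioned above: silence the warning instead of converting.
//   iconv.skipDecodeWarning = true;
```

Converting at the call site keeps the documented string-based workflow while avoiding the deprecated code path. Setting skipDecodeWarning only hides the message and would not be expected to change how the string is decoded.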

arlogilbert avatar Apr 05 '19 23:04 arlogilbert

Passing an HTML string to the parser, as per the documentation, causes incorrect parsing: some text is not properly decoded. Using a Buffer with UTF-8 encoding fixes the issue.

farmaan-appachhi avatar May 24 '19 20:05 farmaan-appachhi

I discovered the same issue. Thanks to the recommendation by @arlogilbert, I was able to get my code to work using the following format:

Mercury.parse(urlToParse, { html: Buffer.from(htmlText, 'utf8') });

It also appears that the word_count result is incorrect when parsing a string of HTML. I noticed that the result was always 1, no matter how long the text was.

knowmad avatar Jul 06 '19 03:07 knowmad