node-html-parser
text should only return human readable text
I noticed that HTMLElement.text also returns the text content of script and style tags. I would expect it to exclude those, in the same way that innerText should return only human-readable content.
textContent, on the other hand, can return all text content, even if it is not human-readable.
I had the same issue. If your use case never needs the content of script and style tags, the parser has options to ignore those tags from the start:
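const HTMLParser = require('node-html-parser');

// `text` is the HTML string to parse.
// Setting an entry to false tells the parser not to keep that element's text content.
const root = HTMLParser.parse(text, {
  comment: false,
  blockTextElements: {
    noscript: false,
    script: false,
    style: false,
    pre: false,
  },
});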
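If you still need the script and style content elsewhere and cannot drop it at parse time, another option is to collect the readable text yourself. The following is only a rough sketch of that idea. It runs on a hypothetical plain-object tree (element nodes with `tagName` and `childNodes`, text nodes with `text`) rather than on the library's own node classes, and the `readableText` helper and `SKIPPED_TAGS` set are names made up for this illustration.

```javascript
// Hypothetical node shape, for illustration only:
//   element node: { tagName: string, childNodes: Node[] }
//   text node:    { text: string }
// This is an assumption, not node-html-parser's internal model.
const SKIPPED_TAGS = new Set(['script', 'style', 'noscript']);

function readableText(node) {
  // Text node: return its text as-is.
  if (!node.childNodes) {
    return node.text || '';
  }
  // Element whose content is not meant for human readers: contribute nothing.
  if (node.tagName && SKIPPED_TAGS.has(node.tagName.toLowerCase())) {
    return '';
  }
  // Any other element: concatenate the readable text of its children.
  return node.childNodes.map(readableText).join('');
}

const tree = {
  tagName: 'div',
  childNodes: [
    { text: 'Hello ' },
    { tagName: 'script', childNodes: [{ text: 'var x = 1;' }] },
    { tagName: 'b', childNodes: [{ text: 'world' }] },
  ],
};

console.log(readableText(tree)); // prints "Hello world"
```

Adapting this to real parsed output would mean mapping these fields onto whatever the parser's nodes actually expose, so treat it as a starting point rather than a drop-in fix.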