node-html-parser

text should only return human readable text

Open ucarbehlul opened this issue 3 years ago • 1 comment

I noticed that HTMLElement.text also returns the text content of script and style tags. I would expect it to exclude those, since innerText should return only human-readable content.

textContent, on the other hand, can return all text content, even if it is not human-readable.
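
To illustrate the behavior described above, here is a minimal sketch of an innerText-like helper that skips script and style subtrees. The helper name visibleText is hypothetical, and the node shape it relies on (nodeType, childNodes, rawTagName, text) is an assumption meant to mirror node-html-parser's nodes; check it against your installed version. It runs on plain stand-in objects so it does not need the library.

```javascript
// Sketch only: the node shape (nodeType, childNodes, rawTagName, text) is an
// assumption meant to mirror node-html-parser's nodes.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Tags whose text is not meant for human readers.
const HIDDEN_TAGS = new Set(['script', 'style']);

// Hypothetical helper: concatenates text nodes, skipping hidden subtrees.
function visibleText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.text;
  }
  if (node.nodeType !== ELEMENT_NODE) {
    return ''; // comments and other node types contribute nothing
  }
  const tag = (node.rawTagName || '').toLowerCase();
  if (HIDDEN_TAGS.has(tag)) {
    return '';
  }
  return (node.childNodes || []).map((child) => visibleText(child)).join('');
}

// Plain objects standing in for a parsed tree, so the sketch runs on its own.
const sampleTree = {
  nodeType: ELEMENT_NODE,
  rawTagName: 'div',
  childNodes: [
    {
      nodeType: ELEMENT_NODE,
      rawTagName: 'style',
      childNodes: [{ nodeType: TEXT_NODE, text: 'p { color: red }' }],
    },
    {
      nodeType: ELEMENT_NODE,
      rawTagName: 'p',
      childNodes: [{ nodeType: TEXT_NODE, text: 'Hello' }],
    },
    {
      nodeType: ELEMENT_NODE,
      rawTagName: 'script',
      childNodes: [{ nodeType: TEXT_NODE, text: 'alert(1)' }],
    },
  ],
};

console.log(visibleText(sampleTree)); // prints "Hello"
```

Filtering at read time like this keeps the script and style content available in the tree for callers that still want it.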

ucarbehlul, Mar 21 '22 22:03

I had the same issue. If your use case never needs the content of script and style tags, the parser has options to ignore those tags from the start:

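const HTMLParser = require('node-html-parser');

// `text` is the HTML source string to parse.
// Setting an entry to false asks the parser to ignore that tag's text content.
const root = HTMLParser.parse(text, {
  comment: false,
  blockTextElements: {
    noscript: false,
    script: false,
    style: false,
    pre: false,
  },
});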

xileftenurb, May 29 '23 16:05