
Ignore html tags inside of <SCRIPT> tags

Open indexzero opened this issue 14 years ago • 4 comments

node-htmlparser fails when the following snippet is present in the HTML document:

http://gist.github.com/498560

Since the offending tag <SCR"+"IPT sits inside another <SCRIPT> tag, node-htmlparser should not try to parse it.
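For illustration, here is a minimal sketch of the behaviour being asked for: everything between an opening SCRIPT tag and the next literal `</script` is kept as raw text and never scanned for tags. The function name `splitScriptBodies`, the chunk shape, and the sample input are all hypothetical. None of this is node-htmlparser's API, and the sample is only modelled on the `<SCR"+"IPT` fragment above, not copied from the gist.

```javascript
// Sketch only: a hypothetical splitter, not node-htmlparser's implementation.
// It separates ordinary markup from script bodies, which are kept as raw text.
function splitScriptBodies(html) {
  const chunks = [];
  const openRe = /<script\b[^>]*>/gi; // opening tag, any letter case
  const closeRe = /<\/script/gi;      // start of the closing tag
  let pos = 0;
  let open;
  while ((open = openRe.exec(html)) !== null) {
    const bodyStart = open.index + open[0].length;
    // Look for the closing tag only after the opening tag ends.
    closeRe.lastIndex = bodyStart;
    const close = closeRe.exec(html);
    const bodyEnd = close === null ? html.length : close.index;
    if (open.index > pos) {
      chunks.push({ type: 'markup', text: html.slice(pos, open.index) });
    }
    chunks.push({ type: 'markup', text: open[0] });
    // The body is passed through untouched, whatever it contains.
    chunks.push({ type: 'raw', text: html.slice(bodyStart, bodyEnd) });
    pos = bodyEnd;
    openRe.lastIndex = bodyEnd; // resume the search after the script body
  }
  if (pos < html.length) {
    chunks.push({ type: 'markup', text: html.slice(pos) });
  }
  return chunks;
}

// Hypothetical input modelled on the fragment quoted in this issue.
const sample = '<p>hi</p><SCRIPT>document.write("<SCR"+"IPT></SCR"+"IPT>");</SCRIPT>';
const chunks = splitScriptBodies(sample);
console.log(chunks);
```

With this input the split tag name stays inside a single raw chunk, because `<SCR"+"IPT` never spells out a complete `<script` or `</script`.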

indexzero, Jul 31 '10 17:07

This issue is the same as Issue 2. The root of the problem is this line and its regex: Parser._reTags = /[<>]/g

That regex matches every < and > regardless of context, so htmlparser doesn't know this tag is in a string.
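A small demonstration of why a context-free bracket pattern trips here. Only the regex is taken from this thread. The input string and the helper `bracketOffsets` are hypothetical, and the loop is not the library's actual parsing code.

```javascript
// Illustration only: not the library's actual parsing loop.
const reTags = /[<>]/g; // same pattern as Parser._reTags quoted above

// Hypothetical input: a script that builds a tag name by string concatenation.
const html = '<script>document.write("<SCR"+"IPT></SCR"+"IPT>");</script>';

// Collect every position where the pattern matches, with no notion of context.
function bracketOffsets(input) {
  const offsets = [];
  reTags.lastIndex = 0; // the regex is global, so reset its state first
  let match;
  while ((match = reTags.exec(input)) !== null) {
    offsets.push({ char: match[0], index: match.index });
  }
  return offsets;
}

const offsets = bracketOffsets(html);
console.log(offsets.length, offsets);
```

The pattern finds eight brackets in this input, but only four of them (the opening and closing script tags) delimit real tags. The other four sit inside JavaScript string literals, which a scanner built on this pattern alone cannot tell apart.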

silentrob, Aug 03 '10 02:08

This would cause the same problem

silentrob, Aug 03 '10 02:08

I am adding these as test cases (amongst many others) as I plug away at v2.0 of the parser. If you have any other permutations, please do not hesitate to add them here.

tautologistics, Aug 10 '10 13:08

Updated tag

tautologistics, Oct 04 '10 13:10