node-htmlparser
Ignore html tags inside of <SCRIPT> tags
node-htmlparser fails when the following snippet is present in the html document.
http://gist.github.com/498560
Clearly, since the offending tag <SCR"+"IPT is inside another <SCRIPT> tag, node-htmlparser should not be trying to parse it.
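To make the expected behaviour concrete, here is a rough sketch of treating script bodies as raw text. It is not node-htmlparser code: `splitRawText` is a hypothetical helper, the sample input is made up to resemble the `<SCR"+"IPT` pattern, and matching the opening tag with `[^>]*` is a simplification that would break on attribute values containing `>`.

```javascript
// Hypothetical helper (not part of node-htmlparser): separates script bodies
// from ordinary markup so that a tag scanner never sees the script contents.
function splitRawText(html) {
  const segments = [];
  const openRe = /<script\b[^>]*>/gi; // simplified opening-tag match
  const closeRe = /<\/script/gi;      // a script body ends at the first "</script"
  let pos = 0;
  let open;
  while ((open = openRe.exec(html)) !== null) {
    const bodyStart = open.index + open[0].length;
    // Everything up to and including the opening <script> tag is markup.
    segments.push({ type: 'markup', text: html.slice(pos, bodyStart) });
    // Find the end of the raw-text body; fall back to end of input if unclosed.
    closeRe.lastIndex = bodyStart;
    const close = closeRe.exec(html);
    const bodyEnd = close === null ? html.length : close.index;
    segments.push({ type: 'script', text: html.slice(bodyStart, bodyEnd) });
    pos = bodyEnd;
    openRe.lastIndex = bodyEnd; // resume scanning after the script body
  }
  if (pos < html.length) {
    segments.push({ type: 'markup', text: html.slice(pos) });
  }
  return segments;
}

// Made-up input resembling the reported case.
const sample = '<p>a</p><SCRIPT>document.write("<SCR"+"IPT src=x></SCR"+"IPT>");</SCRIPT><p>b</p>';
for (const segment of splitRawText(sample)) {
  console.log(segment.type + ': ' + segment.text);
}
```

Only the `markup` segments would then be handed to the tag scanner; the `script` segment is kept as an opaque string, so the split `<SCR"+"IPT` inside it is never looked at.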
This issue is the same as Issue 2. The root of the problem is this line and regex: Parser._reTags = /[<>]/g
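A quick way to see why that regex is a problem, assuming it is used to locate tag boundaries: it matches every `<` and `>` with no notion of context. The input below is a made-up string in the spirit of the report (the gist itself is not reproduced here), and `findDelimiters` is just an illustrative helper.

```javascript
// The delimiter regex quoted above; everything else here is illustrative.
const reTags = /[<>]/g;

// Illustrative helper: lists every position the regex treats as a delimiter.
function findDelimiters(input) {
  const hits = [];
  reTags.lastIndex = 0; // reset, since a global regex keeps state between calls
  let match;
  while ((match = reTags.exec(input)) !== null) {
    hits.push({ index: match.index, char: match[0] });
  }
  return hits;
}

// Made-up input: a script that writes another script tag from string pieces.
const html = '<script>document.write("<scr" + "ipt src=\'a.js\'></scr" + "ipt>");</script>';

const hits = findDelimiters(html);
console.log(hits.length + ' delimiters found');
console.log(hits.map(function (h) { return h.char; }).join(' '));
```

Only four of those delimiters belong to the real `<script>` and `</script>` tags. The rest sit inside a JavaScript string literal, but a character-class regex has no way to tell the difference.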
htmlparser doesn't know this tag is in a string.
This would cause the same problem
I am adding these as test cases (amongst many others) as I plug away at v2.0 of the parser. If you find any other permutations, please do not hesitate to add them here.