node-html-parser
How can I parse a document whose closing tags are malformed?
Example:
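// nodeParse is assumed to be an alias for the package's parse function.
const { parse: nodeParse } = require('node-html-parser');

// The second <td>, <tr>, and <tbody> below are opening tags where closing tags were intended.
const doc = nodeParse(`
<html>
<body>
<table>
<tbody>
<tr>
<td>
<a href="#" class="anchor" >link</a>
<td>
<tr>
<tbody>
</table>
</body>
</html>`);
const anchor = doc.querySelector('.anchor');
console.log(anchor.parentNode.parentNode.parentNode); // Returns <html..., when <tbody... is expected.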
In other languages and with other packages I have no problem. I would also rather not put voidTag: { tags: ['area', 'base',...] } in the configuration, since I don't know in which tags the error will appear. Is there a way to do what I'm looking for? Thank you for your package.
Related issues: https://github.com/taoqf/node-html-parser/issues/152 and https://github.com/taoqf/node-html-parser/issues/231. There are too many issues about broken HTML, and I really have no time for this. PRs are welcome.
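Until the parser handles this itself, one possible workaround is to repair the markup before parsing it. The sketch below is only an illustration under stated assumptions: `repairClosingTags`, `NON_NESTING`, and `VOID` are hypothetical names, not part of node-html-parser, and the heuristic (a bare opening tag that repeats the innermost open tag is treated as a mistyped closing tag) is a guess that fits the example in this issue. It would also rewrite valid HTML that relies on implied end tags, such as `<td>a<td>b`, and it does not skip tags inside comments or scripts.

```javascript
// Hypothetical pre-pass, not part of node-html-parser.
// Tags assumed never to be nested directly inside themselves.
const NON_NESTING = new Set(['td', 'th', 'tr', 'tbody', 'thead', 'tfoot', 'table', 'a', 'p', 'li']);
// Void elements never stay open, so they are not tracked.
const VOID = new Set(['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'param', 'source', 'track', 'wbr']);

function repairClosingTags(html) {
  const stack = []; // names of currently open elements, innermost last
  return html.replace(/<(\/?)([a-zA-Z][a-zA-Z0-9-]*)([^<>]*)>/g, (match, slash, name, rest) => {
    const tag = name.toLowerCase();
    if (slash) {
      // Real closing tag: drop everything opened since the matching opener.
      const idx = stack.lastIndexOf(tag);
      if (idx !== -1) stack.length = idx;
      return match;
    }
    // Void or self-closing tags are left untouched.
    if (VOID.has(tag) || rest.trim().endsWith('/')) return match;
    const top = stack[stack.length - 1];
    if (top === tag && NON_NESTING.has(tag) && rest.trim() === '') {
      // Bare opener repeating the innermost open tag: assume a mistyped close.
      stack.pop();
      return `</${name}>`;
    }
    stack.push(tag);
    return match;
  });
}

const sample = '<table><tbody><tr><td>x<td><tr><tbody></table>';
console.log(repairClosingTags(sample));
// prints: <table><tbody><tr><td>x</td></tr></tbody></table>
```

The repaired string can then be passed to the parser in place of the original markup. Whether this is acceptable depends on the input: the more the documents rely on legitimately omitted end tags, the more false repairs the heuristic will make.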