node-html-parser

How can I parse a document with errors in the closing of tags?

Open jpolstre opened this issue 2 years ago • 1 comment

Example:

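// nodeParse is assumed to be the parse function exported by node-html-parser.
var nodeParse = require('node-html-parser').parse

// The second <td>, <tr>, and <tbody> below were meant to be closing tags.
// The last one (<tbody> instead of </tbody>) is the error close tag in question.
var doc = nodeParse(`
<html>
   <body>
     <table>
        <tbody>
            <tr>
              <td>
                 <a href="#" class="anchor" >link</a>
             <td>
          <tr>
          <tbody>
       </table>
    </body>
 </html>`)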

var anchor = doc.querySelector('.anchor')
console.log(anchor.parentNode.parentNode.parentNode) // Returns <html>..., but <tbody>... is expected.

In other languages and with other packages I have no problem. I also don't want to put voidTag: { tags: ['area', 'base', ...] } in the configuration, since I don't know in which tags the error will appear. Is there a way to do what I'm looking for? Thank you for your package.

jpolstre, Oct 18 '22 23:10

See https://github.com/taoqf/node-html-parser/issues/152 and https://github.com/taoqf/node-html-parser/issues/231. There are too many issues about broken HTML, and I really have no time for this. PRs are welcome.

taoqf, Aug 17 '23 03:08
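
Since the thread leaves open how to find out which tags are affected, here is a minimal sketch of one possible pre-check that lists tags opened but never closed, so the input could be inspected or repaired before parsing. It is not part of node-html-parser: the names findUnclosedTags and VOID_TAGS are hypothetical, and the regex-based scan is a rough heuristic that assumes no ">" inside attribute values and does not handle script or style content.

```javascript
// Hypothetical helper, not part of node-html-parser.
// Elements that never take a closing tag (HTML void elements).
const VOID_TAGS = new Set(['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr']);

// Returns [{ name, index }] for every start tag that was never closed.
function findUnclosedTags(html) {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)[^>]*?(\/?)>/g;
  const stack = [];    // start tags that are still open
  const unclosed = []; // start tags skipped over by a later closing tag
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[1] === '/';
    const name = match[2].toLowerCase();
    const isSelfClosing = match[3] === '/';
    if (VOID_TAGS.has(name) || isSelfClosing) continue;
    if (!isClosing) {
      stack.push({ name, index: match.index });
      continue;
    }
    // Find the nearest open element with the same name.
    const openAt = stack.map((entry) => entry.name).lastIndexOf(name);
    if (openAt === -1) continue; // stray closing tag, ignored in this sketch
    // Everything opened after that element was left unclosed.
    unclosed.push(...stack.splice(openAt).slice(1));
  }
  // Whatever is still open at the end of the input was never closed either.
  return unclosed.concat(stack);
}

// Usage with a shortened version of the markup from the issue.
const sample = `
<table>
  <tbody>
    <tr>
      <td><a href="#" class="anchor">link</a>
      <td>
    <tr>
  <tbody>
</table>`;
console.log(findUnclosedTags(sample).map((entry) => entry.name));
// → [ 'tbody', 'tr', 'td', 'td', 'tr', 'tbody' ]
```

The index field gives the character offset of each unclosed start tag, which may help locate the problem in the source. Deciding which of those tags were meant to be closing tags still requires a judgment call or an error-tolerant parser.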