Table parsing issue: <tr> nested inside <tr>
While working with HTML generated by Microsoft Outlook, I found a case that the DOMDocument-based parser handles without problems but the HTML5 parser does not.
The minimal test case input is this:
<table id="t1">
<tr>
<td>
<table id="t2">
<tr>
<tr>
<td></td>
</tr>
</tr>
</table>
</td>
</tr>
<tr><td></td></tr>
</table>
Note the <tr> element as a child of another <tr>. This causes the HTML5 parser to output:
<table id="t1">
<tr>
<td>
<table id="t2">
<tr></tr>
<tr>
<td></td>
</tr>
</table>
</td>
</tr>
</table>
<tr><td></td></tr>
This is obviously invalid: the parent table is closed earlier than it should be, which leaves the next (here: last) <tr> element outside of the table.
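One way to catch this symptom in a regression test is to count <tr> start tags that appear while no <table> is open in the serialized output. The sketch below uses Python's standard html.parser purely as a tokenizer. StrayRowFinder and count_stray_rows are hypothetical names made up for illustration; they are not part of the HTML5 parser discussed here.

```python
from html.parser import HTMLParser


class StrayRowFinder(HTMLParser):
    """Counts <tr> start tags seen while no <table> is open."""

    def __init__(self):
        super().__init__()
        self.open_tables = 0   # number of currently open <table> elements
        self.stray_rows = 0    # <tr> start tags found outside any table

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.open_tables += 1
        elif tag == "tr" and self.open_tables == 0:
            self.stray_rows += 1

    def handle_endtag(self, tag):
        # Ignore a </table> that has no matching open table.
        if tag == "table" and self.open_tables > 0:
            self.open_tables -= 1


def count_stray_rows(html):
    """Return how many <tr> tags in `html` start outside of any table."""
    finder = StrayRowFinder()
    finder.feed(html)
    finder.close()
    return finder.stray_rows
```

Run against the output above, this should report one stray row; run against the original input it should report none.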
Reference: https://github.com/roundcube/roundcubemail/issues/7356
Since <tr> is not a valid child of <tr>, what would be the suggested solution here? What do browsers do?
Both Firefox and Chrome convert the t2 table to:
<table id="t2">
<tbody>
<tr></tr>
<tr>
<td></td>
</tr>
</tbody>
</table>
Sorry, I wasn't clear. The t2 table is the same as in the HTML5 parser output. The difference in browsers is that the outer table is not broken, i.e. the second row stays where it should be.
So, the issue here is not the content of the inner table, but that it affects the outer table.
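The browser result is consistent with the tree-construction rules in the HTML standard: a <tr> start tag seen while a row is open closes that row first, and an end tag is ignored when no matching element is in "table scope", where the nearest open <table> acts as a boundary. The stray </tr> in t2 therefore cannot reach the row of t1. Below is a deliberately simplified Python sketch of just those rules. It handles only table, tr and td tokens, implies tbody, and uses html.parser as a tokenizer; TableSketch, Node and build_tree are hypothetical names, and this is not how the HTML5 parser in question is implemented.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, attrs=()):
        self.name, self.attrs, self.children = name, dict(attrs), []


class TableSketch(HTMLParser):
    """Simplified tree builder; only table/tr/td tokens are handled."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]  # stack of open elements

    def in_table_scope(self, name):
        # Search from the innermost open element outwards. A <table> is a
        # scope boundary, so elements of an outer table are never matched.
        for node in reversed(self.stack):
            if node.name == name:
                return True
            if node.name == "table":
                return False
        return False

    def pop_until(self, name):
        # Pop open elements up to and including the nearest `name`.
        while self.stack.pop().name != name:
            pass

    def insert(self, name, attrs=()):
        node = Node(name, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_starttag(self, tag, attrs):
        if tag not in ("table", "tr", "td"):
            return  # out of scope for this sketch
        current = self.stack[-1].name
        if tag == "table":
            if current in ("table", "tbody", "tr"):
                self.pop_until("table")  # closes the open table first
            self.insert(tag, attrs)
            return
        if current == "#root":
            return  # row or cell outside any table: ignored
        if current == "td":
            self.pop_until("td")  # close the open cell
        if tag == "tr" and self.stack[-1].name == "tr":
            self.pop_until("tr")  # <tr> inside a row acts like </tr> first
        if self.stack[-1].name == "table":
            self.insert("tbody")  # implied <tbody>
        if tag == "td" and self.stack[-1].name == "tbody":
            self.insert("tr")  # implied <tr>
        self.insert(tag, attrs)

    def handle_endtag(self, tag):
        # A stray end tag (nothing matching in table scope) is ignored.
        if tag in ("table", "tr", "td") and self.in_table_scope(tag):
            self.pop_until(tag)


def build_tree(html):
    builder = TableSketch()
    builder.feed(html)
    builder.close()
    return builder.root
```

Feeding the minimal test input to build_tree should yield a single top-level table whose implied tbody holds both rows, with t2 containing an empty row followed by a row with one cell, as in the browsers.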
Ah, I see. Indeed, that <tr><td></td></tr> is wrong.