html2openxml
Table with multiple colspans and rowspans
Given the following HTML table:
<table border="1">
<thead>
<tr>
<th colspan="2" rowspan="2">Header 1</th>
<th colspan="2">Header 2</th>
<th colspan="2" rowspan="2">Header 3</th>
</tr>
<tr>
<th>Sub-header 2.1</th>
<th>Sub-header 2.2</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Data 1.1</td>
<td rowspan="2">Data 1.2</td>
<td rowspan="2">Data 2.1</td>
<td rowspan="2">Data 2.2</td>
<td>Data 3.1.1</td>
<td>Data 3.1.2</td>
</tr>
<tr>
<td>Data 3.2.1</td>
<td>Data 3.2.2</td>
</tr>
</tbody>
</table>
The HTML output renders as:
However the Word output is:
I have highlighted in red an empty cell which is added in the wrong location.
The differences I have found in the Open XML are:
- The actual output has only 3 gridCol elements in tblGrid, whereas the expected has 6
- The actual output has a tc placed before Sub-header 2.1 whereas the expected has it placed after Sub-header 2.2
The issue is in ProcessClosingTableRow, but I'm not fully following the code.
In my use case, an emptyCell is added correctly at index 0, but on the second iteration the code drops into a do {} while loop which eventually inserts a new emptyCell after the first cell in the row (which is the emptyCell added on the previous iteration). The code doesn't work out that the emptyCell needs to be added after the Sub-header 2.2 cell.
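To make the expected placement concrete, here is a minimal sketch of one way to work out where the empty continuation cells belong. It is not the html2openxml implementation: `Cell`, `RowLayout` and `Resolve` are hypothetical names, and the approach (tracking, per grid column, how many more rows are still covered by a rowspan from above) is only an assumption about how the problem could be tackled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a cell as written in the HTML source.
public sealed class Cell
{
    public string Text { get; }
    public int ColSpan { get; }
    public int RowSpan { get; }

    public Cell(string text, int colSpan = 1, int rowSpan = 1)
    {
        Text = text;
        ColSpan = colSpan;
        RowSpan = rowSpan;
    }
}

public static class RowLayout
{
    // Returns the cell labels of every row in the order Word needs them.
    // "(merge xN)" marks an empty continuation cell spanning N grid columns.
    public static List<List<string>> Resolve(Cell[][] rows, int gridColumns)
    {
        var pending = new int[gridColumns]; // rows still covered from above, per column
        var spanAt = new int[gridColumns];  // width of the spanning cell that starts at a column
        var result = new List<List<string>>();

        foreach (var row in rows)
        {
            var line = new List<string>();
            int col = 0, next = 0;

            while (col < gridColumns)
            {
                if (pending[col] > 0)
                {
                    // Column is covered by a rowspan from a previous row:
                    // emit one continuation cell as wide as the original cell.
                    int width = Math.Max(1, spanAt[col]);
                    line.Add($"(merge x{width})");
                    int end = Math.Min(col + width, gridColumns);
                    for (int c = col; c < end; c++)
                    {
                        if (pending[c] > 0) pending[c]--;
                    }
                    col += width;
                }
                else if (next < row.Length)
                {
                    // Free column: place the next cell from the HTML row.
                    Cell cell = row[next++];
                    int span = Math.Max(1, cell.ColSpan);
                    line.Add(cell.Text);
                    if (cell.RowSpan > 1)
                    {
                        spanAt[col] = span;
                        int end = Math.Min(col + span, gridColumns);
                        for (int c = col; c < end; c++) pending[c] = cell.RowSpan - 1;
                    }
                    col += span;
                }
                else
                {
                    break; // the HTML row has fewer cells than the grid
                }
            }

            result.Add(line);
        }

        return result;
    }
}
```

Fed the two header rows of the table above with 6 grid columns, the second row resolves to `(merge x2)`, `Sub-header 2.1`, `Sub-header 2.2`, `(merge x2)`: one continuation cell at index 0 and one after Sub-header 2.2, which is the expected layout.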
Hi Paul, after investigation, part of the problem is in ProcessClosingTable when counting the GridSpan.
The problem is in this line:
row.ChildElements[i].GetFirstChild<GridSpan>()
It should instead be:
row.ChildElements[i].GetFirstChild<TableCellProperties>()?.GridSpan
(Note the use of TableCellProperties.) This resolves only the first difference, the 6 gridCol elements.
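The reason the lookup has to go through TableCellProperties is that in WordprocessingML the gridSpan element is a child of tcPr, not of tc itself, so asking the cell directly for a GridSpan child finds nothing. A minimal sketch of counting the grid columns of a row this way, assuming the DocumentFormat.OpenXml NuGet package is referenced; `GridSpanCounter` and `CountGridColumns` are hypothetical names, not part of html2openxml:

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

public static class GridSpanCounter
{
    // Sums the grid columns occupied by each cell of the row.
    // A cell without tcPr/gridSpan occupies exactly one column.
    public static int CountGridColumns(TableRow row) =>
        row.Elements<TableCell>()
           .Sum(cell => cell.TableCellProperties?.GridSpan?.Val?.Value ?? 1);
}
```

For the first header row of the table above (three cells with gridSpan 2 each) this gives 6, which matches the expected number of gridCol elements.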
Hi Paul and Olivier, I had a similar issue with tables. I compose my table as an HTML string and then pass that content to the converter to parse:
try
{
var paragraphs = converter.Parse(HeaderData + ContentData + FooterData);
return (List<OpenXmlCompositeElement>)paragraphs;
}
catch
{
return null;
}
and the input to converter.Parse is
<table width="100%" border="1">
<tr>
<td colspan="3" align="center" valign="middle"><font style="font-weight: bold;">SALDOS INICIALES</font></td>
</tr>
<tr>
<td valign="middle"><font style="font-weight: bold;">Denominación</font></td>
<td valign="middle"><font style="font-weight: bold;">Nº Cuenta</font></td>
<td valign="middle"><font style="font-weight: bold;">Saldo</font></td>
</tr>
<tr>
<td valign="middle">KUTXABANK (100 %)</td>
<td valign="middle">****-****-**-**********</td>
<td valign="middle" align="right">21.487,71 €</td>
</tr>
<tr>
<td valign="middle">Plazo Fijo KUTXABANK (100%)</td>
<td valign="middle">****-****-**-**********</td>
<td valign="middle" align="right">25.000,00 €</td>
</tr>
<tr>
<td colspan="2" valign="middle"><font style="font-weight: bold;">TOTAL</font></td>
<td valign="middle" align="right"><font style="font-weight: bold;">46.487,71 €</font></td>
</tr>
</table>
The output in the docx is
What am I doing wrong?
Hi Olivier, I found an issue related to colspans, based on my previous code. The valign attribute doesn't work properly with colspans. I removed it from my table and it works like a charm.
Hi everyone, did you find any workaround for this, or is the only way to work with raw Open XML, i.e. using DocumentFormat.OpenXml directly?
My specific case was fixed with 4aa22d6510d0c157110a538663eb35b916cd1a71 so from my point of view this can be closed.
Hi all, I have the same issue (same table layout as Paul's). I'm using the latest 2.3.0 release from NuGet, and it's not resolved yet.