StructEqTable-Deploy output format issue: LaTeX and HTML results are inconsistent?

Test image:

Commands:
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format latex
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format html

LaTeX output

\begin{tabular}{|l|l|l|l|}
\hline
\multirow{2}{*}{\textbf{名称}} & \multirow{2}{*}{\textbf{产量} (吨)} & \multicolumn{2}{c}{\textbf{环比}} \\
\cline{3-4}
 &  & \textbf{增长量} (吨) & \textbf{增长率} (\%) \\
\hline
荔枝 & 11 & 1 & 10\\
\hline
芒果 & 9 & --1 & --10\\
\hline
香蕉 & 6 & 1 & 20\\
\hline
\end{tabular}
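
As a rough sanity check on the LaTeX result, the sketch below counts the logical columns in each row of a `tabular` body, treating `\multicolumn{n}` as `n` columns. The helper name `latex_row_widths` is hypothetical and not part of StructEqTable-Deploy; it assumes cells contain no escaped `\&` and that `\multirow` cells always span one column.

```python
import re


def latex_row_widths(tabular_body):
    """Count the logical columns in each row of a LaTeX tabular body."""
    widths = []
    # Rows are terminated by a literal double backslash.
    for row in tabular_body.split(r"\\"):
        # Remove rule commands, which are not cells.
        row = re.sub(r"\\(hline|cline\{[^}]*\})", "", row).strip()
        if not row:
            continue
        width = 0
        for cell in row.split("&"):
            span = re.search(r"\\multicolumn\{(\d+)\}", cell)
            width += int(span.group(1)) if span else 1
        widths.append(width)
    return widths


body = r"""
\hline
\multirow{2}{*}{\textbf{名称}} & \multirow{2}{*}{\textbf{产量} (吨)} & \multicolumn{2}{c}{\textbf{环比}} \\
\cline{3-4}
 &  & \textbf{增长量} (吨) & \textbf{增长率} (\%) \\
\hline
荔枝 & 11 & 1 & 10\\
\hline
"""
print(latex_row_widths(body))  # [4, 4, 4]
```

Every row resolves to four columns, which matches the `{|l|l|l|l|}` column specification, so the LaTeX output is at least internally consistent.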

Converted to HTML with GPT-4o, it looks like this:

<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>

HTML output

<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>
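
One way to make the mismatch concrete is to expand every `colspan`/`rowspan` into a grid and compare the width of each row. The following is a minimal sketch using only Python's standard `html.parser`; the names `TableCells` and `row_widths` are hypothetical helpers for illustration, not part of StructEqTable-Deploy.

```python
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Collect (colspan, rowspan) for every <td>/<th>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            spans = (int(a.get("colspan") or 1), int(a.get("rowspan") or 1))
            self.rows[-1].append(spans)


def row_widths(html):
    """Return how many grid columns each row occupies after expanding spans."""
    parser = TableCells()
    parser.feed(html)
    occupied = set()  # (row, col) grid slots already taken
    widths = []
    for r, cells in enumerate(parser.rows):
        col = 0
        for colspan, rowspan in cells:
            # Skip slots filled by a rowspan from an earlier row.
            while (r, col) in occupied:
                col += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, col + dc))
            col += colspan
        widths.append(max((c for rr, c in occupied if rr == r), default=-1) + 1)
    return widths


html_output = r"""<table>
<tr><th colspan='2' rowspan='2'>名称</th><th colspan='2'>产量 (吨)</th><th colspan='2'>环比</th></tr>
<tr><td>增长量 (吨)</td><td>增长率 (\%)</td></tr>
<tr><td>荔枝</td><td>11</td><td>1</td><td>10</td></tr>
</table>"""
print(row_widths(html_output))  # [6, 4, 4]
```

The header row expands to six columns while the remaining rows have four, whereas the LaTeX table declares four columns (`{|l|l|l|l|}`). This suggests the span attributes in the HTML header are the part that is off.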

Visual comparison of the LaTeX result vs. the HTML result

Nov 12 '24 06:11 DayDreamerEric

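I just read the DocGenome paper, in which the "table structure recognition" task is defined as image2latex.

My initial guess: could this be caused by the format of the training data?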

Nov 12 '24 06:11 DayDreamerEric

Addendum: the Markdown output has a similar problem.

| 名称 | 产量 (吨) | 环比 |  | 
| --- | --- | --- | --- | 
|  |  | 增长量 (吨) | 增长率 (\%) | 
| 荔枝 | 11 | 1 | 10 | 
| 芒果 | 9 | -1 | -10 | 
| 香蕉 | 6 | 1 | 20 |

名称	产量 (吨)	环比
		增长量 (吨)	增长率 (%)
荔枝	11	1	10
芒果	9	-1	-10
香蕉	6	1	20

Nov 12 '24 10:11 DayDreamerEric

We have updated StructTable-InternVL2-1B with higher quality HTML and markdown SFT data to enhance the robustness and capabilities of table recognition in both HTML and markdown formats. We welcome you to try our latest model! Your feedback and valuable suggestions would be greatly appreciated.

Dec 12 '24 07:12 PrinceVictor