StructEqTable-Deploy

Output format issue: LaTeX and HTML results are inconsistent?

Open DayDreamerEric opened this issue 1 year ago • 3 comments

  • Test image: 213E8288-DEDB-497E-B208-5C5BC301867E

Commands:
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format latex
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format html

  • LaTeX output
\begin{tabular}{|l|l|l|l|}
\hline
\multirow{2}{*}{\textbf{名称}} & \multirow{2}{*}{\textbf{产量} (吨)} & \multicolumn{2}{c}{\textbf{环比}} \\
\cline{3-4}
 &  & \textbf{增长量} (吨) & \textbf{增长率} (\%) \\
\hline
荔枝 & 11 & 1 & 10\\
\hline
芒果 & 9 & --1 & --10\\
\hline
香蕉 & 6 & 1 & 20\\
\hline
\end{tabular}

The LaTeX output converted to HTML by GPT-4o:

<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>
  • HTML output
<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>
  • Visual comparison of the LaTeX result vs. the HTML result: (image)
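
The mismatch can also be checked without rendering. The LaTeX output declares four columns (`{|l|l|l|l|}`), so every HTML row should occupy four grid columns once `colspan` and `rowspan` are expanded. Below is a minimal sketch of such a check using only the Python standard library. The names `TableGrid` and `row_widths` are hypothetical helpers written for this illustration and are not part of StructEqTable, and the `expected` header is only one reading of what the LaTeX `\multirow`/`\multicolumn` structure implies.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect (colspan, rowspan) for every cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            self.rows[-1].append((int(a.get("colspan", 1)), int(a.get("rowspan", 1))))


def row_widths(html):
    """Return how many grid columns each row occupies after expanding spans."""
    parser = TableGrid()
    parser.feed(html)
    carried = {}  # column index -> rows (including the current one) still covered from above
    widths = []
    for cells in parser.rows:
        occupied = set(carried)
        next_carried = {c: n - 1 for c, n in carried.items() if n > 1}
        col = 0
        for colspan, rowspan in cells:
            while col in occupied:  # skip columns taken by a rowspan from above
                col += 1
            for c in range(col, col + colspan):
                occupied.add(c)
                if rowspan > 1:
                    next_carried[c] = rowspan - 1
            col += colspan
        widths.append(max(occupied) + 1 if occupied else 0)
        carried = next_carried
    return widths


# Header rows and first data row exactly as reported in the HTML output above.
reported = """
<table>
<tr><th colspan='2' rowspan='2'>名称</th><th colspan='2'>产量 (吨)</th><th colspan='2'>环比</th></tr>
<tr><td>增长量 (吨)</td><td>增长率 (\\%)</td></tr>
<tr><td>荔枝</td><td>11</td><td>1</td><td>10</td></tr>
</table>
"""

# Assumed header matching the LaTeX structure: two rowspan cells plus one colspan cell.
expected = """
<table>
<tr><th rowspan='2'>名称</th><th rowspan='2'>产量 (吨)</th><th colspan='2'>环比</th></tr>
<tr><td>增长量 (吨)</td><td>增长率 (\\%)</td></tr>
<tr><td>荔枝</td><td>11</td><td>1</td><td>10</td></tr>
</table>
"""

print(row_widths(reported))  # [6, 4, 4]
print(row_widths(expected))  # [4, 4, 4]
```

The reported header expands to six columns while every other row expands to four, which is consistent with the misaligned rendering. The sketch ignores overlapping spans and other malformed-table edge cases.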

DayDreamerEric, Nov 12 '24 06:11

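I just read the DocGenome paper, where the "table structure recognition" task is defined as image2latex.

My initial guess: could this be caused by the format of the training data?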

DayDreamerEric, Nov 12 '24 06:11

Addendum: the markdown output has a similar problem.

| 名称 | 产量 (吨) | 环比 |  | 
| --- | --- | --- | --- | 
|  |  | 增长量 (吨) | 增长率 (\%) | 
| 荔枝 | 11 | 1 | 10 | 
| 芒果 | 9 | -1 | -10 | 
| 香蕉 | 6 | 1 | 20 |

DayDreamerEric, Nov 12 '24 10:11

We have updated StructTable-InternVL2-1B with higher quality HTML and markdown SFT data to enhance the robustness and capabilities of table recognition in both HTML and markdown formats. We welcome you to try our latest model! Your feedback and valuable suggestions would be greatly appreciated.

PrinceVictor, Dec 12 '24 07:12