DAVAR-Lab-OCR
Generated table
The table I generated contains no thead or tbody tags. Are these tags strictly required? Without them, the following function breaks:

    def get_headbody(html_str):
        """Calculating number of bboxes belonging to "t-head" and "t-body" respectively

        Args:
            html_str(str): html representing table structure

        Returns:
            int: number of bboxes belonging to "t-head"
            int: number of bboxes belonging to "t-body"
        """
        # html_code = ''.join(html_str)
        # html_str = list('''<html><body><table>%s</table></body></html>''' % html_code)
        s_h, e_h = html_str.index('<thead>'), html_str.index('</thead>')
        s_b, e_b = html_str.index('<tbody>'), html_str.index('</tbody>')
        num_h = html_str[s_h + 1:e_h].count('</td>')
        num_b = html_str[s_b + 1:e_b].count('</td>')
        return num_h, num_b
This function fails during conversion.
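As quoted, the function does require both tag pairs: `index` raises `ValueError` when `<thead>` or `<tbody>` is absent, which would explain the failure. A possible workaround is sketched below. It is not part of DAVAR-Lab-OCR; the name `get_headbody_safe` is hypothetical, and the fallback rule (no head section means zero head cells, no body section means every remaining cell counts as body) is an assumption that may not match what the rest of the pipeline expects. It assumes `html_str` is either a list of tag tokens or a plain string.

```python
def get_headbody_safe(html_str):
    """Count '</td>' cells in the head and body sections, tolerating missing tags.

    Hypothetical variant of get_headbody, not the library's own code.
    Accepts a list of tag tokens or a plain string.
    """
    total = html_str.count('</td>')

    # Head cells: 0 when the table has no <thead>...</thead> section (assumed rule).
    num_h = 0
    if '<thead>' in html_str and '</thead>' in html_str:
        s_h, e_h = html_str.index('<thead>'), html_str.index('</thead>')
        num_h = html_str[s_h + 1:e_h].count('</td>')

    # Body cells: with no <tbody>...</tbody>, count everything outside the head (assumed rule).
    if '<tbody>' in html_str and '</tbody>' in html_str:
        s_b, e_b = html_str.index('<tbody>'), html_str.index('</tbody>')
        num_b = html_str[s_b + 1:e_b].count('</td>')
    else:
        num_b = total - num_h

    return num_h, num_b
```

For a table with two cells and no section tags, for example `['<tr>', '<td>', '</td>', '<td>', '</td>', '</tr>']`, this returns `(0, 2)` instead of raising an error.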