TableBank
Table Detection data mismatch in Word subset
I downloaded the TableBank dataset from your dataset homepage and checked it.
I found a mismatch in the annotations. The README gives the number of tables for the Table Detection task as follows:
| Task | Word | Latex | Word+Latex |
|---|---|---|---|
| Table detection | 163,417 | 253,817 | 417,234 |
However, when I ran my own script over the annotations, it found only 101,889 tables in the Word subset, which is 61,528 fewer than the 163,417 listed in the README.
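
A count like this can be reproduced with a short script. The sketch below is not the script used above. It assumes the detection annotations are COCO-style JSON files, each with a top-level `annotations` list holding one entry per table. The function names `count_tables` and `count_tables_in_files` are hypothetical, and the file paths are left for the caller to supply.

```python
import json


def count_tables(coco: dict) -> int:
    """Count table annotations in one COCO-style annotation dict.

    Assumption: each entry in the top-level "annotations" list is one table.
    """
    return len(coco.get("annotations", []))


def count_tables_in_files(paths) -> int:
    """Sum the annotation counts over several annotation JSON files."""
    total = 0
    for path in paths:
        with open(path, encoding="utf-8") as f:
            total += count_tables(json.load(f))
    return total
```

Calling `count_tables_in_files` on the Word subset's annotation files (for example, whatever train/val/test JSON files the download contains) gives one total to compare against the README figure. If the totals still differ, checking whether the README counts images or tables, or includes files missing from the download, would be a reasonable next step.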