unitable
Questions about joint training with html and bbox.
unitable uses two separate models to predict html tokens and bbox tokens. I want to know whether you have tested a single model that outputs html and bbox tokens at the same time, with the data structure as follows:
- html only output:
["<td>", "</td>"]
- bbox only output:
["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
- html+bbox output:
["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
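To make the proposed format concrete, here is a minimal sketch of how the joint sequence could be built from the two separate outputs. The function name `interleave` and the token conventions (a non-empty cell is `"<td>"` followed later by `"</td>"`, an empty cell is the single token `"<td></td>"` with no bbox) are my assumptions based on the examples above, not unitable's actual vocabulary or API. Cells with attributes such as `colspan` would need extra handling.

```python
from typing import List


def interleave(html_tokens: List[str], cell_bboxes: List[List[str]]) -> List[str]:
    """Insert each cell's bbox tokens right after its opening "<td>" token.

    Assumption (hypothetical convention): only "<td>" opens a non-empty cell;
    "<td></td>" is an empty cell and carries no bbox.
    """
    joint: List[str] = []
    cell_index = 0
    for tok in html_tokens:
        joint.append(tok)
        if tok == "<td>":
            if cell_index >= len(cell_bboxes):
                raise ValueError("fewer bboxes than non-empty cells")
            joint.extend(cell_bboxes[cell_index])
            cell_index += 1
    if cell_index != len(cell_bboxes):
        raise ValueError("more bboxes than non-empty cells")
    return joint


html_only = ["<td>", "</td>", "<td></td>"]
bbox_only = [["bbox-1", "bbox-2", "bbox-3", "bbox-4"]]
joint = interleave(html_only, bbox_only)
print(joint)
```

With this layout, each group of four bbox tokens always sits between the `"<td>"` and `"</td>"` of the cell it describes.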
The main points for doing this are:
- Inference efficiency could improve, since one model would replace two.
- Hopefully, the correspondence between html tags and bboxes would be more accurate, since each bbox is emitted inside the cell it belongs to.
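As a rough illustration of the second point, the sketch below splits a joint sequence back into html tokens and per-cell bboxes. The function name `split_joint` and the `"bbox-"` prefix check are assumptions taken from the example tokens in this issue, not from unitable's code. The point is that the cell-to-bbox pairing falls out of token position, with no separate matching step between two model outputs.

```python
from typing import List, Optional, Tuple


def split_joint(joint: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Split a joint html+bbox sequence into html tokens and per-cell bboxes.

    Assumption (hypothetical convention): bbox tokens start with "bbox-" and
    may only appear directly after a "<td>" token.
    """
    html: List[str] = []
    boxes: List[List[str]] = []
    current: Optional[List[str]] = None  # bbox list of the currently open cell
    for tok in joint:
        if tok.startswith("bbox-"):
            if current is None:
                raise ValueError(f"bbox token {tok!r} outside an open <td>")
            current.append(tok)
            continue
        html.append(tok)
        if tok == "<td>":
            current = []
            boxes.append(current)
        else:
            current = None  # any other html token closes the bbox slot
    return html, boxes


sequence = ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
html_part, bbox_part = split_joint(sequence)
print(html_part)
print(bbox_part)
```

Here the i-th entry of `bbox_part` belongs to the i-th `"<td>"` in `html_part` by construction.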
I would love to know your thoughts on this, thank you.