unitable Questions about joint training with html and bbox.

Open Sanster opened this issue 8 months ago • 2 comments

unitable uses two separate models to predict html tokens and bbox tokens. Have you tested a single model that outputs html and bbox tokens together in one sequence, with the data structure as follows:

html only output: ["<td>", "</td>"]
bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
html+bbox output: ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
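
To make the proposed format concrete, here is a minimal sketch of how the two token streams could be merged into one training target. It is not taken from the unitable codebase: the function name `merge_html_bbox` is hypothetical, and it assumes that a bare "<td>" opens a non-empty cell with exactly four bbox tokens, while "<td></td>" is a single token for an empty cell with no bbox.

```python
from typing import List, Sequence


def merge_html_bbox(
    html_tokens: Sequence[str],
    cell_bboxes: Sequence[Sequence[str]],
) -> List[str]:
    """Insert each cell's four bbox tokens right after its opening "<td>" token.

    Hypothetical helper, not part of unitable. `cell_bboxes` holds one
    4-token bbox per non-empty cell, in reading order.
    """
    merged: List[str] = []
    bbox_iter = iter(cell_bboxes)
    for token in html_tokens:
        merged.append(token)
        # Assumption: only a bare "<td>" marks a cell that has a bbox.
        if token == "<td>":
            bbox = next(bbox_iter, None)
            if bbox is None:
                raise ValueError("fewer bboxes than non-empty cells")
            if len(bbox) != 4:
                raise ValueError("each bbox must have exactly 4 tokens")
            merged.extend(bbox)
    if next(bbox_iter, None) is not None:
        raise ValueError("more bboxes than non-empty cells")
    return merged


html_tokens = ["<td>", "</td>", "<td></td>"]
cell_bboxes = [["bbox-1", "bbox-2", "bbox-3", "bbox-4"]]
merged = merge_html_bbox(html_tokens, cell_bboxes)
print(merged)
```

With the inputs above, the result matches the html+bbox output listed in this post.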

The main points for doing this are:

Inference efficiency could improve, since a single model would produce both outputs.
The correspondence between html tags and bboxes would hopefully be more accurate.
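
As a rough illustration of the second point, the sketch below decodes a merged sequence back into html tokens plus a list that ties each bbox to the exact "<td>" it followed, so no separate alignment step is needed. The function name `split_html_bbox` is hypothetical, and the code assumes the token layout from the example above (four tokens prefixed with "bbox-" directly after every bare "<td>").

```python
from typing import List, Sequence, Tuple

BBOX_LEN = 4  # assumption: four bbox tokens per non-empty cell


def split_html_bbox(
    merged: Sequence[str],
) -> Tuple[List[str], List[Tuple[int, List[str]]]]:
    """Split a merged sequence into html tokens and (tag index, bbox) pairs.

    Hypothetical helper, not part of unitable. The tag index is the
    position of the owning "<td>" inside the returned html token list.
    """
    html_tokens: List[str] = []
    cells: List[Tuple[int, List[str]]] = []
    i = 0
    while i < len(merged):
        token = merged[i]
        html_tokens.append(token)
        i += 1
        if token == "<td>":
            bbox = list(merged[i:i + BBOX_LEN])
            well_formed = len(bbox) == BBOX_LEN and all(
                t.startswith("bbox-") for t in bbox
            )
            if not well_formed:
                raise ValueError(f"malformed bbox after token {i - 1}")
            cells.append((len(html_tokens) - 1, bbox))
            i += BBOX_LEN  # skip the bbox tokens just consumed
    return html_tokens, cells


merged = ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
html_tokens, cells = split_html_bbox(merged)
print(html_tokens)
print(cells)
```

Because each bbox is read immediately after its tag, the pairing comes from the sequence order itself.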

I would love to know your thoughts on this, thank you.

Jun 19 '24 09:06 Sanster
