Questions about joint training with html and bbox.

Open Sanster opened this issue 8 months ago • 2 comments

unitable uses two separate models to predict HTML tokens and bbox tokens. Have you tested a single model that outputs HTML and bbox tokens in the same sequence? The output formats would look like this:

  • html only output: ["<td>", "</td>"]
  • bbox only output: ["bbox-1", "bbox-2", "bbox-3", "bbox-4"]
  • html+bbox output: ["<td>", "bbox-1", "bbox-2", "bbox-3", "bbox-4", "</td>", "<td></td>"]
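To make the proposed format concrete, here is a minimal sketch of how the joint sequence could be built from the two separate outputs and split back again. It is not unitable code: the function names `interleave` and `split_joint` are hypothetical, and it assumes that every non-empty cell opens with a literal "<td>" token, that empty cells are the single token "<td></td>" with no bbox, and that bbox tokens start with "bbox-".

```python
from typing import List, Tuple


def interleave(html_tokens: List[str], cell_bboxes: List[List[str]]) -> List[str]:
    """Insert each cell's four bbox tokens right after its opening "<td>".

    Assumption: cell_bboxes holds one group of bbox tokens per "<td>" token,
    in reading order. Empty cells ("<td></td>") carry no bbox.
    """
    joint: List[str] = []
    next_box = 0
    for tok in html_tokens:
        joint.append(tok)
        if tok == "<td>":
            if next_box >= len(cell_bboxes):
                raise ValueError("fewer bbox groups than <td> tokens")
            joint.extend(cell_bboxes[next_box])
            next_box += 1
    if next_box != len(cell_bboxes):
        raise ValueError("more bbox groups than <td> tokens")
    return joint


def split_joint(tokens: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Recover the HTML-only sequence and the per-cell bbox groups."""
    html: List[str] = []
    boxes: List[List[str]] = []
    for tok in tokens:
        if tok.startswith("bbox-"):
            if not boxes:
                raise ValueError("bbox token before any <td>")
            boxes[-1].append(tok)  # belongs to the most recent "<td>"
        else:
            html.append(tok)
            if tok == "<td>":
                boxes.append([])
    return html, boxes


html_only = ["<td>", "</td>", "<td></td>"]
bbox_only = [["bbox-1", "bbox-2", "bbox-3", "bbox-4"]]
joint = interleave(html_only, bbox_only)
print(joint)
```

In this layout the cell-to-bbox correspondence is carried by token position, so no separate matching step is needed after decoding.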

The main motivations for doing this are:

  1. Inference efficiency could improve, since one model would run instead of two.
  2. Hopefully, the correspondence between HTML tags and bboxes would be more accurate.

I would love to know your thoughts on this, thank you.

Sanster, Jun 19 '24 09:06