UniTable: fine-tuning the content section for other languages

UniTable is a powerful recognition tool, but I want to train table content recognition that supports other languages. Do you have any suggestions or opinions?

Jul 18 '24 10:07 num3num

I would suggest finetuning the OCR branch with the targeted language and UniTable should work out-of-the-box.

Jul 22 '24 14:07 ShengYun-Peng

In the bbox recognition stage, a single bbox may contain a large amount of text or gaps, which can lead to content loss or misalignment. Do you have any suggestions for this situation? Also, which model or method was used for pre-training or fine-tuning unitable_large_bbox.pt?

Jul 29 '24 07:07 num3num

I would suggest finetuning the OCR branch with the targeted language and UniTable should work out-of-the-box.

May I ask what you changed when fine-tuning? Did you change the tokenizer?

Feb 01 '25 13:02 Anananana1568