How to build a LiLT XLM-RoBERTa model with the LayoutLMv3 tokenizer

Open MattBlue92 opened this issue 8 months ago • 0 comments

As the title says, I want to understand how to create a LiLT XLM-RoBERTa model that uses the LayoutLMv3 tokenizer.

The SCUT-DLVCLab/lilt-roberta-en-base checkpoint uses a LayoutLMv3 tokenizer, but the SCUT-DLVCLab/lilt-infoxlm-base checkpoint doesn't use a RoBERTa tokenizer.

So I want to find out how to do that. I've already trained a LiLT model for Italian using the official code at https://github.com/jpWang/LiLT, but I used the RoBERTa tokenizer (Italian version), and I'm pretty sure that if I try to replace the RoBERTa tokenizer with the LayoutLMv3 tokenizer, the code will break.

Has anyone tried to do that?

MattBlue92 • Jun 25 '24 16:06