MIT PATEL

4 comments by MIT PATEL

Hey @nik13, check out https://github.com/huggingface/transformers/issues/19190. I made a few small changes to the inference part and it worked for me.

I have created notebooks for LayoutLM training and inference. They can handle a whole page image, because the input is split into chunks of 512 tokens. [Notebook](https://github.com/mit1280/Document-AI/blob/main/FineTuning_LayoutLMv3_Trainer_HF_DocLayNet.ipynb)
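To illustrate the 512-token splitting idea, here is a minimal plain-Python sketch. It is not taken from the notebook: the function name `split_into_windows`, the padding id `pad_id=1`, and the zero padding box are assumptions made for illustration, and special tokens and overlapping strides are ignored to keep it short.

```python
from typing import Dict, List


def split_into_windows(
    input_ids: List[int],
    bboxes: List[List[int]],
    max_len: int = 512,
    pad_id: int = 1,  # assumed padding id; check your tokenizer
) -> List[Dict[str, list]]:
    """Split one long page into consecutive windows of max_len tokens.

    Hypothetical helper: each window is padded to max_len so that
    all windows have the same length and can be batched later.
    """
    if len(input_ids) != len(bboxes):
        raise ValueError("input_ids and bboxes must have the same length")

    windows = []
    for start in range(0, len(input_ids), max_len):
        ids = input_ids[start:start + max_len]
        boxes = bboxes[start:start + max_len]
        n_pad = max_len - len(ids)
        windows.append({
            "input_ids": ids + [pad_id] * n_pad,
            "bbox": boxes + [[0, 0, 0, 0]] * n_pad,
            # 1 marks real tokens, 0 marks padding
            "attention_mask": [1] * len(ids) + [0] * n_pad,
        })
    return windows


# Example: a page with 1100 tokens becomes three windows of 512 tokens.
page_ids = list(range(1100))
page_boxes = [[0, 0, 10, 10]] * 1100
windows = split_into_windows(page_ids, page_boxes)
```

In practice the Hugging Face processor can produce overflowing windows for you; the sketch only shows why a page longer than 512 tokens ends up as several fixed-length inputs.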

Hi @nikhilKumarMarepally, please check https://github.com/mit1280/Document-AI/blob/main/LayoutLMv3_Inference.ipynb. You need to stack "input_ids", "attention_mask", and "bbox". Each of them is a list, so first convert the items to tensors and then stack them. This should resolve the issue.
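A minimal sketch of that conversion, assuming PyTorch and an encoding whose values are lists with one entry per 512-token window. The helper name `stack_encoding` and the toy data are hypothetical and only show the shape handling; the linked notebook may do this differently.

```python
import torch


def stack_encoding(encoding, keys=("input_ids", "attention_mask", "bbox")):
    """Convert each per-window list to a tensor, then stack into one batch.

    Hypothetical helper: assumes every window has the same length,
    which torch.stack requires.
    """
    batch = {}
    for key in keys:
        tensors = [torch.tensor(item, dtype=torch.long) for item in encoding[key]]
        batch[key] = torch.stack(tensors)  # adds a leading batch dimension
    return batch


# Toy encoding: 2 windows of 4 tokens each (real windows would have 512).
encoding = {
    "input_ids": [[0, 11, 12, 2], [0, 13, 14, 2]],
    "attention_mask": [[1, 1, 1, 1], [1, 1, 1, 1]],
    "bbox": [
        [[0, 0, 0, 0], [10, 10, 50, 20], [60, 10, 90, 20], [0, 0, 0, 0]],
        [[0, 0, 0, 0], [10, 30, 50, 40], [60, 30, 90, 40], [0, 0, 0, 0]],
    ],
}
batch = stack_encoding(encoding)
```

After stacking, `batch["input_ids"]` and `batch["attention_mask"]` have shape `(windows, tokens)` and `batch["bbox"]` has shape `(windows, tokens, 4)`, which is the layout the model expects for a batch.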

Hi @nikhilKumarMarepally, for LayoutLMv3 training you need the page text, the bounding box coordinates, the labels, and the image. If you have data like https://guillaumejaume.github.io/FUNSD/ then you can use LayoutLM; otherwise please...