How to train LayoutLMv3 at the character level

Open 413542484 opened this issue 3 years ago • 1 comment

Describe Model I am using (LayoutLMv3): The model is trained to predict one entity per bbox, but my entities are only parts of the text inside a bbox, and a single bbox can contain multiple entities. How should I deal with this situation?

413542484 · Jul 26 '22 12:07

Sounds like you should look at the transformers implementation of LayoutLMv3.

https://huggingface.co/docs/transformers/main/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForTokenClassification

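When doing token classification, the model predicts a label for each token, based on the token itself and the bbox supplied for it by the OCR step. Because labels are assigned per token rather than per bbox, tokens that share a bbox can carry different entity labels.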
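To make the per-token idea concrete for the case in the question, here is a minimal sketch of how training labels could be prepared when one bbox holds several entities. It assumes the annotations are character spans within the bbox text, that a BIO tagging scheme is used, and that every word simply reuses the bbox of its segment. The helper name `label_words` and the label names are hypothetical, not part of LayoutLMv3 or transformers.

```python
import re

def label_words(text, bbox, entities):
    """Split one OCR segment into words and give each word a BIO label.

    entities: list of (start_char, end_char, label) spans within `text`.
    (Hypothetical helper; the annotation format is an assumption.)
    """
    words, boxes, labels = [], [], []
    for match in re.finditer(r"\S+", text):
        start, end = match.span()
        tag = "O"
        for ent_start, ent_end, ent_label in entities:
            # A word belongs to an entity if their character ranges overlap.
            if start < ent_end and end > ent_start:
                # The word covering the entity start gets B-, later ones I-.
                tag = ("B-" if start <= ent_start else "I-") + ent_label
                break
        words.append(match.group())
        boxes.append(bbox)  # assumption: every word reuses the segment bbox
        labels.append(tag)
    return words, boxes, labels

# One bbox whose text contains two different entities.
text = "Invoice No 12345 Date 2022-07-26"
entities = [(11, 16, "INVOICE_ID"), (22, 32, "DATE")]
words, boxes, labels = label_words(text, [50, 40, 400, 60], entities)
```

The resulting word, box, and label lists have the same length, which is the shape a token classification setup generally expects before tokenization.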

You can then add a post-processing step that aggregates the predicted token labels into groups that suit your needs.
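One possible aggregation step is sketched below. It assumes the predictions use BIO tags and have already been reduced to one label per word (for example by keeping the label of each word's first subword token). The function name `aggregate_entities` and the label names are hypothetical; this is an illustration, not something provided by transformers.

```python
def aggregate_entities(tokens, labels):
    """Merge consecutive BIO-tagged tokens into (label, text) entities."""
    entities = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, labels):
        # "B-DATE" -> ("B", "DATE"); "O" -> ("O", "")
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current_label
        if not continues and current_tokens:
            # The running entity ended: store it and reset.
            entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
        if prefix in ("B", "I") and label:
            # Start a new entity or extend the running one. A stray I- tag
            # with no matching B- is leniently treated as a new entity.
            current_label = label
            current_tokens.append(token)
    if current_tokens:
        entities.append((current_label, " ".join(current_tokens)))
    return entities

tokens = ["Invoice", "No", "12345", "Date", "26", "July", "2022"]
labels = ["O", "O", "B-INVOICE_ID", "O", "B-DATE", "I-DATE", "I-DATE"]
print(aggregate_entities(tokens, labels))
# [('INVOICE_ID', '12345'), ('DATE', '26 July 2022')]
```

Joining tokens with a space is a simplification; if the original character offsets are available, slicing the source text by offsets would preserve the exact spacing.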

jordanparker6 · Jul 28 '22 11:07