Yupan Huang comments

Results: 50 comments by Yupan Huang

Inference code for V3 without hugging face.

It is hard to locate the cause of errors and debug without error stack traces. It would be helpful to provide more information about your running command, your inference task,...

Inference code for V3 without hugging face.

According to the error message (`attention_scores = attention_scores + attention_mask`, `RuntimeError: The size of tensor a (397) must match the size of tensor b (200) at non-singleton dimension 3`)...
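The mismatch reported in that error follows the usual broadcasting rule: shapes are aligned from the trailing dimension, and two sizes are compatible only if they are equal or one of them is 1. The sketch below is a plain-Python illustration of that rule, not code from the repository. The helper name `first_broadcast_conflict` and the two example shapes are assumptions chosen to reproduce the sizes 397 and 200 from the message.

```python
from itertools import zip_longest


def first_broadcast_conflict(shape_a, shape_b):
    """Return the index of the first dimension (scanning from the right)
    where the two shapes cannot be broadcast together, or None if they can."""
    ndim = max(len(shape_a), len(shape_b))
    # Align the shapes from the trailing dimension; missing leading dims count as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for offset, (a, b) in enumerate(pairs):
        if a != b and a != 1 and b != 1:
            return ndim - 1 - offset
    return None


# Hypothetical shapes: (batch, heads, query_len, key_len) for the scores,
# and a mask whose last dimension covers only 200 positions.
scores_shape = (1, 12, 397, 397)
mask_shape = (1, 1, 1, 200)

print(first_broadcast_conflict(scores_shape, mask_shape))  # 3
```

A check like this on the shapes printed just before the failing addition shows which input produced the shorter length.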

Unclear input data structure of layoutreader

Please follow [the instructions](https://github.com/microsoft/unilm/blob/master/layoutreader/README.md#run). Specifically, you can download the data as described in step 1 (`wget https://layoutlm.blob.core.windows.net/readingbank/dataset/ReadingBank.zip`), extract it (`unzip ReadingBank.zip`, which should give you the test data in `ReadingBank/test`), and...

Issue in Object Detection using LayoutLMV3

1. `ModuleNotFoundError: No module named 'layoutlmft'`: please try `pip install -e .`, following the [installation instructions](https://github.com/microsoft/unilm/tree/master/layoutlmv3#installation).
2. `Not Found for url...`: please manually download `model_final.pth` from `https://huggingface.co/HYPJUDY/layoutlmv3-base-finetuned-publaynet/` to your local...

Issue in Object Detection using LayoutLMV3

I am not sure how to "get directly the text representation besides the tensor information related to bounding boxes", but it is easy to get text segments/lines by OCR engines...

LayoutLMv3 | Index Error While Training On Custom Dataset

Hi, has your problem been solved? Have you run the example code to see if training the model on PubLayNet with GPUs works? I have not tried training with CPU...

LayoutLMv3: IndexError: index out of range in self on some inputs

This problem seems to be caused by some position_ids being larger than the embedding size. I suggest you find the exact sample causing this problem and analyze its minimum and...
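One way to locate the offending sample is to scan the dataset and report every sample whose sequence length or bounding-box coordinates fall outside the range the embedding tables can index. The following is a minimal sketch, not the repository's code. It assumes each sample is a dict with `input_ids` (a list of token ids) and `bbox` (a list of four-number boxes). The helper name `find_out_of_range_samples` and the limits `max_seq_len=512` and `max_coord=1000` are illustrative assumptions that should be replaced with the values from your model configuration.

```python
def find_out_of_range_samples(samples, max_seq_len, max_coord):
    """Return (index, seq_len, min_coord, max_coord) for every sample whose
    sequence length or box coordinates exceed the given limits."""
    problems = []
    for idx, sample in enumerate(samples):
        seq_len = len(sample["input_ids"])
        # Flatten all box coordinates so the minimum and maximum are easy to inspect.
        coords = [c for box in sample["bbox"] for c in box]
        lo, hi = (min(coords), max(coords)) if coords else (0, 0)
        if seq_len > max_seq_len or lo < 0 or hi > max_coord:
            problems.append((idx, seq_len, lo, hi))
    return problems


# Toy data: the second sample has a coordinate above the assumed limit.
samples = [
    {"input_ids": [0, 5, 6, 2], "bbox": [[0, 0, 0, 0], [10, 20, 30, 40],
                                         [50, 60, 70, 80], [0, 0, 0, 0]]},
    {"input_ids": [0, 7, 2], "bbox": [[0, 0, 0, 0], [10, 20, 1500, 40],
                                      [0, 0, 0, 0]]},
]

for idx, seq_len, lo, hi in find_out_of_range_samples(samples, max_seq_len=512, max_coord=1000):
    print(f"sample {idx}: seq_len={seq_len}, min_coord={lo}, max_coord={hi}")
```

Running the check before training avoids relying on the less informative `IndexError` raised inside the embedding lookup.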

LayoutLMv3 | Object Detection & Huggingface Transformers

Currently, LayoutLMv3 in Transformers does not support object detection ([see @NielsRogge's reply below](https://github.com/huggingface/transformers/pull/17060#issuecomment-1132626756)).

> unfortunately I'm (for now) not planning to add the object detection part, because the framework being...

LayoutLMv3 | Object Detection & Huggingface Transformers

`MODEL.IMAGE_ONLY: True` means only image (but not text) information is used. See also: https://github.com/microsoft/unilm/issues/813#issuecomment-1210045982

LayoutLMv3 | Object Detection & Huggingface Transformers

1. I am not aware of such models. I think that with proper design, the inclusion of both inputs might improve the results. You can try it if interested. 2....