Yupan Huang comments


LayoutLMv3 | Domain adaptation on the base model

You may refer to LayoutLMv3's paper and BEiT's code for image masking (the Masked Image Modeling objective).

LayoutLMv3 DocVQA: How was training kept stable for 100k, 200k steps (over 300 epochs!)

Thank you for the question and detailed description. There is no necessary relationship between evaluation metrics and evaluation loss (see [some explanations](https://datascience.stackexchange.com/questions/42599/what-is-the-relationship-between-the-accuracy-and-the-loss-in-deep-learning)). For example, when [fine-tuning FUNSD](https://huggingface.co/HYPJUDY/layoutlmv3-base-finetuned-funsd/tensorboard), the evaluation loss...
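As a toy illustration of how the two can diverge (the probabilities below are made up, not taken from the FUNSD run): accuracy only checks which side of the threshold a prediction falls on, while cross-entropy also penalizes how confident a wrong prediction is. A single very confident mistake can therefore raise the loss even as accuracy improves.

```python
import math


def binary_cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Probability the model assigned to the correct class.
        p_true = p if y == 1 else 1.0 - p
        total += -math.log(p_true)
    return total / len(labels)


def accuracy(probs, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == y for p, y in zip(probs, labels))
    return correct / len(labels)


labels = [1, 1, 0, 0]
# Hypothetical predicted probabilities of class 1 at two checkpoints.
early = [0.6, 0.4, 0.4, 0.6]        # hesitant model, 2 of 4 correct
late = [0.99, 0.99, 0.01, 0.999]    # confident model, 3 of 4 correct

for name, probs in (("early", early), ("late", late)):
    print(name, accuracy(probs, labels), round(binary_cross_entropy(probs, labels), 3))
```

The later checkpoint has the higher accuracy and also the higher loss, because its one remaining error is made with near-certainty.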

LayoutLMv3 DocVQA: How was training kept stable for 100k, 200k steps (over 300 epochs!)

I expect the performance to be worse with the less accurate OCR (the one provided). I do not have a quantitative comparison because I did not run this experiment.

LayoutLMv3 examples for CORD

Thank you for your interest! According to the error message, something seems to have gone wrong while loading the CORD dataset. Could you check whether the dataset was downloaded successfully? The code...

LayoutLMv3 document layout train error

It is hard to locate the cause of an error and debug it without a stack trace. Have you set `IMS_PER_BATCH` to a multiple of the number of GPUs? For example, if you...
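One way to catch this before launching training is a quick divisibility check. The sketch below is only an illustration and is not part of the LayoutLMv3 repository: `check_ims_per_batch` is a hypothetical helper, and it assumes the detectron2 convention that `IMS_PER_BATCH` is the total batch size across all GPUs.

```python
def check_ims_per_batch(ims_per_batch: int, num_gpus: int) -> int:
    """Return the per-GPU batch size, or raise if the batch cannot be split evenly.

    Hypothetical helper; assumes ims_per_batch is the total across all GPUs.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be a positive integer")
    if ims_per_batch <= 0 or ims_per_batch % num_gpus != 0:
        raise ValueError(
            f"IMS_PER_BATCH={ims_per_batch} is not a positive multiple "
            f"of the number of GPUs ({num_gpus})"
        )
    return ims_per_batch // num_gpus


# Example: 16 images per batch on 8 GPUs gives 2 images per GPU.
print(check_ims_per_batch(16, 8))
```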

why not use ocr result in Document Layout training

We have some explanations in the paper's Section 3.4:

> To demonstrate the generality of LayoutLMv3 from the multimodal domain to the visual domain, we transfer LayoutLMv3 to a document...

Is there a simple method can directly use layoutlmv3 and test some image samples?

Bounding-box (bbox) information is a required input for LayoutLM models, but you do not have to supply it manually: an OCR processor can extract it automatically. @NielsRogge provides two...
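Whichever OCR engine produces the boxes, they usually come back in pixel coordinates, while the LayoutLM-family models in Hugging Face Transformers expect `(x0, y0, x1, y1)` boxes scaled to a 0-1000 range. A minimal sketch of that scaling step follows; `normalize_bbox` and the sample values are illustrative and not taken from the repository.

```python
def normalize_bbox(bbox, width, height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range.

    `width` and `height` are the page image dimensions in pixels.
    """
    x0, y0, x1, y1 = bbox
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


# Illustrative word box on an 800 x 1000 pixel page.
print(normalize_bbox((80, 100, 400, 150), width=800, height=1000))
```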

Is there a simple method can directly use layoutlmv3 and test some image samples?

Many Spaces use the LayoutLMv3 models; you are welcome to explore them under [Spaces using microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base). I am closing this issue for now, since it is inactive.

LayoutLmv3 | ConnectionError: Couldn't reach https://raw.githubusercontent.com/huggingface/datasets/2.4.0/metrics/seqeval/seqeval.py

Have you solved the problem? It looks like a network issue. Can you reach the address on your server (e.g., through `wget https://raw.githubusercontent.com/huggingface/datasets/2.4.0/metrics/seqeval/seqeval.py`)?
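If `wget` is not available on the server, the same check can be run from Python with only the standard library. This is a minimal sketch; `can_reach` is a hypothetical helper name, and the 10-second default timeout is an arbitrary choice.

```python
import urllib.request


def can_reach(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, timeouts and refused connections;
        # ValueError covers malformed URLs.
        return False
```

Calling `can_reach("https://raw.githubusercontent.com/huggingface/datasets/2.4.0/metrics/seqeval/seqeval.py")` from the training machine returns `False` when the address cannot be reached from there, which would point to a network or proxy problem rather than a problem in the training code.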

LayoutLmv3 | ConnectionError: Couldn't reach https://raw.githubusercontent.com/huggingface/datasets/2.4.0/metrics/seqeval/seqeval.py

I am closing this issue for now, since it is inactive.