
Problem reproducing DocVQA results with StructuralLM

Open · Cppowboy opened this issue 3 years ago • 9 comments

I tried to fine-tune StructuralLM on the DocVQA dataset using the released weights, but I only get 76.85 ANLS on the test set. Can the DocVQA fine-tuning code be open-sourced?
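
For context, ANLS (Average Normalized Levenshtein Similarity) is the official DocVQA metric: each prediction scores 1 minus its normalized edit distance to the closest ground-truth answer, zeroed when the distance reaches 0.5, then averaged over questions. A minimal self-contained sketch of that computation:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def anls(predictions, gold_answers, threshold=0.5):
    """predictions: list[str]; gold_answers: list[list[str]], one list per question."""
    total = 0.0
    for pred, golds in zip(predictions, gold_answers):
        best = 0.0
        for gold in golds:
            # Per the challenge definition, comparison is case-insensitive.
            p, g = pred.strip().lower(), gold.strip().lower()
            nl = levenshtein(p, g) / max(len(p), len(g), 1)
            if nl < threshold:  # distances of 0.5 or more score 0
                best = max(best, 1.0 - nl)
        total += best
    return total / max(len(predictions), 1)
```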

Cppowboy · Jul 22 '21 08:07

Thanks for your attention. We used some of the same optimization techniques as LayoutLMv2; you can refer to that paper. In addition, starting from the StructuralLM model, we ran continued pre-training on the DocVQA data, mainly to attach 2D positions to the question tokens. This follows the method of the champion of the CVPR'20 challenge. We will consider open-sourcing this code and model in the future.
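
As an illustration of the idea only (not the released AliceMind code): in LayoutLM-style inputs the question tokens normally carry a dummy bounding box, so giving the question 2D positions can be sketched as copying the box of a matching OCR token from the page:

```python
# Hypothetical helper, not from the released repo: assign each question token
# a 2D box by exact-match lookup against the page's OCR tokens, falling back
# to the usual dummy box when there is no match.

def boxes_for_question(question_tokens, doc_tokens, doc_boxes,
                       dummy_box=(0, 0, 0, 0)):
    lookup = {}
    for tok, box in zip(doc_tokens, doc_boxes):
        lookup.setdefault(tok.lower(), box)  # keep the first occurrence
    return [lookup.get(tok.lower(), dummy_box) for tok in question_tokens]


# Example: "total" in the question inherits the box of "Total" on the page.
print(boxes_for_question(
    ["what", "is", "the", "total"],
    ["Invoice", "Total", "$12.30"],
    [(10, 10, 80, 20), (10, 40, 60, 50), (70, 40, 130, 50)],
))
# -> [(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (10, 40, 60, 50)]
```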

lcl6679292 · Jul 28 '21 06:07

I have the same problem as @Cppowboy: using the released weights, I cannot reach 83.94. Without any tricks, what ANLS can the model reach on DocVQA?

paulpaul91 · Aug 04 '21 03:08

Thanks for your attention. Using just the released weights, you can reach 78+ ANLS on the test set with some post-processing, which is commonly applied on this dataset. As mentioned above, we will consider open-sourcing the continued pre-training code and model in the future.
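
The specific post-processing is not spelled out here; as an illustration only, typical DocVQA-style answer cleanup looks something like:

```python
import string

# Sketch of common extractive-QA answer cleanup (the authors' exact steps are
# not specified): merge stray WordPiece fragments, collapse whitespace, and
# trim punctuation from both ends of the predicted span.

def postprocess(pred: str) -> str:
    pred = pred.replace(" ##", "")   # rejoin WordPiece sub-tokens
    pred = " ".join(pred.split())    # collapse repeated whitespace
    return pred.strip(string.punctuation + " ")
```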

lcl6679292 · Aug 04 '21 07:08

thanks

paulpaul91 · Aug 06 '21 09:08

Can you briefly describe the continued pre-training and the question generation (QG)? How much benefit does each bring?

paulpaul91 · Aug 06 '21 14:08

Continued pre-training on the DocVQA set (train and validation splits) brings about 2.0+ ANLS. QG brings about 2.4+ ANLS. In addition, merging the train and dev sets for fine-tuning brings another 1.8+ ANLS. Note that results on the test set are quite sensitive to the hyperparameters, which can cause a difference of 1+ ANLS.
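
As a rough sanity check, these gains are consistent with the reported numbers: starting from 78+ ANLS with the released weights plus post-processing, adding ~2.0 (continued pre-training), ~2.4 (QG), and ~1.8 (train+dev merge) lands near 84, close to the published 83.94, though the individual gains are unlikely to be perfectly additive.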

lcl6679292 · Aug 09 '21 03:08

How much data is used for the continued pre-training? How much data for QG?

paulpaul91 · Aug 09 '21 13:08

The continued pre-training uses the entire DocVQA dataset, and the QG data amounts to more than one million examples.
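
For readers unfamiliar with QG-style augmentation, a hypothetical sketch of the recipe (the authors' pipeline is not released; the checkpoint name and highlight format below are assumptions borrowed from public T5 question-generation models):

```python
from transformers import pipeline

# Illustrative checkpoint, not the one used by the authors.
qg = pipeline("text2text-generation", model="valhalla/t5-base-qg-hl")

def synthesize_qa(ocr_text: str, answer_span: str) -> dict:
    # Mark the chosen answer span so the generator knows what to ask about.
    highlighted = ocr_text.replace(answer_span, f"<hl> {answer_span} <hl>", 1)
    question = qg(f"generate question: {highlighted}")[0]["generated_text"]
    return {"question": question, "answer": answer_span}
```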

lcl6679292 · Aug 17 '21 03:08

May I ask whether the model's 83.94 result was obtained using 10-fold cross-validation?

paulpaul91 · Sep 29 '21 13:09