AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Has anyone reproduced the results on the CORD dataset using GeoLayoutLM? Could we discuss it?
Hi, I encountered this warning when running the code: The tokenizer class you load from this checkpoint is not the same type as the class this function is called from....
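This warning from Hugging Face transformers usually means the checkpoint's tokenizer_config.json names a different tokenizer class than the one being instantiated. A minimal sketch of a common workaround, assuming a locally downloaded checkpoint (the path below is a placeholder), is to load through AutoTokenizer so that transformers picks the class recorded in the checkpoint:

```python
# Hedged sketch, not the project's official loading code: AutoTokenizer
# dispatches to whichever tokenizer class the checkpoint itself declares,
# which avoids the class-mismatch warning.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/geolayoutlm_checkpoint")
print(type(tokenizer))  # shows which tokenizer class the checkpoint expects
```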
The dataset and model-weight links in the DocHieNet README point back to the page itself, and searching ModelScope for the keyword DocHieNet returns nothing. So,...
Thanks for the great work; this is super interesting and really useful! Just wondering, is the Omniparser v2 code already in the repo, or is it still on the way?...
I use the following command for multi-GPU distributed training: CUDA_VISIBLE_DEVICES=5,6 python -m torch.distributed.run \ main.py \ --data_root ./text_spotting_datasets/ \ --output_folder ./output/pretrain/stage1/ \ --train_dataset totaltext_train mlt_train ic13_train ic15_train syntext1_train syntext2_train \ --lr 0.0005 \ --max_steps 400000 \ --warmup_steps 5000...
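One detail worth checking in a launch like this: CUDA_VISIBLE_DEVICES restricts which GPUs are visible, but torch.distributed.run still needs to be told how many processes to spawn. A minimal sketch of a two-GPU launch, keeping the other arguments as quoted above:

```bash
# Hedged sketch of a typical two-GPU launch; --nproc_per_node is the standard
# torch.distributed.run flag for the number of processes per machine.
CUDA_VISIBLE_DEVICES=5,6 python -m torch.distributed.run --nproc_per_node=2 \
    main.py \
    --data_root ./text_spotting_datasets/ \
    --output_folder ./output/pretrain/stage1/
```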
Omniparser is very useful work, but while reproducing it I could not find the table-recognition part in either the training or the inference code. Is this part not yet open-sourced, or have I simply overlooked it?
I can run VGT to correctly identify tables in my PDFs, but I'm unable to figure out how to get their contents.
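Since VGT outputs layout boxes rather than cell text, the table contents have to come from a separate extraction step. A minimal sketch, assuming the PDF has a text layer and using PyMuPDF (the file name and bounding box below are placeholders standing in for a box returned by VGT):

```python
# Hedged sketch, not part of VGT itself: clip the detected table region out of
# the PDF page and collect the words that fall inside it.
import fitz  # PyMuPDF

doc = fitz.open("input.pdf")
page = doc[0]
table_bbox = fitz.Rect(72, 100, 520, 300)  # placeholder table box, in page coordinates
words = [w[4] for w in page.get_text("words") if fitz.Rect(w[:4]).intersects(table_bbox)]
print(" ".join(words))
```

For scanned PDFs with no text layer, the clipped region would instead need to go through an OCR or table-structure-recognition step.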
When recognizing the following files, the 申请单位 (applicant unit) field is lost; only these files have this issue.
Line 9, `from rapid_latex_ocr import LatexOCR`, should likely be replaced with `from rapid_latex_ocr import LaTeXOCR`, and line 34 should be updated as well to properly call `LaTeXOCR`.
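A minimal sketch of the change this issue proposes; the class name and call signature depend on the installed rapid_latex_ocr version, so treat everything below as the issue's suggestion rather than a confirmed API:

```python
# Hedged sketch of the proposed fix; LaTeXOCR and the (result, elapse) return
# shape are assumptions taken from the issue, not verified against the library.
from rapid_latex_ocr import LaTeXOCR

model = LaTeXOCR()
with open("formula.png", "rb") as f:      # placeholder image path
    latex_str, elapse = model(f.read())
print(latex_str)
```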
I trained the GeoLayoutLM model and noticed that the weights I trained are 5.0GB in size, while the pre-trained weights are only 1.7GB. The large size of my trained weights...
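A likely (but unconfirmed) explanation for the size gap is that the training checkpoint bundles optimizer and scheduler state alongside the model weights. A minimal sketch of stripping a saved checkpoint down to the weights alone; the file names and key names are assumptions, not GeoLayoutLM's actual checkpoint layout:

```python
# Hedged sketch: keep only the model state_dict from a full training checkpoint.
import torch

ckpt = torch.load("trained_geolayoutlm.pt", map_location="cpu")
state_dict = ckpt.get("model", ckpt.get("state_dict", ckpt))  # assumed key names
torch.save(state_dict, "trained_geolayoutlm_weights_only.pt")
```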