AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
```
python train_VGT.py --config-file Configs/cascade/D4LA_VGT_cascade_PTM.yaml --num-gpus 8 MODEL.WEIGHTS 'VGT_pretrain_model' OUTPUT_DIR 'D4LA_output'

Traceback (most recent call last):
  File "train_VGT.py", line 20, in <module>
    from detectron2.data import build_detection_train_loader
  File "/home/xia/anaconda3/envs/xia-vgt/lib/python3.8/site-packages/detectron2/data/__init__.py", line 4, in <module>
    from...
```
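Since the traceback is cut off at the detectron2 import, one quick way to isolate the problem (a minimal sketch, assuming the `xia-vgt` conda environment is active) is to run the failing import on its own, outside `train_VGT.py`:

```python
# If this import alone reproduces the same traceback, the problem lies in the
# detectron2 installation (for example a build made against a different
# torch/CUDA combination), not in the VGT training script.
import detectron2
from detectron2.data import build_detection_train_loader

print(detectron2.__version__)
```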
The `requirements.txt` in DocXChain is not sufficient to install all the necessary libraries. I also ran into a minor error and some inconvenience in usage, so I updated the files below:
* `requirements.txt`
* `example.py`
...
Hi! I was wondering about the 'SceneVTG-benchmark' mentioned in the paper—where can I find it? I noticed that the SCUT-EnsText dataset only contains erased images, and I couldn’t locate the...
Hi, I would like to know when the Platypus model weights will be released.
Hi, I want to fine-tune the OmniParser model on my custom key-value extraction dataset. For that I need the pretrained omniparser_stage2.pth model. If anyone has that...
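For reference, once the checkpoint is available, loading it for fine-tuning could start from a plain-PyTorch sketch like the one below; `build_omniparser_model` is a hypothetical placeholder, not the repo's actual API:

```python
import torch

# Hypothetical sketch: "build_omniparser_model" stands in for however the repo
# actually constructs the model; it is not a real function in this project.
# model = build_omniparser_model(config)

checkpoint = torch.load("omniparser_stage2.pth", map_location="cpu")
# Some checkpoints wrap the weights under a key such as "model" or "state_dict".
state_dict = checkpoint.get("model", checkpoint) if isinstance(checkpoint, dict) else checkpoint
# strict=False tolerates shape mismatches in a custom key-value extraction head.
# missing, unexpected = model.load_state_dict(state_dict, strict=False)
```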
[PubTabNet-Dataset](https://github.com/ibm-aur-nlp/PubTabNet). The data referred to in the LORE README and the labels provided there do not match: the filenames don't correspond. Could you provide the PubTabNet dataset that matches the labels, or a script that converts PubTabNet annotations to COCO-style labels?
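Lacking an official converter, a rough sketch of such a script might look like the following. It assumes the standard `PubTabNet_2.0.0.jsonl` layout (per-cell `bbox` fields as `[x0, y0, x1, y1]`, images under `<img_dir>/<split>/`) and only produces COCO cell boxes, not the logical-coordinate labels LORE also needs:

```python
import json
from pathlib import Path
from PIL import Image

def pubtabnet_to_coco(jsonl_path, img_dir, split="train", out_path="coco_train.json"):
    """Convert PubTabNet cell boxes to a COCO-style detection annotation file."""
    images, annotations = [], []
    ann_id = 0
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for img_id, line in enumerate(f):
            sample = json.loads(line)
            if sample["split"] != split:
                continue
            img_file = Path(img_dir) / split / sample["filename"]
            width, height = Image.open(img_file).size
            images.append({"id": img_id, "file_name": sample["filename"],
                           "width": width, "height": height})
            for cell in sample["html"]["cells"]:
                if "bbox" not in cell:  # empty cells carry no box
                    continue
                x0, y0, x1, y1 = cell["bbox"]
                annotations.append({"id": ann_id, "image_id": img_id,
                                    "category_id": 1, "iscrowd": 0,
                                    "bbox": [x0, y0, x1 - x0, y1 - y0],
                                    "area": (x1 - x0) * (y1 - y0)})
                ann_id += 1
    coco = {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "table cell"}]}
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(coco, f)

pubtabnet_to_coco("PubTabNet_2.0.0.jsonl", "pubtabnet", split="train")
```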
Hello! After reading your paper "LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition", I became interested in the handling of the Chinese dataset described in the latter part of the paper, but the currently open-sourced code seems to include only models trained on English text and digits. Could you share whether you have a Chinese-trained model and the corresponding training code?
First of all, thank you for making these models available, great work! I have tried several AI models that extract content from PDFs and identify each element's type, e.g.:
- text
- title
...
There's a bug in formula_recognition: "LatexOCR" should be "LaTeXOCR".