AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Has anyone worked on doing inference using GeoLayoutLM? If so, can you please share the code for the same or provide guidance on how to do it?
I have found that this library sometimes does not work very well, although most of the time the results are quite good. So I want to know if there is any solution that I can...
I downloaded two weights for fine-tuning VGT:
- [VGT-pretrain-model](https://github.com/AlibabaResearch/AdvancedLiterateMachinery/releases/download/v1.3.0-VGT-release/VGT_pretrain_model.pth): it looks like this is for the pre-trained GiT
- [dit_base_patch16_224](https://github.com/microsoft/unilm/tree/master/dit): it looks like this is for the ViT (DiT)

When fine-tuning VGT, where should I specify the...
as title, thanks!
I would like to ask how to run VGT inference on my own document images, that is, how to convert a PNG image into a grid file in pkl format.
Hi, thanks for the great work! I recently came across a paper, _OMNIPARSER: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition_, mentioning that the code is...
`save_pkl_file` throws an exception in the main function because it takes only four arguments but five are passed. This change removes the unnecessary `page` argument.
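A minimal sketch of this kind of failure, for readers who hit the same error. The function below is a hypothetical stand-in, not the repository's `save_pkl_file`: its parameter names, the grid contents, and the file naming are all assumptions made for illustration. It only shows that passing a fifth positional argument to a four-parameter function raises `TypeError`, and that dropping the extra argument fixes the call.

```python
import pickle
import tempfile
from pathlib import Path


# Hypothetical stand-in with four parameters; the real signature may differ.
def save_pkl_file(grid, output_dir, file_stem, suffix):
    """Pickle `grid` to <output_dir>/<file_stem><suffix> and return the path."""
    path = Path(output_dir) / f"{file_stem}{suffix}"
    with open(path, "wb") as f:
        pickle.dump(grid, f)
    return path


def demo():
    # Illustrative grid contents only; the real grid format is not shown here.
    grid = {"tokens": ["Hello", "world"], "bbox": [[0, 0, 10, 10], [10, 0, 20, 10]]}
    page = 0  # the extra argument described in the report

    with tempfile.TemporaryDirectory() as tmp:
        try:
            # Five positional arguments passed to a four-parameter function.
            save_pkl_file(grid, tmp, "doc", ".pkl", page)
            raised = False
        except TypeError:
            raised = True

        # Fixed call: the unnecessary `page` argument is removed.
        path = save_pkl_file(grid, tmp, "doc", ".pkl")
        with open(path, "rb") as f:
            loaded = pickle.load(f)

    return raised, loaded, grid
```

Calling `demo()` exercises both the failing call and the corrected one, and reads the pickle back to confirm the round trip.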
During the training process, at what level did the final loss stabilize? Can you publish your training log?
I want to provide a new file to the SER and RE models and have relations extracted from it. How to run the pretrained SER and RE models for...
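Sorry to take up your time. While reproducing LORE with the scripts and weight files provided by the authors, I can predict Cell Location accurately, but the Logical Location results differ considerably from those reported in the paper: the logical location acc is 0.23, while the paper reports 0.86. In the figure, values on a purple background are the predicted logical locations and values with no background are the ground truth; almost none of them match. How can I resolve this?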