latr

Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answering (STVQA)

Results 10 latr issues

I think it should be `if use_ocr == True: entries = apply_ocr(img_path) bounding_box = entries["bbox"] words = entries["words"] bounding_box = list(map(lambda x: resize_align_bbox(x, width_old, height_old, width, height), bounding_box))` the line: bounding_box...

Hi, in the `calculate_acc_score` function it seems you compute the evaluation using only the sum and average of accuracy_score in Python, but for TextVQA maybe you should calculate the evaluation with the below...
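The metric this issue goes on to propose is cut off above. For context, a minimal sketch of the soft accuracy commonly used for VQA-style benchmarks, where a prediction is scored against several human answers as `min(matches / 3, 1)`. The function names and the lowercase/strip normalization are assumptions for illustration, not code from the repository, and the official evaluation scripts apply further answer normalization.

```python
from typing import Sequence


def vqa_accuracy(prediction: str, human_answers: Sequence[str]) -> float:
    """Soft accuracy for one question: min(#matching human answers / 3, 1)."""
    pred = prediction.strip().lower()
    matches = sum(1 for ans in human_answers if ans.strip().lower() == pred)
    return min(matches / 3.0, 1.0)


def mean_vqa_accuracy(predictions: Sequence[str],
                      all_human_answers: Sequence[Sequence[str]]) -> float:
    """Average the per-question soft accuracy over a dataset."""
    if not predictions:
        return 0.0
    scores = [vqa_accuracy(p, answers)
              for p, answers in zip(predictions, all_human_answers)]
    return sum(scores) / len(scores)
```

Unlike exact-match accuracy, a prediction that agrees with only one or two annotators still receives partial credit.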

Hi Mr. @uakarsh, as you know I am working with your source code and I am trying to evaluate with ANLS as described in the article, and I want to do that...
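For readers unfamiliar with the metric, here is a minimal sketch of ANLS (Average Normalized Levenshtein Similarity) as it is usually defined for scene-text VQA: each prediction is scored by its best similarity to any ground-truth answer, and similarities whose normalized edit distance reaches the threshold (commonly 0.5) count as 0. The function names and the lowercase/strip normalization are assumptions for illustration; this is not code from the repository.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def anls(predictions, references, threshold=0.5):
    """Mean over questions of the best thresholded similarity to any answer."""
    scores = []
    for pred, answers in zip(predictions, references):
        pred = pred.strip().lower()
        best = 0.0
        for ans in answers:
            ans = ans.strip().lower()
            longest = max(len(pred), len(ans))
            nl = levenshtein(pred, ans) / longest if longest else 0.0
            # Similarity is 1 - NL, but only when NL is below the threshold.
            best = max(best, 1.0 - nl if nl < threshold else 0.0)
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0
```

Here `references` holds one list of acceptable answers per question, so `anls(["coca cola"], [["Coca Cola", "coke"]])` compares the prediction against both strings and keeps the better score.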

Thanks for your implementation. Have you tried TextVQA training without the layout-aware pre-training? Can you reproduce the results of the paper? E.g., LaTr-base achieves 44.06 on Rosetta-en and 52.29 on...

Hi, in the `apply_mask_on_token_bbox` method it looks like you are masking only the first token of the span and then lose the bbox of the whole span. Why not mask all tokens in the...
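To make the concern concrete, here is a minimal sketch of one way to mask a whole span while keeping its layout: the span is replaced by a single sentinel token whose box encloses every box in the span. This is not the repository's `apply_mask_on_token_bbox`; the function name `mask_span`, the `[x0, y0, x1, y1]` box format, and the T5-style sentinel string are all assumptions for illustration.

```python
def mask_span(tokens, bboxes, start, end, sentinel="<extra_id_0>"):
    """Replace tokens[start:end] with one sentinel carrying the span's union box.

    tokens: list of token strings.
    bboxes: list of [x0, y0, x1, y1] boxes, one per token (assumed format).
    """
    if len(tokens) != len(bboxes):
        raise ValueError("tokens and bboxes must have the same length")
    if not 0 <= start < end <= len(tokens):
        raise ValueError("span must satisfy 0 <= start < end <= len(tokens)")

    span_boxes = bboxes[start:end]
    # Smallest box that encloses every token box in the masked span.
    union = [
        min(box[0] for box in span_boxes),
        min(box[1] for box in span_boxes),
        max(box[2] for box in span_boxes),
        max(box[3] for box in span_boxes),
    ]
    new_tokens = tokens[:start] + [sentinel] + tokens[end:]
    new_bboxes = bboxes[:start] + [union] + bboxes[end:]
    return new_tokens, new_bboxes
```

With this approach the model still sees where the masked text sat on the page, even though the individual token boxes are dropped.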

Hi, when I try to train with your source code, around the 5th epoch (maybe fewer, maybe more) I encounter an error that stops training, so I increased max_step ....

Hi Mr. @uakarsh, I have been working with one of your source code files, named "LaTr TextVQA Training with WandB 💥.ipynb". Now I want to predict samples I use...

This thread contains the discussion of the implementation of LaTr with one of the authors of the paper. The earlier discussion with the first author is linked [here](https://github.com/uakarsh/latr/issues/2#issuecomment-1153231321)

Hi. Thanks for the great code. First of all, I am sorry if my questions are very simple and basic. While continuing to check your code I encounter...

Hello, there is a runtime error in the Hugging Face demo. Can you fix it? Thank you! https://huggingface.co/spaces/iakarshu/latr-vqa