VisionLAN

A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

Results 18 VisionLAN issues

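Hello, at https://github.com/wangyuxin87/VisionLAN/blob/main/train_LF_2.py#L74-L76 the parameters are split into groups by id. When I print them, the first parameter group is empty, i.e. no parameters match id_total. Is something wrong here?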
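For context, a common way to split parameters into groups by identity is to collect `id(p)` for each selected parameter into a set and then test membership. The sketch below shows that pattern with plain Python objects standing in for model parameters. It is an illustration under that assumption, not the code at the linked lines; `FakeParam` and `split_by_id` are hypothetical names.

```python
class FakeParam:
    """Stand-in for a model parameter (hypothetical; a real model would yield tensors)."""

    def __init__(self, name):
        self.name = name


def split_by_id(all_params, selected_params):
    """Split all_params into (selected, rest) by comparing per-object ids."""
    selected_ids = {id(p) for p in selected_params}  # one id per parameter object
    selected = [p for p in all_params if id(p) in selected_ids]
    rest = [p for p in all_params if id(p) not in selected_ids]
    return selected, rest


params = [FakeParam(n) for n in ("mlm.w", "mlm.b", "backbone.w")]
selected, rest = split_by_id(params, params[:2])
print([p.name for p in selected], [p.name for p in rest])
```

A group built this way is empty only when none of the collected ids match the id of any parameter object, so printing the id set next to the ids of the model's parameters is one way to check where the mismatch comes from.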

Thanks for your code. After running `CUDA_VISIBLE_DEVICES=0 python eval.py`, I get the following error: Traceback (most recent call last): File "eval.py", line 11, in <module> import cfgs.cfgs_eval as cfgs File "/workspace/VisionLAN-main/cfgs/cfgs_eval.py", line 5, in <module> from data.dataset_scene import * File "/workspace/VisionLAN-main/data/dataset_scene.py", line 16,...

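When encoding, the cha_encdec class in utils.py maps symbols that are not in the dictionary to len(self.dict)+1, which causes the cross-entropy loss function in the training program to raise an error.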
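The failure mode described in this issue can be illustrated without PyTorch. The sketch below uses a hypothetical encoder, not the repository's `cha_encdec`. It assumes that index 0 is reserved for an end token and that the classifier has `len(charset) + 1` output classes, so valid targets lie in `[0, len(charset)]`. Under those assumptions an unknown character encoded as `len(charset) + 1` falls outside the valid range, which is the kind of target a cross-entropy loss rejects.

```python
def encode_label(text, charset):
    """Map each character to a 1-based index; unknown characters get len(charset) + 1."""
    index_of = {ch: i + 1 for i, ch in enumerate(charset)}  # 0 is assumed reserved for the end token
    unknown_index = len(charset) + 1
    return [index_of.get(ch, unknown_index) for ch in text]


def out_of_range_targets(indices, num_classes):
    """Return the target indices that a num_classes-way classifier cannot represent."""
    return [i for i in indices if not 0 <= i < num_classes]


charset = "abc"                 # toy dictionary (assumption)
num_classes = len(charset) + 1  # assumed: end token plus one class per character
encoded = encode_label("ab?", charset)
invalid = out_of_range_targets(encoded, num_classes)
print(encoded, invalid)
```

Two possible remedies, offered as assumptions rather than the maintainers' fix, are to filter out samples containing unknown characters before training or to reserve an extra output class for them.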

![image](https://user-images.githubusercontent.com/93702453/147028392-176a4613-4aa3-45a6-a314-8047367243bf.png) A single-character label needs no semantic information, but it still passes through this function. The sampled change_id is 0, so imput_lable = imput_lable[:change_id] leaves imput_lable empty, like this: ![image](https://user-images.githubusercontent.com/93702453/147029454-f05d187f-9c4d-4a7f-8149-8c935bcfc75d.png) An error then occurs during training: ![image](https://user-images.githubusercontent.com/93702453/147029500-44f5d263-55c7-4938-89b7-97d4c5e5e142.png)
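The edge case reported in this issue can be reproduced in a few lines. The sketch below is a hypothetical re-creation of the behavior described (remove one randomly sampled character from the label), not the repository's function; `remove_one_char` and `can_remove_char` are illustrative names.

```python
from random import sample


def remove_one_char(label):
    """Drop one randomly chosen character; return (remaining, removed, position)."""
    position = sample(range(len(label)), 1)[0]
    removed = label[position]
    remaining = label[:position] + label[position + 1:]
    return remaining, removed, position


def can_remove_char(label):
    """Guard: only labels with at least two characters leave a non-empty remainder."""
    return len(label) > 1


# With a single-character label the only possible position is 0,
# so the remaining label is always the empty string.
remaining, removed, position = remove_one_char("a")
print(repr(remaining), removed, position)
```

A guard such as `can_remove_char` is one way to skip or special-case single-character labels before they reach the loss; whether that matches the intended training behavior is for the maintainers to confirm.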

![image](https://user-images.githubusercontent.com/93702453/146728922-067a29d8-25c0-4c53-ab29-d7197cd25001.png) Hello, I used the test set provided in the README, but the evaluation results do not match those reported in the paper. What might I be doing wrong?

Thanks for your excellent contributions! I tried to use your pre-trained LF_2 model to visualize the mask map. I picked the same image that was shown in the Visualization character-wise...

Dear yuxin, sorry to bother you again. While using your code, I ran into two new problems: 1. When I executed `python train_LF_1.py`, I got a CUDA error in `ClassNLLCriterion.cu`....

How do you occlude the characters in the image? Which tools do you use? Could you please explain?