parseq
Sometimes the model predicts long, redundant, repetitive strings, like 'parseqqqqqqqqqqqq' (ground truth is 'parseq').
Is there any chance you know the reason? Thank you for your excellent work!
Hello, please provide more details such as the model and weights used as well as the exact image you're using.
I used PARSeq trained on a Chinese dataset with a charset of about 6K characters, and ran inference on a test dataset.
Sorry but I can't help you since:
- I have no access to and am not familiar with the specific model you're referring to.
- I have no access to and am not familiar with the data you're using.
- PARSeq was developed and tested with Latin characters, primarily on English text. I am not familiar with the intricacies of Chinese text.
You might want to try increasing the number of decoder layers, or using a larger version of the model, since the Chinese charset is much bigger than the Latin one.
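For illustration, something like this could be a starting point, assuming you launch training through the repo's Hydra-based train.py. The override keys and values below are assumptions on my part; check configs/model/parseq.yaml in your checkout for the actual names and defaults.

```python
# Sketch only: launch train.py with overrides that enlarge the decoder for a
# ~6K-character charset. The override keys are assumptions; verify them against
# configs/model/parseq.yaml before running.
import subprocess

overrides = [
    "model.dec_depth=2",    # assumed key: number of decoder layers
    "model.embed_dim=512",  # assumed key: wider embeddings for the bigger charset
]
subprocess.run(["python", "train.py", *overrides], check=True)
```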
So I tinkered a lot, and this is perhaps due to 'label_length' in main.yaml and the image size you are giving to the model. In my case, with the model on Hugging Face, if you input an image containing the single word 'parseq' followed by whitespace equivalent to 30-35 characters in total, the result is correct.
However, if we exceed this and input an image whose text is, say, longer than 40 characters, redundant repetition of characters appears:
- [attached image] gave me gatery.comminFreedom
- [attached image] gave me gateway.................
- [attached image] gave me gateway.
It is probably due to the fact that the model was trained on single-word images and will hallucinate on longer labels. I trained a model with label_length set to 65, and it was able to overcome this problem.
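For reference, one way to check this on the released weights is to load them and look at the maximum label length they were trained with. The loading and decoding calls below follow the usage shown in the repo README; the `hparams.max_label_length` attribute is my assumption (it is a constructor argument that Lightning saves), and 'long_text.png' is just a placeholder path.

```python
import torch
from PIL import Image
from strhub.data.module import SceneTextDataModule  # requires a cloned/installed parseq repo

# Load the released pretrained weights via torch.hub (as in the repo README).
parseq = torch.hub.load('baudm/parseq', 'parseq', pretrained=True).eval()

# Maximum label length the checkpoint was trained with. Text longer than this
# cannot be decoded faithfully, which is where the repeated characters come from.
# (Attribute name assumed from the saved Lightning hyperparameters.)
print(parseq.hparams.max_label_length)

# Standard preprocessing and greedy decoding, as in the README.
img_transform = SceneTextDataModule.get_transform(parseq.hparams.img_size)
image = img_transform(Image.open('long_text.png').convert('RGB')).unsqueeze(0)
logits = parseq(image)
label, confidence = parseq.tokenizer.decode(logits.softmax(-1))
print(label[0])
```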
Thanks. I'll try your solution.