deep-text-recognition-benchmark
Training on Chinese data has problems with long-text support
Below are the configuration file and training log. Looking forward to your reply.
@ku21fan
In theory, if the model works well on short-text recognition, it can also recognize long text with good accuracy if you split your sequence. I think you should use a sliding-window technique to solve this problem; a rough sketch is below.
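A minimal sketch of that sliding-window idea (not code from this repo; `model_infer` is a hypothetical wrapper around your trained short-text recognizer):

```python
import numpy as np

def recognize_long_line(image, model_infer, win_w=100, stride=75):
    """Recognize a long text line by sliding a fixed-width window over it.

    image: grayscale line image as an (H, W) numpy array.
    model_infer: hypothetical callable wrapping the trained short-text
                 model; takes an (H, win_w) crop, returns a decoded string.
    stride < win_w overlaps the windows, reducing the chance of cutting
    a character in half at a window boundary.
    """
    h, w = image.shape
    pieces = []
    for x in range(0, max(w - win_w, 0) + 1, stride):
        pieces.append(model_infer(image[:, x:x + win_w]))
    # Naive concatenation; in practice you would deduplicate the overlap,
    # e.g. by aligning the tail of one piece with the head of the next,
    # and add a final window anchored at w - win_w to cover the right edge.
    return "".join(pieces)
```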
By the nature of CRNN, you can run inference on a sentence of any length. However, in the training stage samples are resized to a fixed size, so if your training samples are too long, the resized images become too blurred to distinguish and the training accuracy cannot improve. So please choose a reasonable length for your long text :)
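A quick back-of-the-envelope check of that blurring effect (assuming the repo's default training size of imgW=100, imgH=32):

```python
# Horizontal pixels left per character after the fixed-width training resize.
# With imgW=100, a 50-character line keeps only ~2 px per character, which is
# unreadable; a wider imgW (or shorter labels) keeps characters legible.
for img_w in (100, 320):
    for n_chars in (10, 25, 50):
        print(f"imgW={img_w:3d}, {n_chars:2d} chars -> {img_w / n_chars:4.1f} px/char")
```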
@nguyenviettuan96 @zhtmike Thank you very much for your replies, but I don’t think this is the best way to solve the problem; I will keep trying other approaches.
Hello, have you found a better solution?
With CTC loss, if the output of the LSTM (the sequence length) is shorter than the label (the ground truth), the loss is infinity. For example, if the sequence length is 25 and your text is 50 characters long, the loss is infinity, and PyTorch sets it to zero (if zero_infinity is true), so the model cannot learn long text; a minimal demonstration follows the docs link below. To solve the problem there are two options:
- Attention does not have this problem. I trained it on text up to 150 characters long and got good results (change the input size to 46×320).
- Change the output shape of the feature extractor. By default the feature map width is between 24 and 26; change the input size to 32×320 and it becomes between 80 and 84. If you need even longer text, add an upsampling layer at the end of the model (see the sketch after this list).
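A rough sketch of the upsampling idea from the second option (the shapes are assumptions based on the roughly W/4 downsampling described above, not code from this repo):

```python
import torch
import torch.nn as nn

# With a 32x320 input, the feature extractor yields a feature map of width
# ~80; that width is the number of CTC time steps. Upsampling along the
# width doubles the time steps without changing the backbone.
features = torch.randn(1, 512, 1, 80)          # (batch, channels, height, width)
widen = nn.Upsample(scale_factor=(1, 2), mode="nearest")
print(widen(features).shape)                   # torch.Size([1, 512, 1, 160])
```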
See the PyTorch CTCLoss docs: https://pytorch.org/docs/stable/generated/torch.nn.CTCLoss.html
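A minimal, self-contained demonstration of the infinite-loss behavior described above (the numbers match the 25-step / 50-character example; 37 classes is arbitrary):

```python
import torch
import torch.nn as nn

T, N, C, S = 25, 1, 37, 50   # time steps, batch, classes (incl. blank), label length
log_probs = torch.randn(T, N, C).log_softmax(2)
targets = torch.randint(1, C, (N, S), dtype=torch.long)      # 50-char label
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

# Label longer than the output sequence -> no valid CTC alignment exists.
print(nn.CTCLoss()(log_probs, targets, input_lengths, target_lengths))                    # inf
print(nn.CTCLoss(zero_infinity=True)(log_probs, targets, input_lengths, target_lengths))  # 0.0
```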