FOTS_TF
I get InvalidArgumentError (see above for traceback): Not enough time for target transition sequence (required: 17, available: 11) [batch item 14]. You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs in tf.nn.ctc_loss.
The input feature's width to CTC is shorter than the label length, e.g., there are 4 time steps but the label length is 5. You can check the input data when training on your own dataset.
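A minimal sanity check one could run over the prepared samples (a hypothetical helper, not code from this repository): compare each transcription's length against the time steps the recognition branch will get for that box.

```python
# Hypothetical check: each CTC target must fit into the time steps available
# for its box (roughly the box's width on the recognition feature map).
def find_overlong_samples(samples):
    """samples: iterable of (transcription, time_steps) pairs."""
    bad = []
    for text, time_steps in samples:
        # CTC needs at least len(text) time steps; repeated characters need
        # even more, because a blank must be inserted between repeats.
        if time_steps < len(text):
            bad.append((text, time_steps))
    return bad

print(find_overlong_samples([("街道", 11), ("人民南路步行街一段二十七号", 11)]))
```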
Sorry, I didn't get your point. I converted the official RCTW2017 data into training data following the format of the files in training_samples in your repository and started running multigpu_train.py for Chinese character recognition, then got the error output above. Could you give some explicit advice on where I should check? By the way, I have already handled the encoding of Chinese characters and mapped each character to a number.
Ah, my English is too poor. What I mean is that the text in one of your text regions may be longer than the number of LSTM time steps (i.e., the width of the final feature map), which is why this error appears. You can check the label lengths: make sure each label's length is not greater than its RoI's width. Alternatively, simply setting ignore_longer_outputs_than_inputs to True also works.
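For reference, a sketch of where that flag goes in a TF 1.x tf.nn.ctc_loss call (the placeholder tensors here stand in for the repo's recognition_logits, input_transcription and input_box_widths):

```python
import tensorflow as tf

num_classes = 100  # assumed vocabulary size, including the CTC blank
sparse_labels = tf.sparse_placeholder(tf.int32)                  # target transcriptions
logits = tf.placeholder(tf.float32, [None, None, num_classes])   # [max_time, batch, num_classes]
seq_lengths = tf.placeholder(tf.int32, [None])                   # time steps per box

loss = tf.nn.ctc_loss(
    labels=sparse_labels,
    inputs=logits,
    sequence_length=seq_lengths,
    # Turns "Not enough time for target transition sequence" into a warning;
    # over-long samples are then skipped (zero loss and gradient) instead of
    # crashing training.
    ignore_longer_outputs_than_inputs=True,
    time_major=True,
)
total_loss = tf.reduce_mean(loss)
```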
Doesn't width_box = int(min(width_box, 128)) # not to exceed feature map's width in icdar.py already enforce that limit? If I make width_box larger, then at recognize_part.loss(recognition_logits, input_transcription, input_box_widths) I only need to guarantee input_transcription < input_box_widths, i.e., max(labels.indices(labels.indices[:, 1] == b, 2)) <= sequence_length(b) for all b. Is that understanding correct? But setting width_box = int(min(width_box, 512)) raises: assertion failed: [width must be >= target + offset.]
Hmm, I don't quite follow. width_box is computed from the box's aspect ratio: the RoI height is fixed at 8 and the width is derived from that, while 128 is the feature map's width, so the box width cannot be larger than it.
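To illustrate the calculation described above (a sketch with assumed names, not the exact code in icdar.py): the RoI height is fixed at 8, the width follows from the box's aspect ratio, and the result is clipped to the feature map width of 128. Raising the cap to 512 presumably trips the "width must be >= target + offset" assertion because the crop would then be wider than the 128-column feature map.

```python
ROI_HEIGHT = 8
FEATURE_MAP_WIDTH = 128  # 512-pixel input, downsampled 4x by the backbone

def compute_width_box(box_w, box_h):
    """Width of the RoI on the shared feature map for a text box of size (box_w, box_h)."""
    aspect_ratio = box_w / max(float(box_h), 1e-6)
    width_box = int(round(ROI_HEIGHT * aspect_ratio))
    return max(1, min(width_box, FEATURE_MAP_WIDTH))  # not to exceed feature map's width

print(compute_width_box(200, 32))  # -> 50 time steps available for CTC
```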
Thank you, setting ignore_longer_outputs_than_inputs=True works for me, but I'm really not sure whether it does any harm.
Isn't the feature map's width 512?
The input image size is 512; after the backbone it becomes 128.
Thanks!