Amit Dovev
Is there a problem with registering chars in the model's codec at build time (first time only), even if some of them won't be trained? For example, for Chinese -...
> Is this what you're suggesting, @amitdo?

I missed that sentence. My answer: Certainly not!
My suggested solution: the user will have an option to point to a file containing all the chars they think they will ever need for a specific model....
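A rough sketch of how such a file could be turned into a label table, assuming a plain UTF-8 text file that simply lists the characters. `load_charset` and `build_codec` are hypothetical names for illustration, not functions from any existing codebase, and reserving label 0 is only an assumption (for example for a CTC blank).

```python
from pathlib import Path


def load_charset(path):
    """Read every distinct non-whitespace character from a UTF-8 text file."""
    text = Path(path).read_text(encoding="utf-8")
    return sorted({ch for ch in text if not ch.isspace()})


def build_codec(chars, reserved=1):
    """Map each character to an integer label.

    Labels start at `reserved`, so label 0 stays free (assumed here to be
    kept for something like a CTC blank).
    """
    return {ch: idx + reserved for idx, ch in enumerate(chars)}
```

Because the characters are sorted before labels are assigned, the same file always yields the same mapping, even if some of the registered chars never appear in the training data.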
The issue is mostly with Chinese and Japanese.
Training both Chinese and Japanese in the same model is not a good idea.
@striversist, I decided not to implement what I suggested before. It seems not to be such a good idea after all. Sorry.
@mittagessen, what about this one: https://github.com/baidu-research/warp-ctc?
warp-ctc used with an LSTM: https://github.com/dmlc/mxnet/tree/master/example/warpctc
@jbaiter, what about basing your Cython binding on the older matrix-based code?
Try to find out where that 0.5 comes from. Maybe most of the errors involve dots, commas, and spaces.
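One way to check that, sketched with Python's standard `difflib`: align each ground-truth line with the recognized line and count which characters take part in the mismatches. `error_chars` is a hypothetical helper, and this alignment is only an approximation of however the error figure was actually computed.

```python
from collections import Counter
from difflib import SequenceMatcher


def error_chars(ground_truth, prediction):
    """Count the characters involved in every mismatch between two strings."""
    counts = Counter()
    matcher = SequenceMatcher(None, ground_truth, prediction, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        counts.update(ground_truth[i1:i2])  # characters dropped or replaced
        counts.update(prediction[j1:j2])    # characters inserted or substituted in
    return counts
```

Summing the counters over a test set and calling `.most_common(10)` would show whether punctuation and spaces dominate the errors.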