Tatiana Likhomanenko
You can set specaug to be the same as warmup. I would start with warmup values of 16k, 32k, 64k, and 80k to see if this helps. Then, for the best one, tune lr:...
This can depend. Warmup is mainly there to prevent training from blowing up, so if you see it blowing up, just increase the warmup.
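To illustrate why a longer warmup helps against blow-ups, here is a minimal sketch of a linear warmup schedule. It assumes the learning rate ramps linearly from near zero to its target over the warmup period; the function name `warmup_lr` and its parameters are hypothetical, and the actual schedule in the training binary may differ.

```python
def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate at a given update under linear warmup (illustrative only).

    Assumption: the rate grows linearly to base_lr over warmup_steps updates
    and stays at base_lr afterwards.
    """
    if warmup_steps <= 0:
        return base_lr  # no warmup requested
    return base_lr * min(1.0, (step + 1) / warmup_steps)


if __name__ == "__main__":
    # Compare how large the rate already is at update 8000
    # for the warmup lengths suggested above.
    for warmup in (16_000, 32_000, 64_000, 80_000):
        print(warmup, warmup_lr(8_000, base_lr=0.1, warmup_steps=warmup))
```

A longer warmup keeps the early updates smaller for longer, which is the reason to increase it when the loss blows up.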
You need to wait longer to be sure there is no convergence. Your loss looks good: it is going down. By blowing up I meant after the warmup stage.
It is increasing during warmup, which is ok.
Yep, it definitely seems that these parameters don't work well. I didn't try with 1 GPU. Could you try setting `--pretrainWindow` to, say, 2000?
S2S is very sensitive and hard to train in general (that is, hard to make converge). I would still try the longer pre-train window and also increase the warmup (at first...
cc @avidov @xuqiantong @vineelpratap, who are more familiar with the inference pipeline. Could you help here?
cc @vineelpratap @avidov @xuqiantong
You can simply run Test.cpp to see the argmax output and its WER. This is an additional check that decode.cpp is being used correctly. Decode.cpp should give you a better result with proper...
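For reference when comparing the two numbers, WER is the word-level edit distance between reference and hypothesis divided by the number of reference words. The sketch below is a generic implementation of that definition, not the code used in Test.cpp; the function name `wer` is hypothetical.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(wer("the cat sat", "the bat sat"))
```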
Hey, you need to calculate the receptive field of your convolutional network, that is, determine which future tokens / past tokens are used in the computations for particular...
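The receptive field calculation can be sketched as follows. This assumes a plain stack of 1D convolutions described by kernel size, stride, dilation, and left padding; the `Conv1d` dataclass and `context` function are hypothetical names for illustration, and a real architecture file would need to be translated into this form by hand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Conv1d:
    kernel: int
    stride: int = 1
    dilation: int = 1
    left_pad: int = 0  # padding on the "past" side of the input


def context(layers: List[Conv1d]) -> Tuple[int, int, int]:
    """Return (past, future, receptive_field) in input frames for one output frame."""
    past = future = 0
    jump = 1  # distance in input frames between neighbouring inputs of the current layer
    for layer in layers:
        span = (layer.kernel - 1) * layer.dilation
        past += layer.left_pad * jump
        future += (span - layer.left_pad) * jump
        jump *= layer.stride
    return past, future, past + future + 1


if __name__ == "__main__":
    # Three symmetric layers with kernel 3: three past and three future frames.
    print(context([Conv1d(3, left_pad=1)] * 3))
    # Fully left-padded (causal) layers use no future frames.
    print(context([Conv1d(3, left_pad=2)] * 3))
```

The `future` value is the number of input frames beyond the current one that must be available before an output frame can be computed.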