Results: 4 comments by Hann Wang

@sxjscience for profiling, we run the training script under Python's built-in cProfile profiler and use [snakeviz](https://jiffyclub.github.io/snakeviz/) to visualize the breakdown. We are currently using gluon-nlp 0.8.x, so maybe that's...
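A minimal sketch of that profiling workflow, not the actual training script: `train_step`, `train`, and `profile_to_file` are hypothetical stand-ins, and the workload is a dummy loop. It only shows how cProfile can write a stats file that snakeviz can open.

```python
import cProfile
import pstats


def train_step(n=10000):
    # Hypothetical stand-in for one training iteration (assumption, not real model code).
    return sum(i * i for i in range(n))


def train(steps=50):
    # Hypothetical training loop that calls train_step repeatedly.
    total = 0
    for _ in range(steps):
        total += train_step()
    return total


def profile_to_file(path):
    # Collect profiling data for train() and dump it in the binary
    # format that both pstats and snakeviz can read.
    profiler = cProfile.Profile()
    profiler.enable()
    train()
    profiler.disable()
    profiler.dump_stats(path)
    return path


def print_top(path, limit=5):
    # Print the functions with the highest cumulative time.
    pstats.Stats(path).sort_stats("cumulative").print_stats(limit)
```

To profile a whole script without modifying it, `python -m cProfile -o train.prof train.py` produces the same kind of file (`train.py` is a placeholder name), and `snakeviz train.prof` opens the breakdown in a browser.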

Hi @marhlder, would you mind explaining a bit more how this works? Just from the code, I do not quite understand how this would reduce the memory requirement. Since in...

I did not try to reproduce the PTB results, but I've tried it on some other problems. The model just converges as expected. You might want to look into other...

Oops, my bad, I skimmed through the email and thought this was an issue in another repo. To answer your question, just let the model train for longer; it appears...