practical_seq2seq

Training time?

Open prakhar21 opened this issue 7 years ago • 3 comments

Hi, I am trying to train a Q/A model on my own data using your sequence wrapper. The checkpoint you provided is the result of roughly 45k epochs. How long did it take to train this model? Also, what heuristic did you use to decide on that many epochs?

prakhar21, Feb 20 '17 08:02

I stopped when the validation loss saturated. It took roughly 7-8 hours to run ~45k iterations on a GTX 960 (with an i5 processor).

suriyadeepan, Feb 20 '17 08:02
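
For readers following along, "stop when the validation loss saturates" can be sketched as a patience-based early-stopping check. This is only an illustration, not the code used in the repository: the function name `early_stopping_step`, the `patience` and `min_delta` parameters, and the loss values in the usage note are all assumptions made for the example.

```python
def early_stopping_step(val_losses, patience=3, min_delta=0.0):
    """Return the index of the evaluation at which training would stop,
    or None if the loss never stalls for `patience` evaluations in a row.

    val_losses: validation losses recorded at regular intervals.
    patience:   (hypothetical knob) evaluations to wait without improvement.
    min_delta:  (hypothetical knob) smallest decrease counted as improvement.
    """
    best = float("inf")
    stalled = 0
    for i, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # New best validation loss: reset the stall counter.
            best = loss
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                return i
    return None
```

With made-up losses such as `[3.5, 3.2, 3.12, 3.12, 3.13, 3.12]` and `patience=3`, the function returns index 5, because the loss stops improving after the third evaluation.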

OK. The modeling approach you describe in your blog and code is more or less general, right? It should work for any Q/A dataset?

prakhar21, Feb 20 '17 08:02

I am re-training on the dataset with the same code, and the validation loss is increasing significantly after 2000 epochs. At 2000 it was 3.12. Why is it diverging?

prakhar21, Feb 20 '17 16:02
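
A validation loss that rises while training continues is often a sign of overfitting, though the thread does not confirm the cause here. One way to work with it is to log the validation loss at each checkpoint, keep the checkpoint with the lowest value, and flag a sustained rise. The sketch below assumes a hypothetical `history` list of `(step, val_loss)` pairs; the helper names are invented for illustration, and only the value 3.12 at step 2000 comes from the post above.

```python
def best_checkpoint(history):
    """Return the (step, val_loss) pair with the lowest validation loss."""
    return min(history, key=lambda pair: pair[1])


def is_diverging(history, window=3):
    """Return True if the validation loss rose at each of the last
    `window` consecutive evaluations (hypothetical heuristic)."""
    recent = [loss for _, loss in history[-(window + 1):]]
    if len(recent) < window + 1:
        return False  # Not enough evaluations to judge yet.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Illustrative log: every value except (2000, 3.12) is made up.
history = [(1000, 3.40), (2000, 3.12), (3000, 3.30), (4000, 3.60), (5000, 3.90)]
print(best_checkpoint(history))
print(is_diverging(history))
```

In this made-up log the lowest loss is at step 2000, so that checkpoint would be the one to keep, and the three consecutive increases afterwards would trigger the divergence flag.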