transformer-xl
{BUG} Semantic error: mixed use of `model` and `para_model`
In the PyTorch implementation there is a mix of `para_model` and `model`. Shouldn't only `para_model` be used?
For example, in the training function, line 422 calls `model.zero_grad()`, but line 436 then calls `ret = para_model(data_i, target_i, *mems[i])`.
Shouldn't the whole program use `para_model`?
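For reference, a minimal sketch of the pattern in question (using a plain `nn.Linear` as a stand-in for the repo's actual model): `nn.DataParallel` wraps the underlying module without copying its parameters, so gradients produced through `para_model` land on `model`'s parameters, and `model.zero_grad()` clears those same gradients. This is only an illustration of the shared-parameter behavior, not the repo's exact code:

```python
import torch
import torch.nn as nn

# Stand-in for the actual model built in train.py.
model = nn.Linear(4, 2)

# Same wrapping pattern as in train.py: para_model shares model's parameters.
para_model = nn.DataParallel(model)
assert para_model.module is model  # no copy is made

# Forward/backward through the DataParallel wrapper...
loss = para_model(torch.randn(3, 4)).sum()
loss.backward()
assert model.weight.grad is not None  # ...accumulates grads on model itself

# Zeroing on the underlying module clears those very same gradients.
model.zero_grad()
grad = model.weight.grad
assert grad is None or bool((grad == 0).all())
```

So calling `model.zero_grad()` while running forward passes through `para_model` operates on one and the same set of parameters; whether mixing the two names is still confusing style-wise is a separate question.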