Hyperparams for results reported in the paper

Open pclucas14 opened this issue 5 years ago • 4 comments

Hello,

Is it possible to get the hyperparameter values that reproduce the NVRNN and LM results on the PTB dataset?

Many thanks, Lucas

pclucas14, Oct 11 '18 15:10

Hi Lucas, thanks for your interest. I'm sorry I didn't put all the commands on the GitHub page; there are too many tables, and it's hard to keep track of the configuration behind every result. Here is some of my intuition about training on PTB: 1) use a large learning rate with decay (--lr 10, for example); 2) train longer with SGD (--epochs 100); 3) use gradient clipping and dropout. The PyTorch word language model example provides a very good configuration for training an LM.
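As a rough illustration of tip 1 (a large learning rate with decay), here is a minimal sketch of one common annealing rule: start at the large rate and divide it by a fixed factor whenever the validation loss stops improving. The function name `anneal_lr` and the factor of 4 are assumptions made for this example, not values confirmed for this repo.

```python
def anneal_lr(val_losses, initial_lr=10.0, factor=4.0):
    """Return the learning rate used in each epoch.

    val_losses: validation loss observed at the end of each epoch.
    initial_lr: starting rate (10.0 mirrors the --lr 10 suggestion).
    factor:     divisor applied when validation loss does not improve
                (assumed value, chosen only for illustration).
    """
    lr = initial_lr
    best = float("inf")
    schedule = []
    for loss in val_losses:
        schedule.append(lr)      # rate that was in effect for this epoch
        if loss < best:
            best = loss          # improvement: keep the current rate
        else:
            lr /= factor         # no improvement: decay for later epochs
    return schedule


if __name__ == "__main__":
    print(anneal_lr([5.0, 4.8, 4.9, 4.7]))
```

With a rule like this, the rate stays high while validation loss keeps falling and shrinks quickly once it plateaus, which is why a long schedule such as --epochs 100 remains useful.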

Below are the instance names of the saved results. You can probably reproduce the results from the hyperparameters encoded in them.

Detailed Configuration:

Note: zero=lm; nor=gaussian; vmf=von mises-fisher

Standard Setting:

  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa5.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow200

Inputless Setting

Condition on Bag-of-words:

  • Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
  • Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa50.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200

Not conditioned on Bag-of-words:

  • Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa80.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
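Since each instance name packs the hyperparameters into underscore-separated fields, a small helper can unpack one into a dictionary. This is only a sketch: the prefix list is read off the names above, the helper `parse_instance_name` is hypothetical, and the mapping from each prefix to an actual command-line flag of this repo is not confirmed here.

```python
import re

# Field prefixes as they appear in the instance names above.
PREFIXES = ["Data", "Dist", "Model", "Emb", "Hid", "lat", "lr", "drop",
            "kappa", "auxw", "normf", "nlay", "mixunk", "inpz", "cdbit", "cdbow"]


def _convert(text):
    """Turn 'True'/'False' into bool and numeric strings into int or float."""
    if text in ("True", "False"):
        return text == "True"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_instance_name(name):
    """Split an instance name into a {prefix: value} dictionary."""
    fields = {}
    for token in name.split("_"):
        # 'EnclstmBiFalse' holds two fields: encoder type and a Bi flag.
        match = re.fullmatch(r"Enc(.+?)Bi(True|False)", token)
        if match:
            fields["Enc"] = match.group(1)
            fields["Bi"] = match.group(2) == "True"
            continue
        prefix = next((p for p in PREFIXES if token.startswith(p)), None)
        if prefix is None:
            raise ValueError(f"unrecognized field: {token!r}")
        fields[prefix] = _convert(token[len(prefix):])
    return fields


if __name__ == "__main__":
    name = ("Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_"
            "lr10.0_drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_"
            "inpzTrue_cdbit0_cdbow0")
    for key, value in parse_instance_name(name).items():
        print(key, value)
```

Comparing the dictionaries for two names makes it easy to see which fields differ between settings, for example kappa and cdbow between the two standard-setting vMF runs.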

jiacheng-xu, Oct 11 '18 18:10

Hi,

Thanks for the fast response, and for your insight on PTB! Is it also possible to get the LM parameters in the standard setting?

pclucas14, Oct 11 '18 18:10

The configuration of the word language model example will help.

jiacheng-xu, Oct 11 '18 18:10

great! thanks again

pclucas14, Oct 11 '18 18:10