A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling
Reproduce Results
Hi there,
Thank you for releasing your code. It helps a lot to understand the whole framework. I'd like to reproduce your results as shown in Table 1 in the paper. Can you provide the hyper-parameters you used to train the model?
When I ran this command
python train.py --data=cornell --model=VHCR --batch_size=40 --sentence_drop=0.25 --kl_annealing_iter=250000
This is the result I got after training for 30 epochs:
python eval.py --model=VHCR --checkpoint=save/cornell/VHCR/2019-08-28_10\:16\:11/30.pkl
Word perplexity upperbound using 100 importance samples: 104.686, kl_div: 1.715
How can I get NLL 4.026 with KL 0.503?
Many thanks. I look forward to hearing from you soon.
Hi, I got the same result as you. Did you resolve it?
Thanks!
Looking at the details provided here, my guess is that the correct commands would be:
python train.py --data=cornell --model=VHCR --batch_size=80 --sentence_drop=0.25 --kl_annealing_iter=15000
python eval.py --data=cornell --model=VHCR --checkpoint=<path_to_your_checkpoint>
Also, note that (1) the evaluation script prints out perplexity, not NLL, where perplexity = exp(NLL), and (2) the train/valid/test split is random, so numbers will not match the paper exactly.
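To illustrate the perplexity/NLL relationship, here is a minimal sketch of converting between the two (the specific values are just the numbers quoted in this thread, not output of the released code):

```python
import math

# Perplexity is the exponential of the per-word negative log-likelihood:
#   PPL = exp(NLL)  <=>  NLL = ln(PPL)

nll = 4.026                 # NLL reported in Table 1 of the paper
ppl = math.exp(nll)         # equivalent perplexity, roughly 56

reported_ppl = 104.686      # perplexity printed by eval.py in this thread
equivalent_nll = math.log(reported_ppl)  # roughly 4.65, i.e. higher than 4.026

print(f"NLL {nll} -> PPL {ppl:.2f}")
print(f"PPL {reported_ppl} -> NLL {equivalent_nll:.3f}")
```

So a perplexity of 104.686 corresponds to an NLL of about 4.65, which is why the two numbers look so far apart at first glance.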