nade_k
An iterative neural autoregressive distribution estimator (NADE-k)
This package contains the accompanying code for the following paper:
Tapani Raiko, Li Yao, KyungHyun Cho, Yoshua Bengio
Iterative Neural Autoregressive Distribution Estimator (NADE-k).
Advances in Neural Information Processing Systems 2014 (NIPS14).
Setup
Install Theano
Download Theano and make sure it's working properly.
All the information you need can be found by following this link:
http://deeplearning.net/software/theano/
Make sure Theano is added to your PYTHONPATH.
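To quickly verify the install, a minimal check such as the following (a generic Theano sanity test, not part of this package) should compile and run without errors:

```python
# Sanity check: import Theano and compile a trivial function.
import theano
import theano.tensor as T

x = T.dscalar('x')
square = theano.function([x], x ** 2)

print(theano.__version__)
print(square(3.0))  # expected output: 9.0
```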
Install Jobman
Detailed installation instructions can be found here:
http://deeplearning.net/software/jobman/install.html.
Make sure Jobman is added to your PYTHONPATH.
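As a quick check that Jobman is importable (a sketch assuming the common `from jobman import DD` import path), you can try:

```python
# Sanity check: Jobman's DD is a dict with attribute-style access,
# commonly used for experiment configurations.
from jobman import DD

state = DD()
state.n_layers = 1
print(state.n_layers)  # expected output: 1
```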
Prepare the MNIST dataset
You can download the dataset from the links below.
- [trainset](http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_train.amat)
- [validset](http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_valid.amat)
- [testset](http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_test.amat)
After the dataset has been downloaded, make sure to change `data_path` in `utils.py`.
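The `.amat` files are plain text with one binarized image per row; as a sanity check (a sketch assuming numpy, with an illustrative file path), they can be loaded like this:

```python
# Sketch: load a binarized MNIST .amat file with numpy.
# The path below is illustrative; use wherever you stored the files
# (i.e. the location that data_path in utils.py points to).
import numpy

train_x = numpy.loadtxt('binarized_mnist_train.amat')
print(train_x.shape)  # expected: (50000, 784)
```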
Reproducing the Results
Train the model
- Change `exp_path` in `config.py`. This is the directory where all the training outputs will be placed. For different experiments, one also needs to specify `'save_model_path'` in the same config file (see the sketch after this list).
- To run NADE-5 1HL in Table 1 of the paper, make sure `'n_layers': 1` and `'l2': 0.0`.
- To run NADE-5 2HL in Table 1 of the paper, make sure `'n_layers': 2` and `'l2': 0.0012279827881`.
- To start training, run `python train_model.py`.
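For orientation, the relevant entries in `config.py` might look roughly like the sketch below; the keys come from this README, but the surrounding structure and paths are assumptions to check against the actual file:

```python
# Hypothetical excerpt of config.py (check the real file for the
# full set of options and their exact structure).
exp_path = '/path/to/experiments'  # where all training outputs go

config = {
    'save_model_path': exp_path + '/nade5_2hl',
    'n_layers': 2,          # 1 for NADE-5 1HL
    'l2': 0.0012279827881,  # 0.0 for NADE-5 1HL
}
```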
It is highly recommended to run the code on a GPU. For how to set that up, see http://deeplearning.net/software/theano/tutorial/using_gpu.html.
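For example, with the older Theano versions this code targets, a GPU run is typically launched with flags along these lines (flag values vary across Theano versions):

```
THEANO_FLAGS=device=gpu,floatX=float32 python train_model.py
```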
Training outputs
During training, lots of information is printed to the screen, and many files are written to `save_model_path`: a plot of the training cost as it drops, samples generated from the model, and the log-likelihood on the validset and testset, computed every `valid_freq` epochs.
If you use the default setup, the model will be pretrained for 1000 epochs and finetuned for another 3000 epochs. To get a good generative model, one needs to be patient :)
In addition, we have provided some training logs against which you should be able to match your experiments. See the results directory.
Evaluation
After training is done, it is time to get all those SOTA numbers in Table 1 of the paper.
- In `config.py`, change the option `'action'` to 1. Meanwhile, make sure `'from_path'` points to the directory that contains `model_params_e*.pkl` and `model_configs.pkl`. The option `'epoch'` specifies which of the saved models you would like to use (a sketch follows this list).
- Then run `python train_model.py`.
- If all goes well, the evaluation script should produce numbers that match those in the paper.
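As with training, the evaluation settings can be sketched as follows; the keys are from this README, while the values shown are hypothetical placeholders:

```python
# Hypothetical excerpt of config.py for evaluation.
config = {
    'action': 1,                                    # switch from training to evaluation
    'from_path': '/path/to/experiments/nade5_2hl',  # holds model_params_e*.pkl
                                                    # and model_configs.pkl
    'epoch': 3000,                                  # which saved checkpoint to evaluate
}
```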
IMPORTANT: You may be surprised to see better numbers than those reported in our paper. Calm down, we know this can happen: the longer you train the model, the more likely you are to get better numbers. Do spread your joy to us when this happens.
Benchmarks with this package
NADE-5 1HL model:
testset LL over 10 orderings = -89.43
testset LL over 128 ensembles = -85.77
These numbers are better than those reported in the paper because the model was trained for much longer here.
NADE-5 2HL model:
testset LL over 10 orderings = -87.13
testset LL over 128 ensembles = -84.65
Contact
Questions?
Need a trained model?
Contact us: [email protected]