ml5-data-and-models

LSTM train.py not loading from checkpoint

Open karooolis opened this issue 6 years ago • 3 comments

Hi, I am trying to resume training a model from a checkpoint. I pass the init_from option, but the model still starts training from 0. What could be the problem here?

The sample command looks like this:

python train.py --data_dir=/home/ubuntu/ml5-data-and-training/datasets/text/famous_quotes --rnn_size 256 --num_layers 2 --seq_length 64 --batch_size 32 --output_keep_prob 0.25 --init_from=/home/ubuntu/ml5-data-and-training/training/lstm/checkpoints/famous_quotes

karooolis, Jun 18 '18 14:06

Was the checkpoint generated from the same script?

cvalenzuela, Jun 20 '18 00:06

I must not have pressed the comment button when I replied last time. Yes, the checkpoint was generated with exactly the same script. Have you heard of anyone else having similar issues?

karooolis, Jul 03 '18 11:07

Not sure what the issue could be. A new repo with some fixes was recently published here: https://github.com/ml5js/training-lstm. You could try that instead.
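
In the meantime, it may be worth confirming that the directory passed to --init_from actually contains something a TensorFlow saver could restore. The sketch below is only a diagnostic, not part of train.py. It assumes the script writes standard TensorFlow checkpoints (a text file named checkpoint with a model_checkpoint_path entry, plus a matching .index file), which this thread does not confirm. The function name describe_checkpoint_dir is hypothetical.

```python
import os
import re


def describe_checkpoint_dir(ckpt_dir):
    """Return (ok, message) saying whether ckpt_dir looks resumable.

    Assumption: checkpoints follow the usual TensorFlow layout, i.e. a
    'checkpoint' state file naming the latest checkpoint prefix and a
    '<prefix>.index' file next to it.
    """
    state_file = os.path.join(ckpt_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return False, "no 'checkpoint' state file in " + ckpt_dir

    with open(state_file) as f:
        text = f.read()

    # The state file holds lines such as: model_checkpoint_path: "model.ckpt-1000"
    match = re.search(r'^model_checkpoint_path:\s*"(.*)"', text, re.MULTILINE)
    if match is None:
        return False, "'checkpoint' file has no model_checkpoint_path entry"

    prefix = match.group(1)
    # The recorded prefix may be relative to the directory or absolute.
    if not os.path.isabs(prefix):
        prefix = os.path.join(ckpt_dir, prefix)

    if not os.path.isfile(prefix + ".index"):
        return False, "state file points at a missing checkpoint: " + prefix

    return True, "latest checkpoint: " + prefix
```

Calling describe_checkpoint_dir with the same path given to --init_from and printing the returned message shows whether the state file is missing or points at files that are no longer there, for example after the directory was moved or renamed. If the check passes, the problem is more likely in how the script restores the weights or in how it reports the starting step.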

cvalenzuela, Jul 03 '18 14:07