
Error training MD-RNN

Open waiyc opened this issue 5 years ago • 3 comments

Hi,

I am getting this invalid input size error when training the MDRNN.

  File "trainmdrnn.py", line 205, in <module>
    test_loss = test(e)
  File "trainmdrnn.py", line 170, in data_pass
    latent_obs, latent_next_obs = to_latent(obs, next_obs)
  File "trainmdrnn.py", line 108, in to_latent
    [(obs_mu, obs_logsigma), (next_obs_mu, next_obs_logsigma)]]
  File "trainmdrnn.py", line 107, in <listcomp>
    for x_mu, x_logsigma in
RuntimeError: shape '[16, 32, 32]' is invalid for input of size 11264

waiyc, May 16 '19 01:05

Hi, we're also experiencing this issue (but with a different size):

  File "trainmdrnn.py", line 202, in <module>
    train(e)
  File "trainmdrnn.py", line 168, in data_pass
    latent_obs, latent_next_obs = to_latent(obs, next_obs)
  File "trainmdrnn.py", line 106, in to_latent
    [(obs_mu, obs_logsigma), (next_obs_mu, next_obs_logsigma)]]
  File "trainmdrnn.py", line 105, in <listcomp>
    for x_mu, x_logsigma in
RuntimeError: shape '[16, 32, 32]' is invalid for input of size 36864

megsano, May 23 '19 03:05

Never mind, we just solved the problem by changing the SIZE in utils/misc.py to 96 instead of 64.
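
The reported size of 36864 is consistent with a frame-size mismatch. Here is a rough check, taking the target shape from the error message (16 sequences × 32 steps × 32 latent dimensions) and assuming the script reshapes raw observations into frames of 3 × SIZE × SIZE before encoding them. That reshape is an assumption about trainmdrnn.py, not something confirmed in this thread, and `latent_elements` is a hypothetical helper:

```python
# Values read off the error message: view(16, 32, 32).
BSIZE, SEQ_LEN, LSIZE = 16, 32, 32


def latent_elements(recorded_size, configured_size):
    """Elements reaching .view() if frames recorded at recorded_size are
    reshaped as if they were configured_size (assumed behaviour)."""
    pixels = BSIZE * SEQ_LEN * 3 * recorded_size ** 2
    frames = pixels // (3 * configured_size ** 2)  # frame count after the reshape
    return frames * LSIZE  # one LSIZE-dimensional latent per frame


expected = BSIZE * SEQ_LEN * LSIZE
print(expected)                 # 16384
print(latent_elements(96, 64))  # 36864
print(latent_elements(96, 96))  # 16384
```

Under these assumptions, 96 × 96 rollouts read with SIZE = 64 give exactly the 36864 elements in the traceback, and setting SIZE to 96 brings the count back to the expected 16384.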

megsano, May 23 '19 03:05

@waiyc, you'll want to add drop_last=True as an argument to the DataLoader. This resolved the problem for me, as the difference in batch sizes caused the error.

https://github.com/ctallec/world-models/blob/master/trainmdrnn.py#L79
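
A minimal sketch of the effect of `drop_last=True`, using a stand-in `TensorDataset` rather than the repository's dataset class (the dataset and its length of 43 are hypothetical). The size in the first traceback, 11264 = 11 × 32 × 32, is consistent with a final batch of 11 sequences instead of 16:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

BSIZE, SEQ_LEN, LSIZE = 16, 32, 32  # read off the error message

# Hypothetical stand-in dataset: 43 sequences, so the last batch has 43 % 16 = 11.
dataset = TensorDataset(torch.zeros(43, SEQ_LEN, LSIZE))

# Batch sizes without and with drop_last.
ragged = [batch[0].shape[0] for batch in DataLoader(dataset, batch_size=BSIZE)]
full = [batch[0].shape[0]
        for batch in DataLoader(dataset, batch_size=BSIZE, drop_last=True)]

print(ragged)  # [16, 16, 11]
print(full)    # [16, 16]
print(ragged[-1] * SEQ_LEN * LSIZE)  # 11264
```

The trade-off is that the incomplete final batch (up to BSIZE - 1 sequences) is skipped in each pass over the data.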

wildermuthn, Jun 19 '19 01:06