jukebox
OOM using Colab
Over the past few weeks I've been unable to run the Colab notebook, which I had been running successfully for months. This is the machine I was assigned:
GPU 0: Tesla P100-PCIE-16GB
The last output before it crashes from running out of memory is:
Downloading from azure
Running wget -O /root/.cache/jukebox/models/5b/vqvae.pth.tar https://openaipublic.azureedge.net/jukebox/models/5b/vqvae.pth.tar
Restored from /root/.cache/jukebox/models/5b/vqvae.pth.tar
0: Loading vqvae in eval mode
From the logs
Timestamp | Level | Message
-- | -- | --
Apr 24, 2021, 11:18:32 AM | WARNING | WARNING:root:kernel 665437f1-8066-43a5-94ea-0304ed2d78bb restarted
Apr 24, 2021, 11:18:32 AM | INFO | KernelRestarter: restarting kernel (1/5), keep random ports
Apr 24, 2021, 11:18:04 AM | WARNING | 2021-04-24 15:18:04 (45.4 MB/s) - ‘/root/.cache/jukebox/models/5b/vqvae.pth.tar’ saved [7726329/7726329]
The cell that crashes contains this, which I imagine is the culprit:
vqvae, *priors = MODELS[model]
vqvae = make_vqvae(setup_hparams(vqvae, dict(sample_length = 1048576)), device)
top_prior = make_prior(setup_hparams(priors[-1], dict()), vqvae, device)
Has something changed?
I am also running into this issue. I tried decreasing the batch_size and the chunk_size but it is still happening. Any help?
I don't even get as far as the batch step: Colab runs out of memory while importing libraries in the third cell of the notebook.
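The KernelRestarter lines in the log and the crash during imports would both fit the Python process being killed for exhausting host RAM rather than GPU memory, though nothing in this thread confirms that. As a rough diagnostic sketch, the snippet below reads `/proc/meminfo` (present on Linux, which Colab runtimes use) and prints how much host RAM is available. The helper names `parse_meminfo`, `available_gib` and `report_host_memory` are made up for illustration and are not part of jukebox or Colab.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(info):
    """Return the available host memory in GiB."""
    # MemAvailable is the kernel's estimate of memory usable without swapping;
    # fall back to MemFree on kernels that do not report it.
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / (1024 ** 2)


def report_host_memory(path="/proc/meminfo"):
    """Hypothetical helper: print and return the current host memory figures."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    total_gib = info.get("MemTotal", 0) / (1024 ** 2)
    print(f"host RAM: {available_gib(info):.1f} GiB available of {total_gib:.1f} GiB")
    return info
```

Calling `report_host_memory()` at the top of the notebook, again after the import cell, and once more just before the `make_vqvae` / `make_prior` cell should show whether available host RAM is already close to zero before the models are loaded.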