unconditional-time-series-diffusion

Error when training model

Open jm02058 opened this issue 1 year ago • 2 comments

I am trying to train the model (currently using the provided example 'python bin/train_model.py -c configs/train_tsdiff/train_uber_tlc.yaml'), but I am struggling to fix the following error:

  File "/.../src/uncond_ts_diff/arch/s4.py", line 967, in forward
    k = torch.fft.irfft(k_f, n=discrete_L) # (B+1, C, H, L)
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR

Any guidance on how I might have caused this would be much appreciated, as I am not very experienced in this area. Thank you. I have also provided a screenshot of the last output (I am running the model on WSL).

Screenshot 2024-07-07 114209

jm02058 · Jul 07 '24 10:07

@jm02058 Is this relevant? https://github.com/pytorch/pytorch/issues/88038

abdulfatir · Jul 29 '24 14:07

> @jm02058 Is this relevant? pytorch/pytorch#88038

Yes, I'll have another look. Thank you.

jm02058 · Jul 30 '24 11:07
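
For anyone hitting the same traceback, one way to narrow it down is to check whether a plain rfft/irfft round trip fails on the GPU outside the training script. The sketch below is not part of this repository: the function name cufft_smoke_test and the tensor shape are made up for illustration, and it only mirrors the torch.fft.irfft(..., n=...) call from the traceback. If the "cuda" entry reports a failure while "cpu" reports "ok", the problem likely lies with the installed PyTorch/CUDA build or the GPU setup rather than with the model code, which is the situation the linked PyTorch issue appears to discuss.

```python
import importlib.util


def cufft_smoke_test(length: int = 1024) -> dict:
    """Run a small rfft/irfft round trip on CPU and, if available, on CUDA.

    Hypothetical helper for diagnosis only; it is not part of the repository.
    """
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not report["torch_installed"]:
        return report

    import torch

    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()

    devices = ["cpu"]
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
        devices.append("cuda")

    for device in devices:
        try:
            # Arbitrary small batch; the last dimension is the signal length.
            x = torch.randn(2, 4, length, device=device)
            x_f = torch.fft.rfft(x, n=length)
            # Same kind of call as the failing line in the traceback.
            y = torch.fft.irfft(x_f, n=length)
            report[device] = "ok" if torch.allclose(x, y, atol=1e-4) else "mismatch"
        except RuntimeError as err:
            # A cuFFT failure would be caught and recorded here.
            report[device] = f"failed: {err}"
    return report


if __name__ == "__main__":
    for key, value in cufft_smoke_test().items():
        print(f"{key}: {value}")
```

Posting the printed report (PyTorch version, CUDA build, GPU name, and the per-device result) alongside the traceback should make it easier to tell whether the linked issue applies.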