diffindscene

CUDA out of memory

Open kimyoungji opened this issue 1 year ago • 2 comments

I ran unconditional inference with:

$ python main/test.py --cfg_dir utils/config/samples/cascaded_ldm

and got the following error:

Traceback (most recent call last):
  File "main/test.py", line 58, in <module>
    trainer.run_test()
  File "/home/diffindscene/trainer/cascaded_ldm_trainer.py", line 244, in run_test
    self.run_test_uncond()
  File "/home/diffindscene/trainer/cascaded_ldm_trainer.py", line 306, in run_test_uncond
    outdata_dict = model.restoration(data)
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/diffindscene/model/ms_ldm/multiscale_latent_diffusion.py", line 451, in restoration
    occ_l2 = self.decode_occ_lv2(result, occ_l1, quant2_voxel)
  File "/home/diffindscene/model/ms_ldm/multiscale_latent_diffusion.py", line 340, in decode_occ_lv2
    quant1, diff_b, id_b = self.first_stage_module.quantize1(quant1_)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/diffindscene/model/auto_encoder/ms_vqgan/quantize.py", line 102, in forward
    soft_one_hot = F.gumbel_softmax(logits, tau=temp, dim=1, hard=hard)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 1902, in gumbel_softmax
    ret = y_hard - y_soft.detach() + y_soft
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 23.65 GiB total capacity; 20.89 GiB already allocated; 1.23 GiB free; 21.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
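For anyone reading the numbers in that message, here is a rough sketch of the arithmetic in Python. It only parses the figures quoted in the error; the helper name `parse_oom` and the "shortfall" estimate are my own illustration, not something from the diffindscene code or from PyTorch, and the estimate ignores allocator block granularity.

```python
import re

# Tail of the error message quoted above, copied verbatim.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 4.00 GiB "
    "(GPU 0; 23.65 GiB total capacity; 20.89 GiB already allocated; "
    "1.23 GiB free; 21.00 GiB reserved in total by PyTorch)"
)


def parse_oom(message):
    """Extract the GiB figures from an OOM message in this format.

    Hypothetical helper for illustration; the patterns assume the exact
    wording shown above and will fail on other message formats.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        name: float(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


stats = parse_oom(OOM_MESSAGE)

# Memory PyTorch has reserved (cached) but is not currently using.
cached_unused = stats["reserved"] - stats["allocated"]

# Rough estimate of how far the request exceeds what could be served
# from free device memory plus the unused cache.
shortfall = stats["requested"] - (stats["free"] + cached_unused)

print(f"reserved but unallocated: {cached_unused:.2f} GiB")
print(f"approximate shortfall:    {shortfall:.2f} GiB")
```

By this estimate, reserved memory is only slightly above allocated memory, so the fragmentation hint (`max_split_size_mb`, set through the `PYTORCH_CUDA_ALLOC_CONF` environment variable) may not be enough on its own here. The 4.00 GiB request appears to be larger than what is actually available on the 24 GB card, which suggests the workload itself would need to shrink or run on a GPU with more memory.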

kimyoungji, Dec 02 '24 08:12

Hi @kimyoungji, did you figure out how to deal with the memory allocation issue?

ronebrandao, Apr 08 '25 23:04

I have the same problem. Were you able to solve it?

2741913295, Jun 27 '25 02:06