zero123plus
"CUDA out of memory. Tried to allocate" error
Greetings,
I'm trying to run the gradio_app demo as stated in the README, and I'm getting this error:
$ python gradio_app.py
/home/worker/zero123plus/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
text_encoder/model.safetensors not found
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:03<00:00, 2.43it/s]
Traceback (most recent call last):
  File "/home/worker/zero123plus/gradio_app.py", line 204, in <module>
    fire.Fire(run_demo)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/worker/zero123plus/gradio_app.py", line 137, in run_demo
    pipeline.to(f'cuda:{_GPU_ID}')
  File "/home/worker/zero123plus/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 727, in to
    module.to(torch_device, torch_dtype)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1878, in to
    return super().to(*args, **kwargs)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 804, in _apply
    param_applied = fn(param)
  File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU
Any idea what is wrong? I've run the setup as stated in the README file.
I understand now: your small example requires 5 GB of VRAM, and my GPU has only 4 GB. A shame. Is there any way to reduce memory consumption?
Nowadays we do have more techniques to reduce inference-time memory, including model offloading, autotune compilation, quantization, and possibly others. We do not have the code for these ready yet, but you are welcome to contribute. The first usually incurs no time overhead, while compilation takes some time before inference starts. Quantization would need some extra code and more tuning.
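As a rough illustration of the first option, here is a minimal sketch of loading the pipeline with diffusers and enabling CPU offloading instead of moving everything to the GPU, which is what the `pipeline.to(f'cuda:{_GPU_ID}')` call in gradio_app.py does when it runs out of memory. The model IDs and loading arguments below are assumptions based on the README, not the repo's exact code.

```python
# Hedged sketch, not the repo's official code: load the Zero123++ pipeline in
# half precision and enable memory-saving options instead of pipeline.to("cuda").
# The checkpoint / custom-pipeline IDs below are assumptions taken from the README.
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "sudo-ai/zero123plus-v1.2",                      # assumed model ID
    custom_pipeline="sudo-ai/zero123plus-pipeline",  # assumed custom pipeline ID
    torch_dtype=torch.float16,                       # fp16 weights roughly halve memory
)

# Model offloading: keep components on the CPU and move each one to the GPU
# only while it runs (requires the `accelerate` package).
pipeline.enable_model_cpu_offload()

# If that still does not fit, sequential offloading moves individual submodules
# instead of whole components (slower, but an even smaller GPU footprint):
# pipeline.enable_sequential_cpu_offload()

# Attention slicing lowers peak activation memory at a small speed cost.
pipeline.enable_attention_slicing()
```

In gradio_app.py this would stand in for the `pipeline.to(f'cuda:{_GPU_ID}')` call shown failing in the traceback. Whether 4 GB is ultimately enough still depends on the model, so treat this as a starting point rather than a guarantee.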