IF Kernel crash on loading model in Ubuntu 22.04

Hey, I'm trying to load the model onto a GPU with 24 GB of VRAM.

This is my code:

from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch

stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", torch_dtype=torch.float16)
stage_1.enable_xformers_memory_efficient_attention()
stage_1.enable_model_cpu_offload()

The kernel crashes while the model is being loaded into memory. I also tried loading it through deepfloyd_if, and the kernel crashes there as well when running the following code:

from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder

device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device)
if_II = IFStageII('IF-II-L-v1.0', device=device)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
t5 = T5Embedder(device="cpu")

This is the error shown in the notebook:

Canceled future for execute_request message before replies were done
The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.

I tracked memory usage and it never passes the 14 GB mark. How do I resolve this?
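To make that measurement easier to reproduce, here is a minimal sketch that prints host RAM and GPU memory side by side, so it is clear which of the two a given figure refers to. It assumes Linux (it reads /proc/meminfo) and that nvidia-smi is on the PATH. The function names are made up for illustration and are not part of diffusers or deepfloyd_if.

```python
import subprocess


def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of name -> integer value (kB for sizes)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def host_memory_gib(path="/proc/meminfo"):
    """Return (available, total) system RAM in GiB. Linux only."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info["MemAvailable"] / 2**20, info["MemTotal"] / 2**20


def gpu_memory_mib():
    """Return one (used, total) tuple in MiB per GPU, as reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(int(v) for v in line.split(","))
            for line in out.strip().splitlines()]


def report():
    """Print a one-off snapshot of host and GPU memory."""
    try:
        available, total = host_memory_gib()
        print(f"host RAM: {available:.1f} GiB available of {total:.1f} GiB")
    except (OSError, KeyError):
        print("could not read /proc/meminfo")
    try:
        for index, (used, total_mib) in enumerate(gpu_memory_mib()):
            print(f"GPU {index}: {used} MiB used of {total_mib} MiB")
    except (OSError, subprocess.CalledProcessError):
        print("nvidia-smi not available")


if __name__ == "__main__":
    report()
```

Running this from a second terminal a few times while the model loads shows whether host RAM or VRAM is the one approaching its limit.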

Apr 29 '23 01:04 vinaysingh8866

Please provide the "more info" referenced in the error message.

Apr 29 '23 01:04 brycedrennan

The log is pasted below:

info 18:00:08.165: Process Execution: > ~/floyd/.conda/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
~/floyd/.conda/bin/python -c "import ipykernel; print(ipykernel.__version__); print("5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d"); print(ipykernel.__file__)"
info 18:00:08.202: Process Execution: > ~/floyd/.conda/bin/python -m ipykernel_launcher --ip=127.0.0.1 --stdin=9003 --control=9001 --hb=9000 --Session.signature_scheme="hmac-sha256" --Session.key=b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb" --shell=9002 --transport="tcp" --iopub=9004 --f=/home/vinay/.local/share/jupyter/runtime/kernel-v2-7335xL6PqB389I5C.json
~/floyd/.conda/bin/python -m ipykernel_launcher --ip=127.0.0.1 --stdin=9003 --control=9001 --hb=9000 --Session.signature_scheme="hmac-sha256" --Session.key=b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb" --shell=9002 --transport="tcp" --iopub=9004 --f=/home/vinay/.local/share/jupyter/runtime/kernel-v2-7335xL6PqB389I5C.json
info 18:00:08.202: Process Execution: cwd: ~/floyd
cwd: ~/floyd
info 18:00:08.503: ipykernel version & path 6.15.0, ~/floyd/.conda/lib/python3.10/site-packages/ipykernel/__init__.py for /home/vinay/floyd/.conda/bin/python
info 18:00:09.281: ZMQ loaded via fallback mechanism.
info 18:00:09.322: Got new session 2335260b-cb88-4d60-a3df-8541ee408777
info 18:00:09.322: Started new restart session
error 18:02:54.121: Disposing session as kernel process died ExitCode: undefined, Reason: /home/vinay/floyd/.conda/lib/python3.10/site-packages/traitlets/traitlets.py:2548: FutureWarning: Supporting extra quotes around strings is deprecated in traitlets 5.0. You can use 'hmac-sha256' instead of '"hmac-sha256"' if you require traitlets >=5.
  warn(
/home/vinay/floyd/.conda/lib/python3.10/site-packages/traitlets/traitlets.py:2499: FutureWarning: Supporting extra quotes around Bytes is deprecated in traitlets 5.0. Use '1c81a2d0-ca31-4e1d-aac0-968179c8dbcb' instead of 'b"1c81a2d0-ca31-4e1d-aac0-968179c8dbcb"'.
  warn(

info 18:02:54.259: Dispose Kernel process 8755.
error 18:02:54.308: Raw kernel process exited code: undefined
error 18:02:55.244: Error in waiting for cell to complete Error: Canceled future for execute_request message before replies were done
    at t.KernelShellFutureHandler.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:32419)
    at /home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:51471
    at Map.forEach (<anonymous>)
    at y._clearKernelState (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:51456)
    at y.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:44938)
    at /home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:96826
    at ee (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:2:1589492)
    at jh.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:96802)
    at Lh.dispose (/home/vinay/.vscode/extensions/ms-toolsai.jupyter-2023.3.1201040234/out/extension.node.js:17:104079)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
warn 18:02:55.333: Cell completed with errors { message: 'Canceled future for execute_request message before replies were done' }
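The log only records that the kernel process died ("exited code: undefined"), not why. One way to get more detail is to run the same loading code as a plain script outside the notebook and look at how the process ends. The following is a minimal sketch of that idea, assuming the loading code has been saved to a file; the script path and helper names are hypothetical. On Linux, SIGKILL is commonly what the out-of-memory killer sends, while SIGSEGV usually points at a crash in native code, but neither is proof on its own.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"


def run_script(path):
    """Run a script with faulthandler enabled and describe how it ended."""
    # -X faulthandler makes Python dump a traceback on fatal errors such as SIGSEGV.
    result = subprocess.run([sys.executable, "-X", "faulthandler", path])
    return describe_exit(result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_exit.py load_model.py  (both file names are hypothetical)
    print(run_script(sys.argv[1]))
```

If the result is "killed by SIGKILL", the kernel log from `dmesg` or `journalctl -k` may show whether the out-of-memory killer was involved.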

These are all the packages:

accelerate==0.15.0
antlr4-python3-runtime==4.9.3
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1670263926556/work
backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work
backports.functools-lru-cache @ file:///home/conda/feedstock_root/build_artifacts/backports.functools_lru_cache_1618230623929/work
beautifulsoup4==4.11.2
certifi==2022.12.7
charset-normalizer==3.1.0
clip @ git+https://github.com/openai/CLIP.git@a9b1bf5920416aaeaec965c25dd9e8f98c864f16
cmake==3.26.3
contourpy==1.0.7
cycler==0.11.0
debugpy @ file:///home/builder/ci_310/debugpy_1640789504635/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
deepfloyd-if==1.0.1
diffusers==0.16.1
entrypoints @ file:///home/conda/feedstock_root/build_artifacts/entrypoints_1643888246732/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1667317341051/work
filelock==3.12.0
fonttools==4.39.3
fsspec==2023.4.0
ftfy==6.1.1
huggingface-hub==0.14.1
idna==3.4
importlib-metadata==6.6.0
ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1655369107642/work
ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1682709228762/work
ipywidgets==8.0.6
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1669134318875/work
jupyter-client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1654730843242/work
jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1678994169527/work
jupyterlab-widgets==3.0.7
kiwisolver==1.4.4
lit==16.0.2
matplotlib==3.7.1
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work
mypy-extensions==1.0.0
nest-asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1664684991461/work
numpy==1.24.3
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
omegaconf==2.3.0
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1681337016113/work
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1667297516076/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
Pillow==9.5.0
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1682644429438/work
prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1677600924538/work
protobuf==3.20.0
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1642875951954/work
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1681904169130/work
pyparsing==3.0.9
pyre-extensions==0.0.23
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work
PyYAML==6.0
pyzmq @ file:///croot/pyzmq_1682697643292/work
regex==2023.3.23
requests==2.29.0
sentencepiece==0.1.98
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
soupsieve==2.4.1
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
tokenizers==0.13.3
torch==1.13.1
torchvision==0.14.1
tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1648827254365/work
tqdm==4.65.0
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1675110562325/work
transformers==4.25.1
triton==2.0.0.post1
typing-inspect==0.8.0
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1678559861143/work
urllib3==1.26.15
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1673864653149/work
widgetsnbextension==4.0.7
xformers==0.0.16
zipp==3.15.0
Note: you may need to restart the kernel to use updated packages.

Apr 29 '23 16:04 vinaysingh8866

I tried running the colab locally and it throws:

A: torch.Size([77, 4096]), B: torch.Size([4096, 4096]), C: (77, 4096); (lda, ldb, ldc): (c_int(2464), c_int(131072), c_int(2464)); (m, n, k): (c_int(77), c_int(4096), c_int(4096))
cuBLAS API failed with status 15
error detected
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)

File ~/floyd/.conda/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/floyd/.conda/lib/python3.10/site-packages/diffusers/pipelines/deepfloyd_if/pipeline_if.py:324, in IFPipeline.encode_prompt(self, prompt, do_classifier_free_guidance, num_images_per_prompt, device, negative_prompt, prompt_embeds, negative_prompt_embeds, clean_caption)
    317 logger.warning(
    318     "The following part of your input was truncated because CLIP can only handle sequences up to"
    319     f" {max_length} tokens: {removed_text}"
    320 )
    322 attention_mask = text_inputs.attention_mask.to(device)
--> 324 prompt_embeds = self.text_encoder(
    325     text_input_ids.to(device),
    326     attention_mask=attention_mask,
    327 )
    328 prompt_embeds = prompt_embeds[0]
    330 if self.text_encoder is not None:

File ~/floyd/.conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
...
-> 1436 raise Exception('cublasLt ran into an error!')
   1438 torch.cuda.set_device(prev_device)
   1440 return out, Sout

Exception: cublasLt ran into an error!

While running prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)
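For reference, the "status 15" in the first lines of that output can be translated into a name. The sketch below assumes the printed number is a raw cublasStatus_t value from the cuBLAS headers; the lookup helper is made up for illustration. The name says what cuBLAS reported, not why the call was rejected.

```python
# Values of the cublasStatus_t enum as defined in the cuBLAS headers.
CUBLAS_STATUS = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def cublas_status_name(code):
    """Return the enum name for a cuBLAS status code (hypothetical helper)."""
    return CUBLAS_STATUS.get(code, f"unknown status ({code})")


print(cublas_status_name(15))  # CUBLAS_STATUS_NOT_SUPPORTED
```

So the matrix multiply in the text encoder was rejected as an unsupported operation rather than failing because memory ran out.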

Apr 30 '23 02:04 vinaysingh8866