diffusers

RuntimeError: operator torchvision::nms does not exist

Open kadirnar opened this issue 1 year ago • 3 comments

Describe the bug

I get this error when I pass the xformers flag (--enable_xformers_memory_efficient_attention).

  File "/root/projects/vton_train/train_text_to_image_sdxl.py", line 43, in <module>
    from torchvision import transforms
  File "/usr/local/lib/python3.10/dist-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 467, in inner
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/usr/local/lib/python3.10/dist-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist

Reproduction

!accelerate launch train_text_to_image_sdxl.py \
  --pretrained_model_name_or_path="SG161222/RealVisXL_V4.0" \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --dataset_name="lambdalabs/naruto-blip-captions" \
  --enable_xformers_memory_efficient_attention \
  --resolution=512 --center_crop --random_flip \
  --proportion_empty_prompts=0.2 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 --gradient_checkpointing \
  --max_train_steps=10000 \
  --learning_rate=1e-06 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --report_to="wandb" \
  --validation_prompt="a photo of a model wearing" --validation_epochs 5 \
  --checkpointing_steps=5000 \
  --output_dir="sdxl-vton-train" \
  --push_to_hub

Logs

No response

System Info

  • 🤗 Diffusers version: 0.28.0.dev0
  • Platform: Ubuntu 22.04.3 LTS - Linux-5.15.0-105-generic-x86_64-with-glibc2.35
  • Running on a notebook?: No
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.3.0+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.23.0
  • Transformers version: 4.41.0
  • Accelerate version: 0.30.1
  • PEFT version: 0.7.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.3
  • xFormers version: 0.0.26.post1
  • Accelerator: NVIDIA RTX A6000, 49140 MiB VRAM
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@yiyixuxu @sayakpaul @DN6

kadirnar • May 19 '24 09:05

The logs suggest this is an installation bug.

Are you able to import torchvision successfully?

sayakpaul • May 19 '24 09:05

In addition to what Sayak said, I suspect that installing xFormers==0.0.26.post1 "broke" the environment. xFormers pulls in its preferred torch and CUDA-related packages, which can leave the installed torchvision built against a different torch version. @kadirnar, could you try upgrading torchvision?
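
A minimal sketch of how the torch/torchvision pairing could be checked before reinstalling anything. It reads versions from pip metadata instead of importing torchvision, since the import itself is what fails here. The names COMPATIBLE, minor, check_pair and installed_version are illustrative, not part of any library, and the pairing table is an assumption copied from the compatibility table in the torchvision README; it covers only the releases listed.

```python
from importlib import metadata

# Assumption: torch -> torchvision minor-release pairings, copied from the
# compatibility table in the torchvision README. Extend it for other releases.
COMPATIBLE = {"2.0": "0.15", "2.1": "0.16", "2.2": "0.17", "2.3": "0.18", "2.4": "0.19"}


def minor(version: str) -> str:
    """Reduce a version such as '2.3.0+cu121' to its 'major.minor' part."""
    return ".".join(version.split("+")[0].split(".")[:2])


def check_pair(torch_version: str, torchvision_version: str) -> str:
    """Return 'ok', 'mismatch', or 'unknown' for a torch/torchvision pair."""
    expected = COMPATIBLE.get(minor(torch_version))
    if expected is None:
        return "unknown"  # torch release not listed in the table above
    return "ok" if minor(torchvision_version) == expected else "mismatch"


def installed_version(package: str):
    """Read a package version from pip metadata without importing the package."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on the current environment; works even if a package is missing.
torch_v = installed_version("torch")
vision_v = installed_version("torchvision")
if torch_v and vision_v:
    print(f"torch {torch_v}, torchvision {vision_v}: {check_pair(torch_v, vision_v)}")
else:
    print(f"torch: {torch_v}, torchvision: {vision_v} (at least one is not installed)")
```

With the torch 2.3.0+cu121 build listed under System Info, a "mismatch" result would point to a torchvision release other than 0.18.x.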

tolgacangoz • May 19 '24 09:05

@sayakpaul @standardAI ❤️ I reinstalled and resolved this error. Thanks for your help.

I want to run text-to-image fine-tuning on multiple GPUs with the SDXL model, but there is no parameter for this. How can I do that?

kadirnar • May 20 '24 09:05

Closing this as resolved. Feel free to reopen if needed 👍🏽

DN6 • May 21 '24 13:05