ml-stable-diffusion: diffusers==0.16.0 not working with default PyTorch v2.0.0

Hi team, I got an error when executing python -m python_coreml_stable_diffusion.torch2coreml. It seems diffusers==0.16.0 does not work with PyTorch v2.0.0, while diffusers==0.15.1 works. See the following traceback.

    convert_nodes(self.context, self.graph)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 83, in convert_nodes
    raise RuntimeError(
RuntimeError: PyTorch convert function for op 'scaled_dot_product_attention' not implemented.

Apr 27 '23 14:04 axot

What version of diffusers are you using? I think I received that error yesterday when I tried upgrading from diffusers 0.15.1 (or from 0.14.0) to the new 0.16.0. I use torch 2.0.0 without issues.

Apr 27 '23 14:04 jrittvo

Here is the package list, with diffusers==0.16.0:

accelerate==0.18.0
aiohttp==3.8.4
aiosignal==1.3.1
antlr4-python3-runtime==4.9.3
async-timeout==4.0.2
attrs==23.1.0
certifi==2022.12.7
charset-normalizer==3.1.0
coremltools==6.3.0
diffusers==0.16.0
filelock==3.12.0
frozenlist==1.3.3
fsspec==2023.4.0
huggingface-hub==0.14.1
idna==3.4
importlib-metadata==6.6.0
Jinja2==3.1.2
lightning-utilities==0.8.0
MarkupSafe==2.1.2
mpmath==1.3.0
multidict==6.0.4
networkx==3.1
numpy==1.24.3
omegaconf==2.3.0
packaging==23.1
Pillow==9.5.0
protobuf==3.20.3
psutil==5.9.5
pytorch-lightning==2.0.2
PyYAML==6.0
regex==2023.3.23
requests==2.29.0
safetensors==0.3.1
scipy==1.10.1
sympy==1.11.1
tokenizers==0.13.3
torch==2.0.0
torchmetrics==0.11.4
tqdm==4.65.0
transformers==4.28.1
typing_extensions==4.5.0
urllib3==1.26.15
yarl==1.9.2
zipp==3.15.0

Apr 27 '23 15:04 axot

Can you switch to diffusers 0.15.1 and try again? My memory is not perfect, but I think the error I got with 0.16.0 is the same one you have.

Apr 27 '23 15:04 jrittvo

Yes. The job is still running, but after going back to diffusers 0.15.1 the error is gone.

Apr 27 '23 15:04 axot

If your job completes successfully, that seems to mean that diffusers 0.16.0 introduced a software regression that is causing the error for both of us. Perhaps revise the title of this thread and hopefully the issue can get fixed?

Apr 27 '23 15:04 jrittvo

@pcuenca Do you have a recommended resolution/fix on the diffusers side? If not, I will think about a good solution on this side. 🙏

Apr 27 '23 20:04 atiorh

Oh, that's probably because scaled dot-product attention is enabled by default if torch 2 is in use. pipe.unet.set_default_attn_processor() should work. I can test and submit a PR in a few hours.

Edit: The same behaviour (enabling SDPA by default) was already present in 0.15.1 so we might indeed have a regression. I'll take a closer look.
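
A minimal sketch of where such a call could go, assuming a diffusers pipeline object whose unet and vae components expose set_default_attn_processor() (both calls appear in this thread). The helper name use_default_attention is hypothetical and not part of diffusers or ml-stable-diffusion. It only calls the method on components that actually have it.

```python
def use_default_attention(pipe, component_names=("unet", "vae")):
    """Switch the listed pipeline components to their default attention processor.

    Hypothetical helper: relies only on each component exposing
    set_default_attn_processor(), as mentioned in this thread.
    Returns the names of the components that were switched.
    """
    switched = []
    for name in component_names:
        component = getattr(pipe, name, None)
        # Components without the method (or missing entirely) are skipped.
        setter = getattr(component, "set_default_attn_processor", None)
        if callable(setter):
            setter()
            switched.append(name)
    return switched
```

The intended use would be a single call such as use_default_attention(pipe) right after the pipeline is loaded and before any component is traced for conversion.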

Apr 27 '23 21:04 pcuenca

Oh, that's probably because scaled dot-product attention is enabled by default if torch 2 is in use. pipe.unet.set_default_attn_processor() should work. I can test and submit a PR in a few hours.

I'm running into this issue now when trying to convert a diffusers model to mlmodel. Is there somewhere I can place pipe.unet.set_default_attn_processor()?

Error:

  File "/Users/.pyenv/versions/coremlsd/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 6058, in scaled_dot_product_attention
    q, k, v, attn_mask, dropout, is_causal = _get_inputs(context, node, expected=6)
  File "/Users/.pyenv/versions/coremlsd/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 200, in _get_inputs
    raise ValueError(
ValueError: node hidden_states.11 (scaled_dot_product_attention) got 7 input(s), expected [6]

macOS 14

accelerate 0.23.0
antlr4-python3-runtime 4.9.3
attrs 23.1.0
cattrs 23.1.2
certifi 2023.7.22
charset-normalizer 3.3.0
contourpy 1.1.1
coremltools 7.0b2
cycler 0.12.0
diffusers 0.21.4
exceptiongroup 1.1.3
filelock 3.12.4
fonttools 4.43.0
fsspec 2023.9.2
huggingface-hub 0.17.3
idna 3.4
importlib-metadata 6.8.0
iniconfig 2.0.0
inquirerpy 0.3.4
invisible-watermark 0.2.0
Jinja2 3.1.2
joblib 1.3.2
kiwisolver 1.4.5
MarkupSafe 2.1.3
matplotlib 3.8.0
mpmath 1.3.0
networkx 3.1
numpy 1.26.0
omegaconf 2.3.0
opencv-python 4.8.1.78
packaging 23.2
pfzy 0.3.4
Pillow 10.0.1
pip 23.2.1
pluggy 1.3.0
prompt-toolkit 3.0.39
protobuf 3.20.3
psutil 5.9.5
pyaml 23.9.6
pyparsing 3.1.1
pytest 7.4.2
python-dateutil 2.8.2
PyWavelets 1.4.1
PyYAML 6.0.1
regex 2023.10.3
requests 2.31.0
safetensors 0.3.3
scikit-learn 1.3.1
scipy 1.11.3
setuptools 58.1.0
six 1.16.0
sympy 1.12
threadpoolctl 3.2.0
tokenizers 0.13.3
tomli 2.0.1
torch 2.2.0.dev20231002
torchaudio 2.2.0.dev20231002
torchvision 0.17.0.dev20231002
tqdm 4.66.1
transformers 4.29.2
typing_extensions 4.8.0
urllib3 2.0.6
wcwidth 0.2.7
zipp 3.17.0

Oct 05 '23 02:10 rovo79

Hello @rovo79! Conversion works for me.

Would you mind sharing the exact conversion command you used, so we can try to reproduce?
Did you try with the stable version of PyTorch instead of a dev one?

Oct 05 '23 10:10 pcuenca

Update: I could reproduce with PyTorch 2.1.0, which was released yesterday. In the meantime, I recommend you use PyTorch 2.0.1 to convert your model.
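
A small guard along these lines could warn before a long conversion starts. It is a sketch based only on the reports in this thread (2.0.x converts, while 2.1.0 and a 2.2.0 nightly fail with coremltools 7.0b2). The function name and the 2.1 threshold are assumptions, not something either tool checks for you.

```python
import re


def torch_may_break_sdpa_conversion(torch_version):
    """Return True if the (major, minor) of torch_version is 2.1 or newer.

    Assumed threshold, taken from the versions reported in this thread.
    Accepts strings such as "2.0.1" or "2.2.0.dev20231002".
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles both the major and the minor component.
    return (major, minor) >= (2, 1)
```

It could be fed the installed version string (for example torch.__version__) at the top of a conversion script, falling back to one of the workarounds in this thread when it returns True.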

Oct 05 '23 10:10 pcuenca

Another workaround is to add the following line after the pipeline has been loaded:

pipe.vae.set_default_attn_processor()

Oct 05 '23 11:10 pcuenca

Hi @pcuenca !

Here is my conversion command:

python -m python_coreml_stable_diffusion.torch2coreml \
    --model-version stabilityai/stable-diffusion-xl-base-1.0 \
    --convert-unet \
    --convert-text-encoder \
    --convert-vae-decoder \
    --convert-safety-checker \
    --quantize-nbits 6 \
    --attention-implementation SPLIT_EINSUM \
    --compute-unit ALL \
    --bundle-resources-for-swift-cli \
    --check-output-correctness \
    -o models/split_einsum/stable-diffusion-xl-base-1-0

I was initially running on the nightly build, but have since dropped back to the latest stable release. I am on a coremltools beta, though:

torch 2.1.0
torchaudio 2.1.0
torchvision 0.16.0
coremltools 7.0b2

sw_vers
ProductName: macOS
ProductVersion: 14.0
BuildVersion: 23A344

Oct 05 '23 11:10 rovo79

I think the pipe.vae.set_default_attn_processor() worked!

I added it to torch2coreml.py as:

    class VAEDecoder(nn.Module):
        """ Wrapper nn.Module wrapper for pipe.decode() method
        """

        def __init__(self):
            super().__init__()
            self.post_quant_conv = pipe.vae.post_quant_conv.to(dtype=torch.float32)
            self.decoder = pipe.vae.decoder.to(dtype=torch.float32)
            pipe.vae.set_default_attn_processor()

PSNR changed by -157 dB! That doesn't sound good, but maybe it's no big deal.

INFO:__main__:vae_decoder baseline PyTorch to baseline CoreML: PSNR changed by -157.3 dB (198.1 -> 40.7)
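
For context on reading that log line: PSNR is commonly defined as 20 * log10(peak / RMSE), so a higher value means the two outputs are closer. The sketch below is a generic pure-Python version, not necessarily the exact formula torch2coreml.py uses. The function name and the choice of peak (the largest absolute reference value) are assumptions.

```python
import math


def psnr_db(reference, test):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical outputs
    peak = max(abs(r) for r in reference)
    if peak == 0.0:
        return -math.inf  # all-zero reference with a non-zero error
    return 20.0 * math.log10(peak / math.sqrt(mse))
```

Under this definition, 40.7 dB corresponds to an RMS error of roughly 1% of the peak value. The 198.1 dB baseline is presumably the PyTorch output compared against itself, which would explain why the reported drop looks so large.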

Oct 05 '23 11:10 rovo79

Another workaround is to add the following line after the pipeline has been loaded:
pipe.vae.set_default_attn_processor()

This worked for me. I encountered the same problem as @rovo79 with PyTorch 2.1.0, and adding this line solved the issue.

➜  ml-stable-diffusion git:(main) python -m python_coreml_stable_diffusion.torch2coreml --custom-vae-version madebyollin/sdxl-vae-fp16-fix --convert-unet --convert-vae-decoder --convert-text-encoder --xl-version --model-version stabilityai/stable-diffusion-xl-base-1.0 --refiner-version stabilityai/stable-diffusion-xl-refiner-1.0 --bundle-resources-for-swift-cli --attention-implementation ORIGINAL -o output
scikit-learn version 1.3.1 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Torch version 2.1.0 has not been tested with coremltools. You may run into unexpected errors. Torch 2.0.0 is the most recent version that has been tested.
INFO:__main__:Initializing DiffusionPipeline with stabilityai/stable-diffusion-xl-base-1.0..
(…)xl-vae-fp16-fix/resolve/main/config.json: 100%|██████████████████████████████████████████| 631/631 [00:00<00:00, 263kB/s]
diffusion_pytorch_model.safetensors: 100%|███████████████████████████████████████████████| 335M/335M [00:04<00:00, 69.6MB/s]
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 18.89it/s]
INFO:__main__:Done. Pipeline in effect: StableDiffusionXLPipeline
INFO:__main__:Attention implementation in effect: AttentionImplementations.ORIGINAL
INFO:__main__:Converting vae_decoder
/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/diffusers/models/resnet.py:139: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert hidden_states.shape[1] == self.channels
/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/diffusers/models/resnet.py:152: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if hidden_states.shape[0] >= 64:
INFO:__main__:Converting vae_decoder to CoreML..
Converting PyTorch Frontend ==> MIL Ops:  22%|████████▉                                | 80/369 [00:00<00:00, 6320.53 ops/s]
Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/lixiangyi/Projects/supergen/ml-stable-diffusion/python_coreml_stable_diffusion/torch2coreml.py", line 1524, in <module>
    main(args)
  File "/Users/lixiangyi/Projects/supergen/ml-stable-diffusion/python_coreml_stable_diffusion/torch2coreml.py", line 1319, in main
    convert_vae_decoder(pipe, args)
  File "/Users/lixiangyi/Projects/supergen/ml-stable-diffusion/python_coreml_stable_diffusion/torch2coreml.py", line 519, in convert_vae_decoder
    coreml_vae_decoder, out_path = _convert_to_coreml(
  File "/Users/lixiangyi/Projects/supergen/ml-stable-diffusion/python_coreml_stable_diffusion/torch2coreml.py", line 124, in _convert_to_coreml
    coreml_model = ct.convert(
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 551, in convert
    mlmodel = mil_convert(
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 286, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 75, in load
    return _perform_torch_convert(converter, debug)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 114, in _perform_torch_convert
    prog = converter.convert()
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 484, in convert
    convert_nodes(self.context, self.graph)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 93, in convert_nodes
    add_op(context, node)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 6219, in scaled_dot_product_attention
    q, k, v, attn_mask, dropout, is_causal = _get_inputs(context, node, expected=6)
  File "/opt/homebrew/anaconda3/envs/apple_sd/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 204, in _get_inputs
    raise ValueError(
ValueError: node hidden_states.11 (scaled_dot_product_attention) got 7 input(s), expected [6]
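
The "got 7 input(s), expected [6]" message is consistent with newer PyTorch releases passing an extra optional scale argument to scaled_dot_product_attention, which a converter written against the six-argument form rejects. The sketch below shows one arity-tolerant way to unpack such inputs. It is illustrative only: unpack_sdpa_inputs is a hypothetical name, and this is not the coremltools implementation.

```python
def unpack_sdpa_inputs(inputs):
    """Unpack 6 or 7 positional inputs of a traced attention node.

    Returns (q, k, v, attn_mask, dropout, is_causal, scale), with scale set
    to None when only six inputs are present.
    """
    if len(inputs) not in (6, 7):
        raise ValueError(f"got {len(inputs)} input(s), expected 6 or 7")
    q, k, v, attn_mask, dropout, is_causal = inputs[:6]
    # The seventh input, when present, is treated as the optional scale.
    scale = inputs[6] if len(inputs) == 7 else None
    return q, k, v, attn_mask, dropout, is_causal, scale
```

Accepting both arities keeps older traces working while tolerating the newer signature.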

Oct 22 '23 09:10 xdotli

There is a PR that I think fixes this issue here: https://github.com/apple/coremltools/pull/2021

Oct 22 '23 16:10 jrittvo