If you want to use `xformers` with a torch version for which the official PyPI source does not provide prebuilt binary wheels, you could download the...
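A minimal sketch of that workflow, assuming you fetch a wheel manually (the wheel filename below is hypothetical; substitute the one matching your Python, torch, and CUDA versions), or build from source against your installed torch:

```shell
# Hypothetical sketch: install a prebuilt xformers wheel downloaded manually
# (e.g. from the project's GitHub releases or the PyTorch wheel index),
# since PyPI may not carry one for your exact torch version.
# The filename is an assumption -- pick the wheel matching your setup.
pip3 install ./xformers-0.0.23-cp310-cp310-manylinux2014_x86_64.whl

# Alternatively, build xformers from source against the torch you already have
# (this is the install path the xformers README documents):
pip3 install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```

Either way, the key point is that the wheel (or source build) must match the torch build it links against, otherwise the C extension will fail to load.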
And `stable-fast` has no binary dependency on `xformers`, so the failure to load the C extension must be caused by something else. In any case, `xformers` is just an optional requirement and...
@arnavdantuluri That would be great! I only have one GPU, so I haven't even considered tensor parallelism. And writing FX passes is more complicated than TorchScript, so I haven't...
@arnavdantuluri Aha, in fact, as you can see, we have a Discord server here: https://discord.gg/kQFvfzM4SJ
@jkrauss82 Sorry, FP8 kernels aren't implemented and I guess I lack the time to support them now.
@jkrauss82 I have created a new project that supports FP8 inference with diffusers. However, it has not been open-sourced yet. I hope it can be made public soon...
@Nucleon729 Try using the following command to install (note the `.` after `-e`, which points the editable install at the current directory): ```shell pip3 install -e . --no-build-isolation -v --no-use-pep517 --debug ```
> > Is it planned? > > Currently getting this error when trying to run ComfyUI in fp8 (flags `--fp8_e4m3fn-text-enc --fp8_e4m3fn-unet`): > > ``` > > RuntimeError: "addmm_cuda" not implemented...
This shouldn't happen. What's your script?
When I run `python3 examples/optimize_lcm_lora.py`, I still see a significant speedup. So I don't know what's wrong.