
Do you have the model in 16-bit?

Open Khushboodholi opened this issue 2 years ago • 10 comments

I am looking to get the models in 16-bit; currently I see only 32-bit versions.

Khushboodholi avatar Jan 23 '23 19:01 Khushboodholi

Only the most recent Intel CPUs support bfloat16.

RedAndr avatar Jan 26 '23 21:01 RedAndr

FP32 versus FP16 - not BFLOAT16 ;-)

brmarkus avatar Jan 27 '23 05:01 brmarkus

OpenVINO doesn't have FP16 at all for CPU. So, even if a model is in FP16, it will be calculated in FP32 anyway. Frankly, I see no point in having FP16 models in that case.
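As a side note, you can query what a given device actually supports; a minimal sketch against the OpenVINO 2022.x Python API (the exact capability strings vary by hardware):

```python
from openvino.runtime import Core

core = Core()
# On CPU this typically lists FP32, BF16, INT8, etc.; FP16 execution is absent.
print(core.get_property("CPU", "OPTIMIZATION_CAPABILITIES"))
```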

RedAndr avatar Jan 27 '23 19:01 RedAndr

A model in FP16 can be used with the OpenVINO CPU plugin as well as with other plugins/devices (like a Vision Processing Unit, VPU). There can be other reasons to prefer FP16 over FP32 as well.

brmarkus avatar Jan 30 '23 06:01 brmarkus

We can get meaningful acceleration on dGPU and iGPU if we use FP16. Where is the script for converting the model to IR? I can help with compiling and testing that too.
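For reference, a minimal sketch of an FP16 conversion with the Model Optimizer Python API (assuming an already-exported ONNX model; the paths are placeholders, and `compress_to_fp16` requires a recent OpenVINO release):

```python
# Sketch: convert an ONNX export to OpenVINO IR with FP16-compressed weights.
# "unet.onnx" / "unet_fp16.xml" are placeholder paths.
from openvino.tools import mo
from openvino.runtime import serialize

ov_model = mo.convert_model("unet.onnx", compress_to_fp16=True)
serialize(ov_model, "unet_fp16.xml")  # also writes unet_fp16.bin next to it
```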

raymondlo84 avatar Jan 31 '23 23:01 raymondlo84

Can you load the FP32 model and use FP16 for calculations? No problem.

RedAndr avatar Feb 02 '23 05:02 RedAndr

https://huggingface.co/raymondlo84/stable-diffusion-v1-4-openvino-fp16

We have created this for the community. We are getting a significant speed-up on the A770m (~1.8 it/s -> ~6.6 it/s); it's now half the model size and uses much less VRAM.

You can try this without any code changes. But if you want to use the GPU, you have to set device = "GPU" (or "GPU.1" if you have multiple GPUs, iGPU + dGPU, like my setup) in "stable_diffusion_engine.py":

```python
class StableDiffusionEngine:
    def __init__(
        self,
        scheduler,
        model="bes-dev/stable-diffusion-v1-4-openvino",
        tokenizer="openai/clip-vit-large-patch14",
        device="GPU"
    ):
```
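If you are not sure which device names exist on your machine (e.g. whether the dGPU is "GPU.0" or "GPU.1"), a quick check:

```python
from openvino.runtime import Core

# Prints e.g. ['CPU', 'GPU.0', 'GPU.1'] on an iGPU + dGPU machine.
print(Core().available_devices)
```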

```
python demo.py --prompt "tree house" --model raymondlo84/stable-diffusion-v1-4-openvino-fp16
```

[images: example outputs]

We also have a notebook that shows how we convert, optimize, and run these models with OpenVINO. Check it out: https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/225-stable-diffusion-text-to-image

And a new pull request is coming to enable image-to-image too: https://github.com/openvinotoolkit/openvino_notebooks/pull/805

Special thanks and credit: Ekaterina

Cheers

raymondlo84 avatar Feb 03 '23 18:02 raymondlo84

> Can you load the FP32 model and use FP16 for calculations? No problem.

In the 2023.0 version, yes. But having FP16 also reduces the model size significantly.
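A sketch of what that looks like in code, assuming the INFERENCE_PRECISION_HINT property (the model path is a placeholder):

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("unet_fp32.xml")  # placeholder FP32 IR
# Ask the plugin to execute in f16 even though the weights are stored in FP32.
compiled = core.compile_model(model, "GPU", {"INFERENCE_PRECISION_HINT": "f16"})
```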

raymondlo84 avatar Feb 03 '23 18:02 raymondlo84

https://huggingface.co/bes-dev/stable-diffusion-v1-4-openvino/discussions/4 Made a pull request to the main repo, and now it will use FP16. Hope I didn't break anything :)

raymondlo84 avatar Feb 03 '23 20:02 raymondlo84

> Can you load the FP32 model and use FP16 for calculations? No problem.

> In the 2023.0 version, yes. But having FP16 also reduces the model size significantly.

Yep, exactly twice ;) Say, from 4 GB to 2 GB, which is not a big deal, at least to me.

Just worried that FP16 usage would lead to precision loss. Although frankly, I couldn't find much of a difference between FP16 and FP32 in my experiments, which looks odd to me. It seems the initial SD model was generated with FP16 already.
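One way to quantify this would be to run both precisions on identical inputs and compare outputs; a rough sketch (the model paths and the text-encoder input shape are my assumptions):

```python
import numpy as np
from openvino.runtime import Core

core = Core()
enc32 = core.compile_model("text_encoder_fp32.xml", "CPU")  # placeholder paths
enc16 = core.compile_model("text_encoder_fp16.xml", "CPU")

# Feed the same random token ids through both (CLIP vocab 49408, length 77).
tokens = np.random.randint(0, 49408, size=(1, 77)).astype(np.int32)
out32 = enc32([tokens])[enc32.output(0)]
out16 = enc16([tokens])[enc16.output(0)]
print("max abs diff:", np.abs(out32 - out16).max())
```

On CPU the FP16 weights get up-converted anyway, so any difference there would reflect weight rounding rather than f16 execution.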

RedAndr avatar Feb 03 '23 22:02 RedAndr