Add INT8 Stable Diffusion through Optimum
8-bit quantization improves inference performance. This PR adds INT8 quantization for Stable Diffusion through the Optimum-Intel quantization API, built on top of Intel Neural Compressor. The sample code is implemented in Optimum-Intel.
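For readers unfamiliar with why INT8 helps, the sketch below shows the core arithmetic behind post-training quantization: floats are mapped to signed 8-bit codes plus a scale factor, so weights shrink 4x versus FP32 and integer kernels can be used at inference time. The function names and the symmetric per-tensor scheme here are illustrative only, not the Optimum-Intel or Neural Compressor API.

```python
def quantize_int8(values):
    """Map a list of floats to signed 8-bit codes plus a scale factor
    (symmetric per-tensor quantization, illustrative only)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 1.0
    scale = max_abs / 127.0  # 127 is the largest magnitude an int8 holds
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 codes and scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most one quantization step (the scale), which is why INT8 usually preserves accuracy well for inference while cutting memory and compute cost.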
@patrickvonplaten please review this one. Thanks.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
cc @echarlaix @michaelbenayoun
Discussed with @echarlaix offline; it seems the neural-compressor + optimum integration will refactor its API quite soon. Should we hold off on the promotion until then?
Hi @hshen14,
Let's wait for neural-compressor and optimum-intel refactorization before increasing visibility !
Thanks @anton-l @echarlaix. Sure, let's do that.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Optimum-Intel is currently being upgraded to the INC v2.0 API. We will revisit this PR after the upgrade is done.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
bump to keep issue open
@echarlaix , do you think it's good time to revisit this? Thanks.
Sure, I will work on it and open a PR on diffusers once everything is finalized, does that work for you @hshen14 ?
Great job!
That would work perfectly! Thanks @echarlaix
Is INT8 quantization still in the works? I would find it extremely helpful on some of the devices I'm trying to use, especially when running on CPU.
cc @yiyixuxu @sayakpaul @DN6 here
I think the better person to tag here would be @echarlaix.