Explore NVIDIA/TransformerEngine for speed/efficiency

Open 0xdevalias opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe.

A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like

  • https://github.com/NVIDIA/TransformerEngine
    • A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
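To put the "lower memory utilization" point in rough numbers, here is a minimal back-of-the-envelope sketch. It is not TransformerEngine code: the helper `weight_memory_mib` and the parameter count are hypothetical, and it only counts raw weight storage, ignoring activations, FP8 scaling factors, and any layers that would stay in higher precision.

```python
# Bytes needed to store one value at each precision (FP8 uses one byte).
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_memory_mib(num_params: int, precision: str) -> float:
    """Return the memory needed to store `num_params` weights, in MiB."""
    return num_params * BYTES_PER_VALUE[precision] / (1024 ** 2)


# Hypothetical parameter count, chosen only for illustration.
NUM_PARAMS = 1_000_000_000

for precision in ("fp32", "fp16", "fp8"):
    print(f"{precision}: {weight_memory_mib(NUM_PARAMS, precision):.0f} MiB")
```

Under these assumptions, FP8 weight storage is half that of FP16 and a quarter of FP32. Whether that carries over to end-to-end savings in this repo would still need measuring.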

Describe alternatives you've considered

  • That this lib won't be useful in this repo, or that existing optimisations already do things as well as it could (or better).

Additional context

  • Crossposted on:
    • https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/4721
  • Other potential speed gain/improvement issues:
    • https://github.com/huggingface/diffusers/issues/1094
    • https://github.com/huggingface/diffusers/issues/1212
      • https://github.com/huggingface/diffusers/issues/1212#issuecomment-1313051895

0xdevalias • Nov 14 '22 23:11

Hey @0xdevalias,

Thanks a lot for opening the issue! Just to better understand, what benefits does https://github.com/NVIDIA/TransformerEngine give, besides 8-bit quantization, that we don't currently have with xformers (https://github.com/facebookresearch/xformers) or other optimization libraries? :-)

patrickvonplaten • Nov 18 '22 12:11

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] • Dec 15 '22 15:12