[Feature Request]: Make use of Optimum / BetterTransformer optimisations
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do ?
- https://huggingface.co/docs/optimum/bettertransformer/overview
  - Optimum provides an integration with BetterTransformer, a stable API from PyTorch to benefit from interesting speedups on CPU & GPU through sparsity and fused kernels.
  - Since its 1.13 version, PyTorch released the stable version of BetterTransformer in its library. You can benefit from interesting speedup on most consumer-type devices, including CPUs and older and newer versions of NVIDIA GPUs. You can now use this feature in 🤗 Optimum together with Transformers and use it for major models in the Hugging Face ecosystem.
- https://huggingface.co/docs/optimum/bettertransformer/tutorials/contribute
  - Adding BetterTransformer support for new architectures
- https://twitter.com/huggingface/status/1594783600855158805
  - A collaboration with @PyTorch to make transformer-based models faster using optimum library! Up to 4.5x speedup for text, vision and audio models using a one liner! 🔥
- https://twitter.com/PyTorch/status/1594766050851102720
  - Better Transformer for #PyTorch out of the box performance on 🤗 @huggingface models now available! Want to know more about the collaboration? 👀 the blog
  - https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2
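For reference, the "one liner" mentioned in the tweets above is Optimum's `BetterTransformer.transform()` call. A minimal sketch, assuming `optimum` and `transformers` are installed and using `bert-base-uncased` purely as a placeholder model (not something this project necessarily loads):

```python
from transformers import AutoModel, AutoTokenizer
from optimum.bettertransformer import BetterTransformer

model_id = "bert-base-uncased"  # placeholder; any BetterTransformer-supported architecture works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# The "one liner": swap supported attention/encoder layers for fused BetterTransformer kernels
model = BetterTransformer.transform(model, keep_original_model=False)

# Quick smoke test that the converted model still runs
inputs = tokenizer("Quick smoke test after the conversion.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```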
Proposed workflow
- Explore what would be required to integrate with Optimum / BetterTransformer
- Make those integrations (see the sketch after this list for a rough idea of the conversion step)
- ???
- Profit!
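A hedged sketch of what the integration step could look like: a small helper that attempts the conversion and falls back to the unmodified model when the architecture has no BetterTransformer mapping yet. `maybe_apply_bettertransformer` is a hypothetical name invented for this sketch, not an existing function in Optimum or in this codebase, and the exact exception types raised for unsupported architectures should be checked against the Optimum source.

```python
import logging

from optimum.bettertransformer import BetterTransformer

logger = logging.getLogger(__name__)


def maybe_apply_bettertransformer(model):
    """Try to convert `model`; return it unchanged if the architecture is unsupported."""
    try:
        # Replaces supported attention/encoder layers with fused BetterTransformer kernels
        return BetterTransformer.transform(model, keep_original_model=False)
    except (NotImplementedError, ValueError) as err:
        # Optimum raises when there is no BetterTransformer mapping for the model type;
        # keep the original model so the feature degrades gracefully
        logger.warning("BetterTransformer not applied: %s", err)
        return model
```

Presumably this would be gated behind an opt-in setting, so models without BetterTransformer support are simply left untouched.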
Additional information
See also:
- https://github.com/huggingface/diffusers/issues/1389
- https://github.com/huggingface/optimum/issues/512