[Feature Request]: Make use of Optimum / BetterTransformer optimisations
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do ?
- https://huggingface.co/docs/optimum/bettertransformer/overview
  - Optimum provides an integration with BetterTransformer, a stable API from PyTorch to benefit from interesting speedups on CPU & GPU through sparsity and fused kernels.
  - Since its 1.13 version, PyTorch released the stable version of BetterTransformer in its library. You can benefit from interesting speedup on most consumer-type devices, including CPUs and older and newer versions of NVIDIA GPUs. You can now use this feature in 🤗 Optimum together with Transformers and use it for major models in the Hugging Face ecosystem.
- https://huggingface.co/docs/optimum/bettertransformer/tutorials/contribute
  - Adding BetterTransformer support for new architectures
- https://twitter.com/huggingface/status/1594783600855158805
  - A collaboration with @PyTorch to make transformer-based models faster using optimum library! Up to 4.5x speedup for text, vision and audio models using a one liner! 🔥
- https://twitter.com/PyTorch/status/1594766050851102720
  - Better Transformer for #PyTorch out of the box performance on 🤗 @huggingface models now available! Want to know more about the collaboration? 👀 the blog
  - https://medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2
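For reference, the "one liner" mentioned in the tweets above is Optimum's `BetterTransformer.transform()` call. A minimal sketch, assuming `optimum` and `transformers` are installed and using `bert-base-uncased` purely as a placeholder model (not something this project necessarily loads):

```python
from transformers import AutoModel, AutoTokenizer
from optimum.bettertransformer import BetterTransformer

model_id = "bert-base-uncased"  # placeholder; any BetterTransformer-supported architecture works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# The "one liner": swap supported attention/encoder layers for fused BetterTransformer kernels
model = BetterTransformer.transform(model, keep_original_model=False)

# Quick smoke test that the converted model still runs
inputs = tokenizer("Quick smoke test after the conversion.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```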
Proposed workflow
- Explore what would be required to integrate with Optimum / BetterTransformer
- Make those integrations (see the sketch after this list for a rough idea of the conversion step)
- ???
- Profit!
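A hedged sketch of what the integration step could look like: a small helper that attempts the conversion and falls back to the unmodified model when the architecture has no BetterTransformer mapping yet. `maybe_apply_bettertransformer` is a hypothetical name invented for this sketch, not an existing function in Optimum or in this codebase, and the exact exception types raised for unsupported architectures should be checked against the Optimum source.

```python
import logging

from optimum.bettertransformer import BetterTransformer

logger = logging.getLogger(__name__)


def maybe_apply_bettertransformer(model):
    """Try to convert `model`; return it unchanged if the architecture is unsupported."""
    try:
        # Replaces supported attention/encoder layers with fused BetterTransformer kernels
        return BetterTransformer.transform(model, keep_original_model=False)
    except (NotImplementedError, ValueError) as err:
        # Optimum raises when there is no BetterTransformer mapping for the model type;
        # keep the original model so the feature degrades gracefully
        logger.warning("BetterTransformer not applied: %s", err)
        return model
```

Presumably this would be gated behind an opt-in setting, so models without BetterTransformer support are simply left untouched.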
Additional information
See also:
- https://github.com/huggingface/diffusers/issues/1389
- https://github.com/huggingface/optimum/issues/512