optimum-habana
Optimized inference of XGLM model on HPU
What does this PR do?
Optimizes inference of the XGLM model on HPU.
Before submitting
- [x] Did you make sure to update the documentation with your changes?
- [x] Did you write any new necessary tests?