
[Feature] [Optimum] [Intel] [OpenVINO] Add OpenVINO backend support through Optimum-Intel

Open tjtanaa opened this issue 1 year ago • 1 comments

Description

This PR integrates the OpenVINO backend into Infinity's Optimum Embedder class through the optimum-intel library.

Related Issue

If applicable, link the issue this PR addresses.

Types of Change

  • [X] New feature
  • [ ] Documentation update

Checklist

  • [X] I have read the CONTRIBUTING guidelines.
  • [X] My code follows the code style of this project.
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.
  • [X] My changes generate no new warnings.
  • [ ] I have updated the documentation accordingly.

Additional Notes

Several inference precisions can be specified in libs/infinity_emb/infinity_emb/transformer/utils_optimum.py:

"ov_config":{
    "INFERENCE_PRECISION_HINT": "bf16" # it supports fp32, fp16 and bf16
}

The inference precision hint is hardcoded to bf16 because it gave the fastest inference speed of the three options.
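For reviewers who want to try the other precisions, here is a minimal sketch of how the hint could be passed through optimum-intel. It is not the code in this PR: `build_ov_config` and `load_openvino_model` are hypothetical helper names, and the sketch assumes that optimum-intel is installed with its OpenVINO extras and that `OVModelForFeatureExtraction.from_pretrained` forwards `ov_config` to the OpenVINO runtime.

```python
from typing import Any, Dict

# Precisions listed above for INFERENCE_PRECISION_HINT.
SUPPORTED_PRECISIONS = ("fp32", "fp16", "bf16")


def build_ov_config(precision: str = "bf16") -> Dict[str, str]:
    """Hypothetical helper: build an ov_config dict for the given precision."""
    normalized = precision.lower()
    if normalized not in SUPPORTED_PRECISIONS:
        raise ValueError(
            f"Unsupported precision {precision!r}; expected one of {SUPPORTED_PRECISIONS}"
        )
    return {"INFERENCE_PRECISION_HINT": normalized}


def load_openvino_model(model_id: str, precision: str = "bf16") -> Any:
    """Hypothetical helper: load a feature-extraction model on the OpenVINO backend.

    Assumes optimum-intel (with OpenVINO extras) is installed; the import is
    deferred so that build_ov_config stays usable without it.
    """
    from optimum.intel import OVModelForFeatureExtraction

    return OVModelForFeatureExtraction.from_pretrained(
        model_id,
        export=True,  # convert the Hugging Face checkpoint to OpenVINO IR on load
        ov_config=build_ov_config(precision),
    )
```

Calling `build_ov_config()` with no argument reproduces the dictionary shown above, so the default behaviour stays on bf16 while fp32 and fp16 remain available for comparison.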

We also ran an MTEB evaluation (bankclassification dataset) on the INT4 weight-only quantized model with BF16 inference precision; the drop in accuracy is just 0.71%.

Given the speed/accuracy tradeoff and the ease of use, we think that settling on a single effective configuration could improve the user experience of infinity_emb.

License

By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.

tjtanaa avatar Nov 07 '24 08:11 tjtanaa


Codecov Report

Attention: Patch coverage is 44.15584% with 43 lines in your changes missing coverage. Please review.

Project coverage is 78.20%. Comparing base (c9a8404) to head (295e840). Report is 8 commits behind head on main.

Files with missing lines                                  Patch %   Lines
...nity_emb/infinity_emb/transformer/utils_optimum.py     42.30%    30 Missing :warning:
...y_emb/infinity_emb/transformer/embedder/optimum.py     43.47%    13 Missing :warning:


Additional details and impacted files
@@            Coverage Diff             @@
##             main     #454      +/-   ##
==========================================
- Coverage   79.18%   78.20%   -0.99%     
==========================================
  Files          41       41              
  Lines        3248     3308      +60     
==========================================
+ Hits         2572     2587      +15     
- Misses        676      721      +45     


codecov-commenter avatar Nov 12 '24 00:11 codecov-commenter