infinity
v0 add autoquant
This pull request adds support for a new `autoquant` data type to the infinity_emb library. It extends the `Dtype` enum, updates the quantization logic to handle the new type, documents it in the CLI reference, and adds unit tests for `autoquant` quantization.
New Features:
- Added `autoquant` data type to the `Dtype` enum in `libs/infinity_emb/infinity_emb/primitives.py`.
- Updated quantization logic to handle `autoquant` in `libs/infinity_emb/infinity_emb/transformer/quantization/interface.py` and `libs/infinity_emb/infinity_emb/transformer/quantization/quant.py` [1] [2].
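As a rough illustration of what adding an enum member and handling it in the quantization logic can look like, here is a minimal sketch. The member list, the `apply_quantization` function, and the handler table are all hypothetical stand-ins, not the code in `primitives.py` or `interface.py`.

```python
import enum
from typing import Any, Callable, Dict


class Dtype(str, enum.Enum):
    """Illustrative subset only; the real enum lives in primitives.py."""

    float32 = "float32"
    int8 = "int8"
    autoquant = "autoquant"  # the member this PR describes adding


# Hypothetical handler type: takes a model, returns a (possibly quantized) model.
Handler = Callable[[Any], Any]


def apply_quantization(model: Any, dtype: str, handlers: Dict[Dtype, Handler]) -> Any:
    """Look up the handler registered for `dtype` and apply it to `model`."""
    key = Dtype(dtype)  # raises ValueError for names that are not enum members
    if key not in handlers:
        raise NotImplementedError(f"no quantization handler registered for {key.value}")
    return handlers[key](model)


# Stand-in handlers; a real autoquant handler would call into torchao instead.
handlers = {
    Dtype.float32: lambda model: model,
    Dtype.autoquant: lambda model: ("autoquant", model),
}
result = apply_quantization("my-model", "autoquant", handlers)
```

Keeping the dispatch keyed on the enum means an unknown string fails early at `Dtype(dtype)`, while a known but unhandled member fails with a separate, more specific error.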
Documentation Updates:
- Updated CLI documentation to include `autoquant` in `docs/docs/cli_v2.md`.
Codebase Improvements:
- Modified `Makefile` to use `poetry run` for generating OpenAPI and CLI v2 documentation in `libs/infinity_emb/Makefile` [1] [2].
Dependency Updates:
- Added `torchao` as an optional dependency in `libs/infinity_emb/pyproject.toml` [1] [2].
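Because `torchao` is optional, code that uses it usually has to cope with the package being absent. The sketch below shows one common pattern for that using only the standard library. The helper names `module_available` and `require_module` are hypothetical and are not taken from infinity_emb.

```python
import importlib
import importlib.util
from types import ModuleType


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require_module(name: str) -> ModuleType:
    """Import an optional module, or fail with an actionable message."""
    if not module_available(name):
        raise ImportError(
            f"'{name}' is an optional dependency and is not installed; "
            f"install it to use features that depend on it"
        )
    return importlib.import_module(name)


# Check for the optional package before taking the autoquant code path.
has_torchao = module_available("torchao")
```

Checking with `find_spec` avoids paying the import cost, so the check is cheap enough to run at startup or in a test skip condition.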
Testing Enhancements:
- Added unit tests for `autoquant` quantization in `libs/infinity_emb/tests/unit_test/transformer/quantization/test_interface.py`.
:warning: Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
Attention: Patch coverage is 21.42857% with 11 lines in your changes missing coverage. Please review.
Project coverage is 73.24%. Comparing base (0f1b786) to head (55a8b0e).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...infinity_emb/transformer/quantization/interface.py | 14.28% | 6 Missing :warning: |
| ...emb/infinity_emb/transformer/quantization/quant.py | 0.00% | 5 Missing :warning: |
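The patch figures above can be cross-checked with a little arithmetic: a coverage percentage and a missing-line count together determine the total line count. This is a sketch of that check, not something Codecov itself outputs; the function name `total_lines` is made up for the example.

```python
def total_lines(missing: int, coverage_pct: float) -> int:
    """Recover the line count from missing lines and a coverage percentage."""
    return round(missing / (1 - coverage_pct / 100))


patch_total = total_lines(11, 21.42857)   # whole patch
interface_total = total_lines(6, 14.28)   # interface.py row
quant_total = total_lines(5, 0.0)         # quant.py row

patch_covered = patch_total - 11                  # covered lines in the patch
listed_total = interface_total + quant_total      # lines in the two listed files
unlisted_lines = patch_total - listed_total       # lines in files not in the table
print(patch_total, patch_covered, listed_total, unlisted_lines)
```

Under these numbers the patch has 14 changed lines, 3 of them covered. The two listed files account for 12 lines, so the remaining 2 lines would sit in files with no missing coverage, which the table omits.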
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:exclamation: There is a different number of reports uploaded between BASE (0f1b786) and HEAD (55a8b0e).
HEAD has 1 upload less than BASE
| Flag | BASE (0f1b786) | HEAD (55a8b0e) |
|---|---|---|
| | 2 | 1 |
Additional details and impacted files
Coverage Diff
| | main | #402 | +/- |
|---|---|---|---|
| Coverage | 79.01% | 73.24% | -5.77% |
| Files | 40 | 40 | |
| Lines | 3173 | 3184 | +11 |
| Hits | 2507 | 2332 | -175 |
| Misses | 666 | 852 | +186 |
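The coverage figures in this report follow from hits divided by lines, which can be confirmed with a short check. The dictionaries below simply restate the numbers from the report; the `coverage` helper is an illustrative name.

```python
base = {"hits": 2507, "misses": 666, "lines": 3173}
head = {"hits": 2332, "misses": 852, "lines": 3184}


def coverage(report: dict) -> float:
    """Coverage in percent: covered lines over total lines."""
    return 100 * report["hits"] / report["lines"]


# Sanity check: every line is either hit or missed.
for report in (base, head):
    assert report["hits"] + report["misses"] == report["lines"]

delta = coverage(head) - coverage(base)
print(round(coverage(base), 2), round(coverage(head), 2), round(delta, 2))
```

Hits fall by 175 while only 11 lines were added, so most of the drop cannot come from the new code alone. That may reflect the missing upload on HEAD that the report warns about, though the report itself does not say so.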