
fix: implement lazy imports for BetterTransformer

Open aryasaatvik opened this issue 5 months ago • 3 comments

Related Issue

Fixes import errors when using the --no-bettertransformer flag with transformers >= 4.49.

Summary

This PR implements lazy imports for BetterTransformer to prevent import errors when it's disabled or when using incompatible transformers versions. BetterTransformer is deprecated in optimum and requires transformers < 4.49.

Changes:

  • Replaced eager imports with lazy loading pattern
  • Added _import_bettertransformer() function for on-demand imports
  • Wrapped imports in try/except to handle version conflicts gracefully
  • Returns early when --no-bettertransformer is used, avoiding unnecessary imports
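A minimal sketch of the lazy-import pattern described above (the `_import_bettertransformer` name comes from the PR description; the module cache, error message, and `to_bettertransformer` wrapper are illustrative, not the PR's exact code):

```python
# Hypothetical sketch of the lazy-import pattern; only the
# _import_bettertransformer name is taken from the PR description.
_bettertransformer = None  # cached class, imported only on first use


def _import_bettertransformer():
    """Import optimum's BetterTransformer on demand, failing gracefully."""
    global _bettertransformer
    if _bettertransformer is None:
        try:
            from optimum.bettertransformer import BetterTransformer
        except ImportError as exc:
            # Raised only when the feature is actually requested, so a
            # version conflict never breaks startup.
            raise ImportError(
                "BetterTransformer requires `optimum` and transformers < 4.49"
            ) from exc
        _bettertransformer = BetterTransformer
    return _bettertransformer


def to_bettertransformer(model, enabled: bool = True):
    """Apply BetterTransformer if enabled; otherwise return the model untouched."""
    if not enabled:
        # Early return: with --no-bettertransformer the import never runs,
        # so incompatible transformers versions cause no error.
        return model
    bettertransformer = _import_bettertransformer()
    return bettertransformer.transform(model)
```

The key point is that the `optimum` import lives inside the function body, so it only executes when BetterTransformer is actually requested.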

Why this is needed:

  • BetterTransformer is deprecated and incompatible with transformers >= 4.49
  • Users with modern transformers versions get import errors even with --no-bettertransformer
  • This allows the flag to work as intended

Checklist

  • [x] I have read the CONTRIBUTING guidelines.
  • [ ] I have added tests to cover my changes.
  • [ ] I have updated the documentation (docs folder) accordingly.

Additional Notes

This change ensures backward compatibility for users who still want BetterTransformer while fixing the issue for users with newer dependencies.

aryasaatvik avatar Jul 10 '25 12:07 aryasaatvik


Codecov Report

:x: Patch coverage is 72.22222% with 5 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 79.54%. Comparing base (f84d8e2) to head (947044d).

Files with missing lines                                    Patch %   Lines
...inity_emb/infinity_emb/transformer/acceleration.py       72.22%    5 Missing :warning:
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #619   +/-   ##
=======================================
  Coverage   79.54%   79.54%           
=======================================
  Files          43       43           
  Lines        3486     3501   +15     
=======================================
+ Hits         2773     2785   +12     
- Misses        713      716    +3     

codecov-commenter avatar Aug 27 '25 07:08 codecov-commenter

Absolutely. According to this issue:

please use transformers' attention implementation: https://huggingface.co/docs/transformers/main/en/llm_optims#attention and torch.compile (with static cache if decoder): https://huggingface.co/docs/transformers/main/en/llm_optims#static-kv-cache-and-torchcompile for the best possible performance (exceeding bettertransformer, which no one maintains! 💀).

What are your thoughts on this method vs. continuing to support the current bettertransformer?

wirthual avatar Aug 27 '25 19:08 wirthual

@michaelfeil One option to keep bettertransformer as-is is #641, which extracts the bettertransformer code into its own package. This should get rid of the version error while keeping the original bettertransformer code. Not sure if/how long this approach will keep working.

wirthual avatar Aug 29 '25 10:08 wirthual