transformers
exclude fsdp from delay_optimizer_creation
What does this PR do?
It passes both the model and the optimizer to accelerate.prepare so that fp8 mixed precision can be enabled when it is configured.
Fixes #34024
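For reviewers, here is a minimal sketch of the calling pattern described above. It is not the code changed in this PR. It assumes that accelerate.prepare refers to the prepare method of an accelerate Accelerator, which accepts several objects and returns them in the same order. The helper name prepare_model_and_optimizer is hypothetical and exists only for illustration.

```python
def prepare_model_and_optimizer(accelerator, model, optimizer):
    """Hypothetical helper, not part of transformers or accelerate.

    Hands the model and the optimizer to prepare() in a single call,
    so the accelerator sees both objects at the same time.
    """
    # prepare() returns the wrapped objects in the order they were passed in.
    model, optimizer = accelerator.prepare(model, optimizer)
    return model, optimizer


# Assumed usage with the real library (needs torch, accelerate, and an
# fp8-capable backend, so it is left as a comment here):
#
#   from accelerate import Accelerator
#   accelerator = Accelerator(mixed_precision="fp8")
#   model, optimizer = prepare_model_and_optimizer(accelerator, model, optimizer)
```

The helper only relies on the object exposing a prepare method, so it can be exercised with a stand-in object when no accelerator hardware is available.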
Who can review?
Library:
- trainer: @muellerzr and @SunMarc