matthieu-zimmer
I also have this issue with the latest TRL and transformers versions when finetuning Llama in pure float16. Downgrading to 0.6.2 solves the issue.
> @matthieu-zimmer Did you try one of the suggested solutions above?

`param.data = param.data.float()` works for mixed precision, yes, but not for training in pure float16.
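For reference, a minimal sketch of the workaround discussed above (the checkpoint name and loading code are illustrative, not from this thread): upcast the parameters to float32 after loading so mixed-precision training has fp32 master weights. As noted, this helps with mixed precision but does not address pure float16 training.

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative checkpoint; substitute the Llama model you are finetuning.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,  # weights loaded in fp16
)

# The workaround mentioned above: cast each parameter back to float32.
# Mixed-precision training (autocast/GradScaler) then keeps fp32 weights,
# but this does not help if you want to train purely in float16.
for param in model.parameters():
    param.data = param.data.float()
```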