matthieu-zimmer

2 comments by matthieu-zimmer

I also have this issue with the latest TRL and transformers versions when fine-tuning Llama in pure float16. Downgrading to 0.6.2 solves the issue.

> @matthieu-zimmer Did you try one of the suggested solutions above?

`param.data = param.data.float()` works for mixed precision, yes, but not for training in pure float16.
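
For context, the workaround quoted above amounts to upcasting the model's parameters to float32 before training, which gives you mixed-precision behavior rather than pure float16. A minimal sketch of what that looks like (the model name and loading call are illustrative, not taken from this thread):

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative: any causal LM loaded in float16 would do.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
)

# The suggested workaround: upcast parameters so optimizer updates run in
# float32 (effectively mixed precision).
for param in model.parameters():
    param.data = param.data.float()

# Note: the master weights are now float32, so this does not reproduce
# pure float16 training, which is the case where the issue persists.
```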