matthieu-zimmer

2 comments by matthieu-zimmer

I also have this issue with the latest TRL and transformers versions when fine-tuning Llama in pure float16. Downgrading to 0.6.2 solves the issue.

> @matthieu-zimmer Did you try one of the suggested solutions above?

`param.data = param.data.float()` works for mixed precision, yes, but not for training in pure float16.
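
For context, the workaround quoted above amounts to upcasting the model's parameters to float32 before training, which gives you mixed-precision behavior rather than pure float16. A minimal sketch of what that looks like (the model name and loading call are illustrative, not taken from this thread):

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative: any causal LM loaded in float16 would do.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
)

# The suggested workaround: upcast parameters so optimizer updates run in
# float32 (effectively mixed precision).
for param in model.parameters():
    param.data = param.data.float()

# Note: the master weights are now float32, so this does not reproduce
# pure float16 training, which is the case where the issue persists.
```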