direct-preference-optimization

Published 20 hours ago


Qwen model issue: embedding output becomes inf and loss is NaN

Open lylcst opened this issue 1 year ago • 5 comments

After one loss backward pass and optimizer step, the next forward pass produces inf hidden states from the embedding layer, and the loss becomes NaN.
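One way to narrow this down is to scan the parameters right after the optimizer step and report the first non-finite entry. The following is only a minimal, framework-agnostic sketch and not part of the original report: the helper `first_nonfinite` and the parameter names are hypothetical, and the toy lists stand in for real weights.

```python
import math


def first_nonfinite(named_values):
    """Return (name, index, value) of the first inf/NaN entry, or None.

    `named_values` is an iterable of (name, flat sequence of floats).
    """
    for name, values in named_values:
        for i, v in enumerate(values):
            if not math.isfinite(v):
                return name, i, v
    return None


# Toy stand-in for parameters after an optimizer step (hypothetical names).
params_after_step = [
    ("embed_tokens.weight", [0.01, float("inf"), -0.02]),
    ("lm_head.weight", [0.1, 0.2]),
]

hit = first_nonfinite(params_after_step)
if hit is not None:
    name, index, value = hit
    print(f"first non-finite value in {name} at flat index {index}: {value}")
```

With PyTorch, the same helper could presumably be fed `(name, p.detach().float().flatten().tolist())` for each pair from `model.named_parameters()`. If the embedding weights are already inf at this point, the optimizer update is the likely culprit rather than the forward pass.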

Nov 03 '23 12:11 lylcst