Look at data right where the spike happens

Open dirkgr opened this issue 2 years ago • 2 comments

It is suspicious that two slightly different models (one with biases, one without) both spiked at exactly the same moment. This suggests there might be a data issue.
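One way to check this is to recover which training instances were consumed at the spike step. The sketch below is only an illustration under stated assumptions: it assumes a single seeded permutation of the dataset consumed in order, 1-based step numbering, and a fixed global batch size. The function name and all of its parameters are hypothetical and do not come from the OLMo codebase; the real data loader may shuffle differently.

```python
import numpy as np


def batch_indices_for_step(step, global_batch_size, num_instances, seed=0):
    """Return dataset indices consumed at a 1-based optimizer step.

    Assumption (hypothetical layout): one seeded permutation of the whole
    dataset, read front to back in chunks of `global_batch_size`.
    """
    order = np.random.default_rng(seed).permutation(num_instances)
    start = (step - 1) * global_batch_size
    return order[start:start + global_batch_size]


# Example: indices for a given step on a toy dataset of 100 instances.
indices = batch_indices_for_step(step=3, global_batch_size=4, num_instances=100)
print(indices)
```

If both runs used the same data order, the same indices feed both models at that step, so inspecting those instances directly would confirm or rule out a bad batch.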

dirkgr avatar Aug 18 '23 21:08 dirkgr

In block 0, exp_avg_sq for attn_norm.weight.max seems to spike on step 1581, earlier than all the other spikes.

dirkgr avatar Aug 24 '23 06:08 dirkgr

The spike in attn_out.weight.max is even more pronounced.
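To compare when each logged metric first spikes, something like the following could be run over the per-step series. This is a rough sketch, not the method used here: the rolling-median baseline, the `window` and `factor` defaults, and the function name are all assumptions chosen for illustration.

```python
import numpy as np


def first_spike_step(values, window=50, factor=5.0):
    """Return the first index whose value exceeds `factor` times the
    median of the preceding `window` values, or None if there is none.

    The returned index is a position in `values`; map it to a training
    step according to how the metric was logged.
    """
    values = np.asarray(values, dtype=float)
    for i in range(window, len(values)):
        baseline = np.median(values[i - window:i])
        if baseline > 0 and values[i] > factor * baseline:
            return i
    return None


# Example with synthetic data: a flat series with one jump at index 60.
series = [1.0] * 100
series[60] = 10.0
print(first_spike_step(series))
```

Running this per parameter and sorting by the returned index would show which metric moves first.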

dirkgr avatar Aug 24 '23 06:08 dirkgr

Marking the items prior to Feb 29th as "closed".

dumitrac avatar Apr 30 '24 20:04 dumitrac