
dropout is 0.0

Open dipsivenkatesh opened this issue 11 months ago • 3 comments

The dropout of the GPT model in the GPTConfig class is set to 0.0 — does this mean there won't be any dropout during training?

dipsivenkatesh avatar Mar 11 '24 14:03 dipsivenkatesh

Yes, by default.

amar-jay avatar Mar 17 '24 02:03 amar-jay
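For illustration, here is a minimal sketch of a config in the style of nanoGPT's GPTConfig dataclass (the real class has additional fields such as vocab_size and bias; the field names below are simplified assumptions). It shows that dropout defaults to 0.0 and can be overridden at construction time:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # simplified sketch in the style of nanoGPT's GPTConfig;
    # the real class defines more fields
    block_size: int = 1024
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    dropout: float = 0.0  # 0.0 means dropout layers are effectively disabled

# default config: no dropout
print(GPTConfig().dropout)

# override for fine-tuning
print(GPTConfig(dropout=0.1).dropout)
```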

Much recent research on LLMs has shown that it's fine to skip dropout in pretraining. But you normally want dropout in fine-tuning to avoid overfitting.

muerghq avatar Apr 06 '24 04:04 muerghq
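To see concretely why p=0.0 means no regularization, here is a small pure-Python sketch of inverted dropout (the scheme PyTorch's nn.Dropout uses; this helper is illustrative, not nanoGPT's code). With p=0.0 nothing is zeroed and the 1/(1-p) rescaling factor is 1, so the input passes through unchanged:

```python
import random

def dropout(x, p, training=True):
    """Inverted dropout sketch: zero each element with probability p,
    scale survivors by 1/(1-p) so the expected value is preserved."""
    if not training or p == 0.0:
        # p == 0.0 (or eval mode): identity, i.e. no dropout at all
        return list(x)
    return [0.0 if random.random() < p else v / (1.0 - p) for v in x]

x = [1.0, 2.0, 3.0]
print(dropout(x, 0.0))  # unchanged: [1.0, 2.0, 3.0]
```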

Yes, absolutely correct — a dropout of 0.0 means no dropout is applied during training.

siddharthji07 avatar May 13 '24 13:05 siddharthji07