llm.c from-scratch init the model

Open • karpathy opened this issue 10 months ago • 1 comment

Implement the from-scratch initialization following the nanoGPT repo.

This will allow instantiating randomly initialized models of all GPT-2 sizes for timing and debugging purposes, and will help make sure we don't overfit to a single configuration.

Apr 16 '24 20:04 karpathy

see https://github.com/karpathy/llm.c/pull/156

Apr 16 '24 21:04 azret
