llm.c
Initialize the model from scratch
Implement the from-scratch initialization following the nanoGPT repo.
This will allow instantiating randomly initialized models of all GPT-2 sizes for timing and debugging, and helps make sure we don't overfit the code to a single configuration.
see https://github.com/karpathy/llm.c/pull/156