Azret Botash
There are two dev scripts in this PR. **1. gpt2-124M-from-scratch.py** creates a new GPT-2 124M model from scratch and saves the corresponding weights to gpt2_124M.bin. Will be useful when...
Minimum set of support files to build on Windows. Use build_msvc.bat in a Visual Studio Command x64 Prompt to build with MSVC. See #19 for some comments re /fp:fast and...
I'm trying to parallelize the layernorm_backward and encoder_backward. I need some help making the CPU atomicAdd portable. I know there is already one for CUDA. ```c // TODO: Make this...
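
One possible way to get a portable CPU-side atomic add is to lean on OpenMP, sketched below. The helper name `atomic_add_f32` and the `accumulate_rows` example are hypothetical and are not taken from the PR. The sketch assumes the code is built with OpenMP enabled (for example `-fopenmp` on GCC/Clang or `/openmp` on MSVC); without it the pragmas are ignored and the loop simply runs serially.

```c
#include <stddef.h>

// Hypothetical helper (not from llm.c): atomically adds val to *addr.
// "#pragma omp atomic" has been part of OpenMP since 2.0, so the same
// source should be accepted by GCC, Clang and MSVC.
static inline void atomic_add_f32(float *addr, float val) {
    #pragma omp atomic
    *addr += val;
}

// Illustrative use: sum the rows of src (rows x cols, row-major) into dst.
// Several threads may update the same dst[c], hence the atomic add.
void accumulate_rows(float *dst, const float *src, int rows, int cols) {
    #pragma omp parallel for
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            atomic_add_f32(&dst[c], src[r * cols + c]);
        }
    }
}
```

Because the order of the atomic updates is not fixed, the floating-point sum can differ in the last bits from run to run. If bitwise reproducibility matters, per-thread partial buffers followed by a serial reduction may be a better fit.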
Can we make the block_size in the kernels more adaptive or parameterized? e.g. 1024 is pretty big for my GPU with 12GB of memory. I have to run with block_size...
https://github.com/karpathy/llm.c/blob/7b79fbb230fd78bd96684b0a8728a3951d66c940/dev/cuda/layernorm_backward.cu#L208 Should all of these use atomicAdd?
Adds random-number utilities that are numerically identical to torch's, for when we need to init_from_scratch (https://github.com/karpathy/llm.c/issues/243). We can init on the CPU and memcpy to the GPU. Usage: ```c mt19937_state state; manual_seed(&state,...
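
For context, here is a minimal sketch of the standard 32-bit MT19937 generator that the `mt19937_state` name refers to. It is not the PR's code: the names `mt_sketch_state`, `mt_sketch_seed` and `mt_sketch_next` are hypothetical, and only the raw 32-bit output is covered. How torch turns those bits into floats or normal samples is not shown and should be checked against the PR itself.

```c
#include <stdint.h>

#define MT_SKETCH_N 624
#define MT_SKETCH_M 397

// Hypothetical state struct for illustration only.
typedef struct {
    uint32_t mt[MT_SKETCH_N];
    int index;
} mt_sketch_state;

// Standard MT19937 seeding recurrence.
void mt_sketch_seed(mt_sketch_state *s, uint32_t seed) {
    s->mt[0] = seed;
    for (int i = 1; i < MT_SKETCH_N; i++) {
        uint32_t prev = s->mt[i - 1];
        s->mt[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    s->index = MT_SKETCH_N;  // force a twist on the first draw
}

// Returns the next raw 32-bit output.
uint32_t mt_sketch_next(mt_sketch_state *s) {
    if (s->index >= MT_SKETCH_N) {
        // Regenerate the whole state block (the "twist").
        for (int i = 0; i < MT_SKETCH_N; i++) {
            uint32_t y = (s->mt[i] & 0x80000000u) |
                         (s->mt[(i + 1) % MT_SKETCH_N] & 0x7fffffffu);
            uint32_t v = s->mt[(i + MT_SKETCH_M) % MT_SKETCH_N] ^ (y >> 1);
            if (y & 1u) v ^= 0x9908b0dfu;
            s->mt[i] = v;
        }
        s->index = 0;
    }
    // Tempering.
    uint32_t y = s->mt[s->index++];
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}
```

Since the generator is fully deterministic for a given seed, weights initialized on the CPU this way can be reproduced exactly and then copied to the GPU.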