Andrej
Adding a TODO. This is me being a bit paranoid, but in test_gpt2.cu we check that our code agrees with the PyTorch reference. We're using a single global threshold for all...
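One way to address this (a sketch only; the tensor names and tolerance values below are hypothetical, not llm.c's actual ones) is a per-tensor tolerance table with a global fallback:

```python
# Sketch: per-tensor tolerances instead of one global threshold.
# Tensor names and tolerance values here are hypothetical examples.
def check_tensor(name, ours, ref, tolerances, default_tol=1e-2):
    # pick a per-tensor tolerance, falling back to the default
    tol = tolerances.get(name, default_tol)
    worst = max(abs(a - b) for a, b in zip(ours, ref))
    ok = worst <= tol
    print(f"{name}: max abs diff {worst:.2e} (tol {tol:.0e}) -> {'OK' if ok else 'FAIL'}")
    return ok

# hypothetical: gradients of large embedding matrices get a looser tolerance
tolerances = {"wte.grad": 5e-2, "lm_head.grad": 1e-2}
```

The same idea ports directly to the C side as a small lookup table keyed by parameter name.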
I've seen some repos use WikiText-103 as the dataset to eval GPT-like models, e.g. https://github.com/tysam-code/hlb-gpt/tree/main. Add a prepro script to download, preprocess, and tokenize WikiText-103, just like tiny...
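A rough skeleton of such a script, in the style of the existing tinyshakespeare prepro (the download URL and the little-endian uint16 .bin layout are assumptions to verify against llm.c's dataloader):

```python
# Sketch of a WikiText-103 prepro script. The URL and the .bin token layout
# (little-endian uint16, no header) are assumptions, not confirmed against llm.c.
import os
import struct
import urllib.request
import zipfile

WIKITEXT_URL = "https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip"  # assumed mirror

def write_tokens(filename, tokens):
    # write tokens as little-endian uint16, matching the tinyshakespeare .bin style (assumed)
    with open(filename, "wb") as f:
        f.write(struct.pack(f"<{len(tokens)}H", *tokens))

def prepro(data_dir="wikitext103"):
    os.makedirs(data_dir, exist_ok=True)
    zip_path = os.path.join(data_dir, "wikitext-103-raw-v1.zip")
    if not os.path.exists(zip_path):
        urllib.request.urlretrieve(WIKITEXT_URL, zip_path)
    with zipfile.ZipFile(zip_path) as z:
        z.extractall(data_dir)
    # next: tokenize the extracted train/valid/test splits with the GPT-2 BPE
    # (e.g. tiktoken's "gpt2" encoding) and write_tokens() each split to a .bin
    ...
```

The tokenization step is left as a stub since the exact tokenizer and file naming should match whatever test_gpt2/train_gpt2 expect.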
I get: ``` gpt2-124M nll: 3.058462142944336, ppl: 21.294784545898438 ``` And we're supposed to get https://www.reddit.com/r/MachineLearning/comments/oye64h/comment/h7ucco2/ i.e. 1.17 (in what I assume is the nll), so we're not even close to...
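For reference, the ppl above is just exp(nll), so the two numbers are internally consistent; one guess (an assumption, not verified) for the mismatch with the reported 1.17 is token-level vs word-level normalization of the loss:

```python
import math

# perplexity is the exponential of the mean per-token negative log-likelihood
nll = 3.058462142944336
ppl = math.exp(nll)
print(ppl)  # matches the 21.2947... printed above

# Assumption to check: WikiText perplexities are often reported per *word*
# rather than per BPE token, so an apples-to-apples comparison would rescale
# the token-level nll by the token/word ratio:
# ppl_word = math.exp(nll_token * n_tokens / n_words)
```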
I'd be very interested in how we could take llm.c models and export them into universal formats, e.g. for very fast inference in llama.cpp, vLLM, etc. Or how they...
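A first step toward any exporter is parsing the llm.c checkpoint. A hypothetical sketch follows; the 256-int header, the 20240326 magic, and the field offsets are assumptions about train_gpt2's writer at the time of writing and should be checked against the source:

```python
# Hypothetical sketch: read an llm.c model .bin header as the first step of an
# exporter. Header layout (256 int32s, magic 20240326, field order) is assumed.
import struct

def read_llmc_header(path):
    with open(path, "rb") as f:
        header = struct.unpack("<256i", f.read(256 * 4))
    assert header[0] == 20240326, "not an llm.c model file?"
    # assumed field order: [magic, version, max_seq_len, vocab, layers, heads, channels]
    return {
        "max_seq_len": header[2],
        "vocab_size": header[3],
        "num_layers": header[4],
        "num_heads": header[5],
        "channels": header[6],
    }
```

From there, the flat parameter blob could be reshaped and re-serialized into a target format (e.g. safetensors for HF, or GGUF for llama.cpp).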
Following up on this [tweet](https://x.com/karpathy/status/1795501945832247790), copy-pasting it here and creating an Issue as a TODO. """ The thing that makes this a bit complicated right now is the start latency....
Support the training and finetuning of Llama 3.1 on the Python side only for now, to create reference tensors to match in C later.
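The reference-tensor dump could follow the spirit of train_gpt2.py's debug-state files; a minimal sketch, assuming little-endian float32 and a logits/loss/grads layout that the C side would read back in the same order (all names and layout here are assumptions):

```python
# Sketch: dump Python-side reference tensors for the C code to match later.
# The float32 little-endian layout and field order are assumptions.
import struct

def write_fp32(f, values):
    # write a flat list of floats as little-endian float32
    f.write(struct.pack(f"<{len(values)}f", *values))

def save_reference(path, logits, loss, grads):
    with open(path, "wb") as f:
        write_fp32(f, logits)      # forward-pass outputs
        write_fp32(f, [loss])      # scalar loss
        for g in grads:            # per-parameter gradients, in a fixed order
            write_fp32(f, g)
```

The C test would then read the same file and compare tensor-by-tensor, exactly as test_gpt2.cu does for GPT-2.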
This branch starts with a copy paste of `train_gpt2.cu` and `test_gpt2.cu`, but these two files (and other files) will change to incorporate Llama 3.1 support, before merging back to master.