nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Results 297 nanoGPT issues

Now write takes ~40s on my machine (~40x improvement, ~500MB/s write). Uses similar logic to the first implementation but with batching, so offsets are also calculated per batch not per...

First of all, thank you Andrej Karpathy for the amazing YouTube series you have done on deep learning and for being such a great educator on the subject online. Regarding...

Hello! I've been trying to run train.py on a 2060 GPU, but this device does not support dtype=torch.bfloat16. What changes would I have to make to achieve my goal?...

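For the bfloat16 question above, one possible approach is to fall back to another dtype when bfloat16 is unavailable. The sketch below only shows the selection logic; the helper name `pick_dtype` is hypothetical and not part of the repository. With PyTorch, the two flags could plausibly come from `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()`.

```python
def pick_dtype(cuda_available, bf16_supported):
    """Choose a dtype name for mixed-precision training (hypothetical helper).

    cuda_available: whether a CUDA device is present
    bf16_supported: whether that device supports bfloat16
    """
    if not cuda_available:
        # CPU-only runs: stay in full precision.
        return "float32"
    # Prefer bfloat16 where supported, otherwise fall back to float16.
    return "bfloat16" if bf16_supported else "float16"


# Example: a GPU without bfloat16 support, as described in the question.
dtype_name = pick_dtype(cuda_available=True, bf16_supported=False)
print(dtype_name)
```

Note that float16 training usually also needs gradient scaling (for example `torch.cuda.amp.GradScaler`) to avoid underflow, so the dtype switch alone may not be sufficient.
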
Brings write down to 2 min from 30 min. I also understand if you want to keep it simple since the current one is quite easy to understand. This one...

Heads up! (to anyone who played with this when it was released) The repo's readme has: > Code by default now uses [PyTorch 2.0](https://pytorch.org/get-started/pytorch-2.0/). At the time of writing (Dec...

Thanks for the incredibly lucid GPT implementation! I've started rewriting nanoGPT in Jax/Flax as a test-bed to play with the new `jax.experimental.pjit` API. Thought I'd put it here for anyone...

Hey there, should we remove the @torch.jit.script decorator for the `fused_gelu` activation function *if* we compile the model? I benchmarked both and found the inference time to be substantially faster...

This PR logs the parameters living on the global scope as a config to W&B. This is a greedy solution, as the parameters live on the global scope. I do...
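
A minimal sketch of what collecting global-scope parameters into a config might look like, assuming only simple-typed, non-private names should be logged. The helper name `collect_config` and the sample values are hypothetical and are not taken from the PR itself.

```python
def collect_config(namespace):
    """Return the simple-typed, non-private entries of a namespace as a dict."""
    allowed_types = (int, float, bool, str)
    return {
        name: value
        for name, value in namespace.items()
        if not name.startswith("_") and isinstance(value, allowed_types)
    }


# Hypothetical stand-in for a training script's global scope.
example_namespace = {
    "batch_size": 12,
    "learning_rate": 6e-4,
    "dataset": "openwebtext",
    "compile": True,
    "_private": 1,   # skipped: private name
    "helper": len,   # skipped: not a simple type
}

config = collect_config(example_namespace)
print(config)
```

In a training script this could be called as `collect_config(globals())`, with the result passed to `wandb.init(config=...)`. It is greedy in the sense the PR describes: every simple-typed global is captured, whether or not it is a hyperparameter.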

This is a small contribution on some things that I found interesting to take into consideration: - **requirements.txt**: provide an easy way to create the environment. - **.gitignore**: ignore the...

*Edit: Here is my Python Version and packages list, including NVIDIA CUDA info. Python 3.8.10 aiohttp==3.8.3 aiosignal==1.3.1 async-timeout==4.0.2 attrs==22.2.0 blobfile==2.0.0 certifi==2022.12.7 charset-normalizer==2.1.1 colorama==0.4.6 datasets==2.8.0 dill==0.3.6 filelock==3.9.0 frozenlist==1.3.3 fsspec==2022.11.0 huggingface-hub==0.11.1 idna==3.4...