nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Results 297 nanoGPT issues

Hi, I was using the sample.py file for running inference. While it works fine on a single GPU, I wanted to know how I can run it across multiple GPUs?

Hi, I can run train.py on a single GPU, but not on more than one. I have 2 GPUs, and if I run the command `torchrun --standalone --nproc_per_node=2 train.py...

Hello, I have added a link to my repository https://github.com/AayushSameerShah/Neural-Net-Zero-to-Hero-with-Andrej as a **reference link**. It follows all the *spelled-out* code, with visualizations and simple step-by-step explanations, from the Zero-to-hero lecture series....

I tried out the code for the Shakespeare data, and it worked exactly as expected. But then I tried the code with my own data (a text file with 2000...

This patch adds streaming of generated tokens in `sample.py`, to provide a better user experience. This is particularly useful for weak machines where generating full output may take a significant...
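The patch itself is not shown here, so the following is only a minimal sketch of the general idea: yield each token as soon as it is sampled and flush it to the terminal instead of waiting for the full output. The names `stream_tokens`, `print_stream`, `toy_next_token`, and `toy_decode` are hypothetical stand-ins and do not come from `sample.py`.

```python
import sys
from typing import Callable, Iterable, Iterator, List


def stream_tokens(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    max_new_tokens: int,
) -> Iterator[int]:
    """Yield new token ids one at a time instead of returning them all at the end."""
    ids = list(prompt_ids)  # copy so the caller's list is not mutated
    for _ in range(max_new_tokens):
        tok = next_token(ids)  # hypothetical stand-in for one model sampling step
        ids.append(tok)
        yield tok


def print_stream(
    tokens: Iterable[int],
    decode: Callable[[List[int]], str],
    out=sys.stdout,
) -> None:
    """Write each decoded token immediately and flush so it appears right away."""
    for tok in tokens:
        out.write(decode([tok]))
        out.flush()


# Toy stand-ins (assumptions for illustration only): the "model" emits the
# next letter of the alphabet, and ids map to lowercase letters.
def toy_next_token(ids: List[int]) -> int:
    return (ids[-1] + 1) % 26


def toy_decode(ids: List[int]) -> str:
    return "".join(chr(ord("a") + i) for i in ids)


if __name__ == "__main__":
    print_stream(stream_tokens(toy_next_token, [0], 5), toy_decode)
    print()
```

One caveat with this approach: with a byte-level BPE tokenizer, a single token may hold only part of a multi-byte character, so decoding token by token can produce broken characters unless incomplete bytes are buffered.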

This patch adds a feature which allows users to print generated samples during the training process:
```bash
python3 train.py config/train_shakespeare_char.py --eval_interval=100 --output_sample=True
...
step 100: train loss 2.5267, val loss...
```

As mentioned in #357, the README has unnecessary hearts across the install instructions; this removes them.

The repository has code that requires formatting for easier readability. The main changes to the code are:
* Double indentation for comment
* Added whitespace when getting a slice from...

This PR implements a set of encoder/decoder classes that provide a consistent interface for encoding and decoding text using different schemes like character or BPE encoding. **The main changes:** -...
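The PR's actual classes are not visible in this truncated summary. As a rough sketch of what a consistent encode/decode interface could look like, here is an abstract base class with a character-level implementation. The names `TextCodec` and `CharCodec` are hypothetical and are not taken from the PR.

```python
from abc import ABC, abstractmethod
from typing import List


class TextCodec(ABC):
    """Hypothetical common interface: every scheme exposes encode() and decode()."""

    @abstractmethod
    def encode(self, text: str) -> List[int]:
        """Turn text into a list of integer token ids."""

    @abstractmethod
    def decode(self, ids: List[int]) -> str:
        """Turn a list of integer token ids back into text."""


class CharCodec(TextCodec):
    """Character-level scheme: one id per distinct character in the corpus."""

    def __init__(self, corpus: str) -> None:
        chars = sorted(set(corpus))  # deterministic vocabulary order
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    def encode(self, text: str) -> List[int]:
        # Raises KeyError for characters that were not in the corpus.
        return [self.stoi[ch] for ch in text]

    def decode(self, ids: List[int]) -> str:
        return "".join(self.itos[i] for i in ids)


if __name__ == "__main__":
    codec = CharCodec("hello world")
    ids = codec.encode("hello")
    print(ids, codec.decode(ids))
```

A BPE-based class would implement the same two methods, so calling code can switch schemes without changing how it encodes prompts or decodes samples.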

I recently started working with nanoGPT (a week ago) and so far I am very satisfied with the results. However, I would really like to load all of the generated...