nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Results 393 nanoGPT issues

I'm trying to use nanoGPT to generate Python code, and I don't find a stop words implementation in the code right now, so what I'm getting is this: ``` Write...
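One way to approximate stop sequences without touching the model is to post-process the decoded sample. The sketch below is a minimal illustration, not part of nanoGPT: the helper name `truncate_at_stop` and the example stop string are assumptions made here for the example.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut `text` at the earliest occurrence of any stop sequence.

    Hypothetical helper (not a nanoGPT function): apply it to the decoded
    string after sampling. Empty stop strings are ignored.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    sample = "def f():\n    return 1\n\nWrite a function"
    # Keep everything before the model starts a new prompt-like line.
    print(truncate_at_stop(sample, ["\nWrite"]))
```

The same check could run inside the sampling loop, decoding after each new token and breaking once a stop string appears, which saves compute at the cost of repeated decoding.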

Andrej, thank you for everything -- you have been a great teacher to many of us. Please feel free to close this issue now :)

Hello, thanks so much for creating this awesome tool for the open-source community! Adding support for logging with Comet! Here is my [comet project](https://www.comet.com/sherpan/owt?shareable=buq5HRtE6u54vqSCSl7V06Gpe) as a POC. Let me know if...

Can this approach be used to create a nano-sized `text-davinci-003`?

Not sure what's on your mind on how passed-in params should interact with checkpoint params, however, it seems logical to copy the checkpoint params to model params when resuming the...
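The suggestion above, letting checkpoint parameters take precedence when resuming, could look roughly like the following. This is a sketch under assumptions: the function name `resolve_model_args` and the choice of which keys count as architecture-defining are illustrative, not taken from the repository.

```python
# Keys assumed here to define the model architecture; a checkpoint's weights
# only fit a model built with the same values, so they should win on resume.
ARCH_KEYS = ("n_layer", "n_head", "n_embd", "block_size", "vocab_size")


def resolve_model_args(passed_in, checkpoint_args, arch_keys=ARCH_KEYS):
    """Return model args for resuming: checkpoint values override passed-in
    values for architecture keys; everything else (e.g. dropout) stays as
    the caller passed it. Hypothetical helper, not a nanoGPT function."""
    resolved = dict(passed_in)
    for key in arch_keys:
        if key in checkpoint_args:
            resolved[key] = checkpoint_args[key]
    return resolved


if __name__ == "__main__":
    cli_args = {"n_layer": 6, "dropout": 0.1}
    ckpt_args = {"n_layer": 12, "n_embd": 768}
    print(resolve_model_args(cli_args, ckpt_args))
```

Restricting the override to architecture keys keeps training-time settings such as dropout adjustable between runs while still guaranteeing that the saved weights load into a model of matching shape.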

I received the following error while running `python prepare.py`: `Traceback (most recent call last): File "C:\Users\fresh\AppData\Roaming\Python\Python39\site-packages\datasets\builder.py", line 1570, in _prepare_split_single for key, record in generator: File "C:\Users\fresh\.cache\huggingface\modules\datasets_modules\datasets\openwebtext\85b3ae7051d2d72e7c5fdf6dfb462603aaa26e9ed506202bf3a24d261c6c40a1\openwebtext.py", line 85, in...

The last outputs are " Downloading builder script: 2.86kB [00:00, 3.19MB/s] Downloading builder script: 2.86kB [00:00, 3.01MB/s] Downloading builder script: 2.86kB [00:00, 3.07MB/s] Downloading builder script: 2.86kB [00:00, 2.45MB/s] Downloading...

Andrej, I taught myself most of what I know about ML by copying your code, trying to understand every line, and then hacking it into something new of my own....

Can you give an example of how to use the official GPT-2 model? I downloaded it successfully via https://raw.githubusercontent.com/openai/gpt-2/master/download_model.py, then moved and renamed model.ckpt.data-00000-of-00001 to /out/ckpt.pt, but I got some...