nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Thanks for this project! I'm currently training the small version on openwebtext with 8 x A100 GPUs, using torch 2.0 (nightly). The data is local to the instance and the...
Hi, I'm an engineer from China. This is a wonderful project. I want to know whether I can use this project to train a model for Chinese. Thanks!
https://github.com/yuedajiong/gpt/blob/main/gpt-add.py
I'm writing a script for those with a bad network connection.

```bash
wget https://zenodo.org/record/3834942/files/openwebtext.tar.xz
# Re-run with -c to resume, as many times as needed:
wget -c https://zenodo.org/record/3834942/files/openwebtext.tar.xz
```

Is this the correct URI? Should I make...
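After resuming a multi-gigabyte download several times, it is worth verifying the archive before untarring it. A minimal Python sketch of chunked hashing (the helper name is ours, not part of nanoGPT; compare the result against the checksum published on the Zenodo record page):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a possibly huge file in 1 MiB chunks so it never loads fully into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# print(sha256_of("openwebtext.tar.xz"))  # compare to the hash on the Zenodo page
```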
![image](https://user-images.githubusercontent.com/73870365/216515907-0ee76f9a-e5b8-4775-84bf-a2841b53bdc6.png)
Never got train.bin; the run always failed after the tiktoken step.
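For context on what is being produced here: nanoGPT's prepare script encodes the text with tiktoken and writes the token ids to train.bin as 16-bit unsigned integers, so if tokenization dies partway the file never appears. A stripped-down, stdlib-only sketch of that write pattern (the `fake_encode` tokenizer below is a hypothetical stand-in for tiktoken's GPT-2 encoder):

```python
from array import array

def fake_encode(text):
    # Hypothetical stand-in for tiktoken: map each whitespace word to a small id.
    vocab = {}
    return [vocab.setdefault(w, len(vocab)) for w in text.split()]

def write_bin(docs, path):
    # Token ids are stored as uint16 ('H'), appended document by document.
    with open(path, "wb") as f:
        for doc in docs:
            array("H", fake_encode(doc)).tofile(f)
```

Reading the file back with the same `array("H", ...)` type recovers the id stream, which is what train.py memory-maps at training time.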
NotImplementedError: Could not run 'aten::_pin_memory' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process...
* Fixed error "AttributeError: module 'numpy' has no attribute 'typeDict'"
* Fixed error "TypeError: Descriptors cannot not be created directly."
At Lightning we are maintaining a version of this repo that uses [Lightning Fabric](https://pytorch-lightning.readthedocs.io/en/stable/fabric/fabric.html) (a new lightweight training library) under the hood. Changes to the codebase are very minimal, so...