nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Hi, sorry to ask, but how can I use this? How can I provide training sets?
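For context, nanoGPT's data scripts turn raw text into flat binary files of token ids that train.py then reads. A rough sketch of preparing a custom text file in that spirit (the file names, split ratio, and use of the GPT-2 tokenizer via tiktoken are assumptions, not the repo's exact script):

```python
# Minimal sketch: turn a raw text file into train.bin / val.bin of uint16 token ids,
# in the style of the data/*/prepare.py scripts. Paths and split ratio are illustrative.
import numpy as np
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer

with open("input.txt", "r", encoding="utf-8") as f:
    data = f.read()

n = len(data)
train_ids = enc.encode_ordinary(data[: int(n * 0.9)])  # first 90% for training
val_ids = enc.encode_ordinary(data[int(n * 0.9):])     # last 10% for validation

# GPT-2 token ids fit in uint16, which keeps the files compact
np.array(train_ids, dtype=np.uint16).tofile("train.bin")
np.array(val_ids, dtype=np.uint16).tofile("val.bin")
```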
https://huggingface.co/facebook/opt-30b is a larger open-source model. Does it work?
I really liked the simplicity of the globals() approach; this is one small improvement that adds argparse support, which gives a few things for free: `python train.py -h` now...
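A hedged sketch of what layering argparse on top of module-level config globals can look like (the variable names `batch_size`, `learning_rate`, and `out_dir` are illustrative, not the PR's exact code):

```python
# Minimal sketch: expose simple config globals as command-line flags with argparse.
import argparse

# default config values, normally plain globals at the top of train.py
batch_size = 12
learning_rate = 6e-4
out_dir = "out"

parser = argparse.ArgumentParser()
for name, default in list(globals().items()):
    # expose scalar/string globals as flags; bools would need extra handling
    if name.startswith("_") or isinstance(default, bool) or not isinstance(default, (int, float, str)):
        continue
    parser.add_argument(f"--{name}", type=type(default), default=default)
args = parser.parse_args()

# write parsed values back so the rest of the script sees the overrides
globals().update(vars(args))
print(f"batch_size={batch_size}, learning_rate={learning_rate}, out_dir={out_dir}")
```

With this, `python train.py -h` lists every config value, and `python train.py --batch_size 32` overrides just that one.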
Most people do not have access to 8×A100 40GB systems, but a single M1 Max laptop with 64 GB of memory could host the training. How difficult is it...
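As a rough sketch of what running on Apple silicon involves, PyTorch exposes an MPS backend that can stand in for CUDA; the fallback logic below is an assumption, not the repo's code, but the resulting string is the kind of `device` value the training config expects:

```python
# Minimal sketch: pick the best available backend on a given machine.
import torch

if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"  # Apple silicon GPU (M1/M2), no 8xA100 required
else:
    device = "cpu"

x = torch.randn(4, 4, device=device)  # quick smoke test on the chosen backend
print(f"running on {device}: mean={x.mean().item():.4f}")
```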
I tried to replicate the code on my laptop and I ran into many obstacles! After reading the code carefully, I realized that it had many conceptual gaps. So I...
This PR updates the GPT2 lm_head weight by linking it to the token embedding weights. This is done in the official GPT2 TF implementation [here](https://github.com/openai/gpt-2/blob/master/src/model.py#L171).
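The change amounts to weight tying: the output projection reuses the token embedding matrix so there is a single shared parameter. A minimal sketch in PyTorch (the class and dimensions are stand-ins, not the repo's actual model):

```python
# Minimal sketch of tying the lm_head weight to the token embedding (wte),
# as in the official GPT-2 TF implementation. Dimensions are illustrative.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=50257, n_embd=768):
        super().__init__()
        self.wte = nn.Embedding(vocab_size, n_embd)            # token embeddings
        self.lm_head = nn.Linear(n_embd, vocab_size, bias=False)
        self.lm_head.weight = self.wte.weight                   # weight tying: one shared parameter

    def forward(self, hidden):
        return self.lm_head(hidden)                             # logits over the vocabulary

model = TinyLM()
# both modules now point at the same underlying storage
assert model.lm_head.weight.data_ptr() == model.wte.weight.data_ptr()
```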
Enables training with larger effective batch sizes by taking multiple steps between gradient updates. I've always found this useful since batch size correlates strongly with performance even for small models...
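For reference, the idea is to accumulate gradients over several micro-batches before calling `optimizer.step()`, so the effective batch size becomes micro-batch size × accumulation steps. A hedged sketch of the pattern (the model, data, and hyperparameters are placeholders, not the PR's code):

```python
# Minimal sketch of gradient accumulation: N micro-batches per optimizer update.
import torch
import torch.nn as nn

model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
accum_steps = 4                                    # effective batch = micro_batch * accum_steps

optimizer.zero_grad(set_to_none=True)
for step in range(8):
    x, y = torch.randn(8, 16), torch.randn(8, 1)   # stand-in micro-batch
    loss = loss_fn(model(x), y) / accum_steps      # scale so gradients average over the group
    loss.backward()                                # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                           # one update per accum_steps micro-batches
        optimizer.zero_grad(set_to_none=True)
```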
Hello again, Andrej. Would you mind making the training logs public so we can follow your progress in reproducing GPT-2? You can do this by clicking on the lock on...
I see the documentation on the hardware requirements for training. Any thoughts on what the requirements for inference are? Thank you!