Jai Mu

7 comments by Jai Mu

Assuming you're using the train.py from [nshepperd's fork](https://github.com/nshepperd/gpt-2/), try running it with `--save_every N`, where N is the number of steps between auto-saves (default 1000). For example: `python train.py...`
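A minimal sketch of what the full command might look like, assuming nshepperd's repo layout (`train.py` at the repo root, library code under `src/`), a dataset already encoded as `data.npz`, and 500 steps picked purely as an illustration:

```sh
# Fine-tune GPT-2 on data.npz, writing a checkpoint every 500 steps instead of the default 1000
PYTHONPATH=src python train.py --dataset data.npz --save_every 500
```

If I remember right, checkpoints end up under `checkpoint/run1` unless you pass a different `--run_name`.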

Did you change the "data.npz" to point to where your dataset is? Or better yet, try running the same train.py command as in the original post and just add `--save_every...`

Make sure you're properly formatting the data with a delimiter token (GPT-2's `<|endoftext|>`) between samples, otherwise it will think that it's one continuous stream and should continue like that.
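For example, here's a rough sketch of one way to build such a file; the `samples/*.txt` layout and the `training_data.txt` name are just placeholders for whatever your data actually looks like:

```sh
# Concatenate individual samples into a single training file,
# inserting GPT-2's end-of-text token between them so the model
# learns where one sample ends and the next begins
for f in samples/*.txt; do
  cat "$f"
  printf '\n<|endoftext|>\n'
done > training_data.txt
```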

On a single 1070? I don't think that's possible. I'm currently training on a 10GB dataset using the 345M model on a 3090 and it's using ~17GB of VRAM. ![345MTraining](https://user-images.githubusercontent.com/14964859/108821213-cc08af00-7604-11eb-88c3-dfa4ed16fe81.PNG)

I honestly couldn't recommend getting a 3090 _just_ for training/fine-tuning 345M gpt-2. 117M is definitely good enough for every use case (for me anyway) if your 1070 can handle that....

Thank you @jarred1989, that fixed the crashing on the first few steps. However, I'm now getting `Loss exploded to 19443163401595046756089856.00000 at step 196.00000, avg_loss=1023324389557634032730112.00000]` or `Exiting due to exception: Found Inf...`

I am using this repo, which took inspiration from here: https://github.com/mallorbc/Finetune_GPTNEO_GPTJ6B/tree/main/finetuning_repo