gpt-neox Update documentation

First of all, thank you for this repo. It allowed me to start training a GPT model from random initialization.

A couple of things that I have noticed:

The README.md is out of date in that the option "--keep-newlines" is not available on the preprocessing script.
The documentation refers to finetuning a few times but there is no mention of how to perform finetuning from GPT-Neo or GPT-J.
Some additional documentation about how the different size models in config/ relate to the different GPT-3 sizes or GPT-Neo/J would be helpful.
The data paths for the configs in the README.md do not include the "_text_document" suffix that is appended by the preprocessing script.

Feb 07 '22 02:02 ayl

The README.md is out of date in that the option "--keep-newlines" is not available on the preprocessing script.

Thank you for pointing this out.

The documentation refers to finetuning a few times but there is no mention of how to perform finetuning from GPT-Neo or GPT-J.

Finetuning is mechanistically the same as training: you just call train.py.

Some additional documentation about how the different size models in config/ relate to the different GPT-3 sizes or GPT-Neo/J would be helpful.

We use the same naming convention as GPT-3. If you want to train the model referred to as GPT-3 Large in the paper, use configs/large.yml. To train a model the same size as GPT-Neo, use configs/2-7b.yml, since GPT-Neo is a 2.7B parameter model.
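
As a rough illustration of the naming convention described above, here is a minimal sketch that picks the closest GPT-3 paper size for a given parameter count. The parameter counts are the approximate figures from the GPT-3 paper; the helper name `closest_gpt3_size` is hypothetical, and only the two config files named in this thread are listed, since the other file names are not confirmed here.

```python
# Approximate parameter counts for the model sizes named in the GPT-3 paper.
GPT3_SIZES = {
    "Small": 125e6,
    "Medium": 350e6,
    "Large": 760e6,
    "XL": 1.3e9,
    "2.7B": 2.7e9,
    "6.7B": 6.7e9,
    "13B": 13e9,
    "175B": 175e9,
}

# Only the config files mentioned in this thread; other names are not assumed.
CONFIGS_FROM_THREAD = {
    "Large": "configs/large.yml",
    "2.7B": "configs/2-7b.yml",
}


def closest_gpt3_size(n_params: float) -> str:
    """Return the GPT-3 size name whose parameter count is nearest to n_params."""
    return min(GPT3_SIZES, key=lambda name: abs(GPT3_SIZES[name] - n_params))


if __name__ == "__main__":
    size = closest_gpt3_size(2.7e9)  # GPT-Neo, as described in the thread
    print(size, CONFIGS_FROM_THREAD.get(size, "(config name not given in this thread)"))
```

For a model such as GPT-J (about 6B parameters) the nearest GPT-3 size by this measure is 6.7B; the matching config file name is not stated in this thread, so the sketch leaves it unspecified.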

The data paths for the configs in the README.md do not include the "_text_document" suffix that is appended by the preprocessing script.

Good idea, this should be explicitly discussed.
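
To make the suffix issue concrete, here is a minimal sketch of how the data path in a config relates to the prefix given to the preprocessing script. It assumes the script names its output `<prefix>_<json key>_<level>` with the defaults "text" and "document", which is consistent with the "_text_document" suffix reported above, and that it writes a `.bin` and `.idx` pair. The function names and the example prefix are hypothetical.

```python
def data_path_for(output_prefix: str, json_key: str = "text", level: str = "document") -> str:
    """Build the data path to put in the config from the preprocessing output prefix.

    Assumption: the preprocessing script appends "_<json_key>_<level>" to the prefix.
    """
    return f"{output_prefix}_{json_key}_{level}"


def expected_files(output_prefix: str) -> list[str]:
    """List the files the config's data path is assumed to refer to (.bin and .idx)."""
    base = data_path_for(output_prefix)
    return [base + ".bin", base + ".idx"]


if __name__ == "__main__":
    prefix = "data/mydataset"  # hypothetical prefix passed to the preprocessing script
    print("data path for config:", data_path_for(prefix))
    print("files expected on disk:", expected_files(prefix))
```

The point is that the config should reference the path with the suffix and without a file extension, rather than the bare prefix given to the preprocessing script.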

Feb 10 '22 14:02 StellaAthena

@StellaAthena If no one is working on it, I would love to work on it.

May 13 '22 07:05 divyanshugit

@StellaAthena If no one is working on it, I would love to work on it.

@divyanshugit If these changes haven't already been merged then yes, nobody is working on it. Feel free to pick it up :)

May 13 '22 22:05 StellaAthena
