Reference for pretraining other small language models
The README mentions this codebase can act as a "reference for enthusiasts keen on pretraining language models under 5 billion parameters". I'm wondering if you could give a brief guide on how to do so, assuming we start from a transformers config and tokenizer. Something like:
```json
{
  "architectures": [
    "..."
  ],
  ...
  "model_type": "...",
  "num_hidden_layers": 12,
  ...
}
```
Is a lot of work required to change the codebase to support this?
https://github.com/jzhang38/TinyLlama/blob/main/lit_gpt/config.py
You can pick one of the existing configs from here or create your own.
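For illustration, here is a rough sketch of how a Hugging Face transformers `config.json` for a Llama-style model might be translated into an entry for the `configs` list in `lit_gpt/config.py`. The field names on the lit_gpt side (`n_layer`, `n_head`, `n_embd`, `block_size`, etc.) are based on the existing entries in that file; the helper name and the exact mapping below are assumptions, so double-check them against the `Config` dataclass before relying on this.

```python
import json

# Hypothetical helper: map a Hugging Face transformers config.json
# (Llama-style) onto the dict layout used by entries in lit_gpt/config.py.
# The target field names follow the existing entries in that file;
# verify them against the Config dataclass.
def hf_config_to_lit_gpt(path: str, name: str) -> dict:
    with open(path) as f:
        hf = json.load(f)
    return dict(
        name=name,                                 # looked up via Config.from_name(name)
        block_size=hf["max_position_embeddings"],  # context length
        vocab_size=hf["vocab_size"],
        padding_multiple=64,
        n_layer=hf["num_hidden_layers"],
        n_head=hf["num_attention_heads"],
        n_embd=hf["hidden_size"],
        rotary_percentage=1.0,                     # full rotary embeddings, as in Llama
        parallel_residual=False,
        bias=False,
        norm_eps=hf.get("rms_norm_eps", 1e-5),
        _norm_class="FusedRMSNorm",
        _mlp_class="LLaMAMLP",
        intermediate_size=hf["intermediate_size"],
        n_query_groups=hf.get("num_key_value_heads", hf["num_attention_heads"]),
    )

if __name__ == "__main__":
    # Print the dict so it can be pasted into the `configs` list in
    # lit_gpt/config.py under a new name.
    print(hf_config_to_lit_gpt("config.json", "my_small_llama"))
```

Once a new entry like this is added to the `configs` list, the pretraining script should be able to select it by name (the same way the TinyLlama configs are selected via `Config.from_name`), and your tokenizer just needs to be consistent with the `vocab_size` you declare.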