
[Feature Request] Support for GALACTICA & EleutherAI Neo & Neo X models

Open trholding opened this issue 2 years ago • 7 comments

Support for the GALACTICA & EleutherAI Neo, Neo X models would be an awesome addition.

GALACTICA seems like a ChatGPT for scientific content.

Info:
https://galactica.org/
https://huggingface.co/facebook/galactica-120b
https://the-decoder.com/galactica-is-an-open-source-language-model-for-scientific-progress/

https://huggingface.co/EleutherAI/gpt-neo-125M
https://huggingface.co/EleutherAI/gpt-neo-1.3B
https://huggingface.co/EleutherAI/gpt-neo-2.7B
https://huggingface.co/EleutherAI/gpt-neox-20b

trholding avatar Dec 12 '22 06:12 trholding

Tried converting; the error starts with:

python3 convert-h5-to-ggml.py galactica-1.3b/
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:58 in <module>                 │
│                                                                                                  │
│    55 dir_model = sys.argv[1]                                                                    │
│    56 fname_out = sys.argv[1] + "/ggml-model.bin"                                                │
│    57                                                                                            │
│ ❱  58 with open(dir_model + "/vocab.json", "r") as f:                                            │
│    59 │   encoder = json.load(f)                                                                 │
│    60                                                                                            │
│    61 with open(dir_model + "/added_tokens.json", "r") as f:                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: 'galactica-1.3b//vocab.json'
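
A proper vocab.json could likely be generated from the model's own tokenizer rather than borrowed from another model; a minimal sketch, assuming transformers can load GALACTICA's tokenizer:

```python
# Sketch: dump the tokenizer's vocabulary to the vocab.json the converter expects.
import json
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("galactica-1.3b")
with open("galactica-1.3b/vocab.json", "w") as f:
    json.dump(tok.get_vocab(), f)
```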


To test/fuzz further, though, I just added a vocab.json from GPT-JT:

python3 convert-h5-to-ggml.py galactica-1.3b/
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:61 in <module>                 │
│                                                                                                  │
│    58 with open(dir_model + "/vocab.json", "r") as f:                                            │
│    59 │   encoder = json.load(f)                                                                 │
│    60                                                                                            │
│ ❱  61 with open(dir_model + "/added_tokens.json", "r") as f:                                     │
│    62 │   encoder_added = json.load(f)                                                           │
│    63                                                                                            │
│    64 with open(dir_model + "/config.json", "r") as f:                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: 'galactica-1.3b//added_tokens.json'

Then I added an added_tokens.json to fuzz further and find hints about what would be needed for conversion in the future.
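Something as simple as an empty JSON object is enough to get past the open() call (a minimal sketch; the exact file I used may have differed):

```python
# Sketch: write an empty added_tokens.json so the converter stops crashing on it.
import json

with open("galactica-1.3b/added_tokens.json", "w") as f:
    json.dump({}, f)
```

Re-running the converter gets further, but then fails inside from_pretrained: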

python3 convert-h5-to-ggml.py galactica-1.3b/
You are using a model of type opt to instantiate a model of type gptj. This is not supported for all configurations of models and can yield errors.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:73 in <module>                 │
│                                                                                                  │
│    70 │   use_f16 = False                                                                        │
│    71 │   fname_out = sys.argv[1] + "/ggml-model-f32.bin"                                        │
│    72                                                                                            │
│ ❱  73 model = GPTJForCausalLM.from_pretrained(dir_model, low_cpu_mem_usage=True)                 │
│    74 #print (model)                                                                             │
│    75                                                                                            │
│    76 list_vars = model.state_dict()                                                             │
│                                                                                                  │
│ /home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2379 in          │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   2376 │   │   │   │   mismatched_keys,                                                          │
│   2377 │   │   │   │   offload_index,                                                            │
│   2378 │   │   │   │   error_msgs,                                                               │
│ ❱ 2379 │   │   │   ) = cls._load_pretrained_model(                                               │
│   2380 │   │   │   │   model,                                                                    │
│   2381 │   │   │   │   state_dict,                                                               │
│   2382 │   │   │   │   loaded_state_dict_keys,  # XXX: rename?                                   │
│                                                                                                  │
│ /home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2512 in          │
│ _load_pretrained_model                                                                           │
│                                                                                                  │
│   2509 │   │   │   for key in missing_keys:                                                      │
│   2510 │   │   │   │   if key.startswith(prefix):                                                │
│   2511 │   │   │   │   │   key = ".".join(key.split(".")[1:])                                    │
│ ❱ 2512 │   │   │   │   param = model_state_dict[key]                                             │
│   2513 │   │   │   │   if param.device == torch.device("meta"):                                  │
│   2514 │   │   │   │   │   if not load_in_8bit:                                                  │
│   2515 │   │   │   │   │   │   set_module_tensor_to_device(model, key, "cpu", torch.empty(*para  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'h.0.mlp.fc_in.bias'

Obviously there need to be valid vocab and added_tokens files... I am still figuring out how GALACTICA works...
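
The "model of type opt" warning above is the real hint: GALACTICA is an OPT-architecture checkpoint, so its tensor names will never match the GPT-J layout the converter looks up (e.g. h.0.mlp.fc_in.bias). A quick way to see the actual layout is to print the checkpoint's own keys; a sketch, assuming enough RAM for the 1.3B model:

```python
# Sketch: list the checkpoint's tensor names to compare against the GPT-J names.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("galactica-1.3b", low_cpu_mem_usage=True)
for name in list(model.state_dict().keys())[:16]:
    # Prints OPT-style names such as model.decoder.layers.0.self_attn.q_proj.weight
    print(name)
```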

trholding avatar Dec 12 '22 11:12 trholding

@ggerganov

I get similar errors to the above when trying to convert NeoX 20B. How do I create the added_tokens.json file?

python3 convert-h5-to-ggml.py gpt-neox-20b/
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:61 in <module>                 │
│                                                                                                  │
│    58 with open(dir_model + "/vocab.json", "r") as f:                                            │
│    59 │   encoder = json.load(f)                                                                 │
│    60                                                                                            │
│ ❱  61 with open(dir_model + "/added_tokens.json", "r") as f:                                     │
│    62 │   encoder_added = json.load(f)                                                           │
│    63                                                                                            │
│    64 with open(dir_model + "/config.json", "r") as f:                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: 'gpt-neox-20b//added_tokens.json'

EDIT:

Did a quick hack so that added_tokens.json is not required:

https://github.com/trholding/ggml/blob/master/examples/gpt-j/convert-h5-to-ggml.py
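
The change is roughly the following (a reconstruction of the idea; the actual commit may differ): treat added_tokens.json as optional and fall back to an empty dict.

```python
import json
import sys

dir_model = sys.argv[1]

# Fall back to an empty dict when added_tokens.json is absent,
# instead of crashing with FileNotFoundError.
try:
    with open(dir_model + "/added_tokens.json", "r") as f:
        encoder_added = json.load(f)
except FileNotFoundError:
    encoder_added = {}
```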

I get this error:

python3 convert-h5-to-ggml.py gpt-neox-20b/
You are using a model of type gpt_neox to instantiate a model of type gptj. This is not supported for all configurations of models and can yield errors.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:73 in <module>                 │
│                                                                                                  │
│    70 │   use_f16 = False                                                                        │
│    71 │   fname_out = sys.argv[1] + "/ggml-model-f32.bin"                                        │
│    72                                                                                            │
│ ❱  73 model = GPTJForCausalLM.from_pretrained(dir_model, low_cpu_mem_usage=True)                 │
│    74 #print (model)                                                                             │
│    75                                                                                            │
│    76 list_vars = model.state_dict()                                                             │
│                                                                                                  │
│ /home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2379 in          │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   2376 │   │   │   │   mismatched_keys,                                                          │
│   2377 │   │   │   │   offload_index,                                                            │
│   2378 │   │   │   │   error_msgs,                                                               │
│ ❱ 2379 │   │   │   ) = cls._load_pretrained_model(                                               │
│   2380 │   │   │   │   model,                                                                    │
│   2381 │   │   │   │   state_dict,                                                               │
│   2382 │   │   │   │   loaded_state_dict_keys,  # XXX: rename?                                   │
│                                                                                                  │
│ /home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2512 in          │
│ _load_pretrained_model                                                                           │
│                                                                                                  │
│   2509 │   │   │   for key in missing_keys:                                                      │
│   2510 │   │   │   │   if key.startswith(prefix):                                                │
│   2511 │   │   │   │   │   key = ".".join(key.split(".")[1:])                                    │
│ ❱ 2512 │   │   │   │   param = model_state_dict[key]                                             │
│   2513 │   │   │   │   if param.device == torch.device("meta"):                                  │
│   2514 │   │   │   │   │   if not load_in_8bit:                                                  │
│   2515 │   │   │   │   │   │   set_module_tensor_to_device(model, key, "cpu", torch.empty(*para  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'h.43.attn.q_proj.weight'

I suppose there is no point in editing configs; the NeoX model differs at the architecture level. For one thing, NeoX stores attention as a single fused query_key_value tensor per layer rather than the separate q_proj/k_proj/v_proj projections GPT-J uses, hence the missing h.43.attn.q_proj.weight.

EDIT:

After a small change the NeoX conversion seemed to work, but it got OOM-killed. So I changed the script again to make it suitable for converting GPT-Neo 125M, but again got an error:

python3 convert-h5-to-ggml.py gpt-neo-125M/
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/TRIPFS/SPACE/ggml/build/examples/gpt-j/convert-h5-to-ggml.py:83 in <module>                 │
│                                                                                                  │
│    80                                                                                            │
│    81 fout.write(struct.pack("i", 0x67676d6c)) # magic: ggml in hex                              │
│    82 fout.write(struct.pack("i", hparams["vocab_size"]))                                        │
│ ❱  83 fout.write(struct.pack("i", hparams["n_positions"]))                                       │
│    84 fout.write(struct.pack("i", hparams["n_embd"]))                                            │
│    85 fout.write(struct.pack("i", hparams["n_head"]))                                            │
│    86 fout.write(struct.pack("i", hparams["n_layer"]))                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'n_positions'
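
The KeyError is because GPT-Neo's config.json names its hyperparameters differently from GPT-J's: max_position_embeddings, hidden_size, num_heads and num_layers instead of n_positions, n_embd, n_head and n_layer. A fallback lookup along these lines gets past it (an assumed mapping; worth double-checking against the actual config.json):

```python
# Sketch: hparams is the dict loaded from config.json earlier in the script.
# Try the GPT-J key names first, then fall back to the GPT-Neo ones.
n_ctx   = hparams.get("n_positions", hparams.get("max_position_embeddings"))
n_embd  = hparams.get("n_embd",      hparams.get("hidden_size"))
n_head  = hparams.get("n_head",      hparams.get("num_heads"))
n_layer = hparams.get("n_layer",     hparams.get("num_layers"))
```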

I did some blind changes:

https://github.com/trholding/ggml/commits/master/examples/gpt-j/convert-h5-to-ggml.py

The model gets converted, but loading it in the end results in this error:

./bin/gpt-j -m models/gpt-neo-125M/ggml-model.bin -p "This is an example"
main: seed = 1670854940
gptj_model_load: loading model from 'models/gpt-neo-125M/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50257
gptj_model_load: n_ctx   = 768
gptj_model_load: n_embd  = 2048
gptj_model_load: n_head  = 12
gptj_model_load: n_layer = 12
gptj_model_load: n_rot   = 256
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 1689.53 MB
gptj_model_load: memory_size =   144.00 MB, n_mem = 9216
gptj_model_load: tensor 'transformer.wte.weight' has wrong size in model file
main: failed to load model from 'models/gpt-neo-125M/ggml-model.bin'
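
Note the header the loader prints: n_ctx = 768 and n_embd = 2048 look swapped for gpt-neo-125M, whose hidden size is 768 and context length 2048. That alone would explain why transformer.wte.weight, whose expected shape is n_vocab x n_embd, has the wrong size. A quick sanity check against the source config (a sketch):

```python
# Sketch: compare the converter's header fields against the source config.json.
import json

with open("gpt-neo-125M/config.json") as f:
    cfg = json.load(f)

print("hidden_size:", cfg["hidden_size"])                          # 768 for gpt-neo-125M
print("max_position_embeddings:", cfg["max_position_embeddings"])  # 2048
```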

I think I should give up... I probably have no idea what I am doing...

trholding avatar Dec 12 '22 13:12 trholding

Yeah, these models probably have different architectures compared to GPT-J, so it is not just a matter of converting the data. You also have to implement the missing layers and connect them correctly. There are probably also some differences in the tokenizer.

Every model can be ported to ggml, but it requires some work. I guess it would be better if I try to make the codebase easier to understand and document it. This way other people might wish to contribute. Otherwise, it's too much work for a single developer.

ggerganov avatar Dec 12 '22 18:12 ggerganov

> Every model can be ported to ggml, but it requires some work. I guess it would be better if I try to make the codebase easier to understand and document it. This way other people might wish to contribute. Otherwise, it's too much work for a single developer.

Agreed, and nice documentation with a howto would be awesome :)

trholding avatar Dec 12 '22 19:12 trholding

Hello, any updates on where this stands? I am interested in working on porting these models.

reshinthadithyan avatar Mar 19 '23 06:03 reshinthadithyan

Looks like I won't have time to look into these in the near future. There are other, more interesting models to port -- see the README.

ggerganov avatar Mar 22 '23 19:03 ggerganov

See https://github.com/NolanoOrg/cformers/ -- we have added GPT-J, BLOOM, and support for NeoX models. Please open an issue there telling us which pre-compressed models you would like ported (with a Python interface).

Ayushk4 avatar Mar 22 '23 20:03 Ayushk4