llama.cpp
GPT4All-J conversion
Conversion of the latest GPT4All-J ggml binary, obtained from the app installer:
3785248281 Apr 14 22:03 models/gpt4all/ggml-gpt4all-j.bin
fails:
./main -m models/gpt4all/ggml-gpt4all-j.bin -t 4 -n 512 --repeat_penalty 1.0 --color -ins -r "User:" -f prompts/reason-act.txt
main: seed = 1681502915
llama_model_load: loading model from 'models/gpt4all/ggml-gpt4all-j.bin' - please wait ...
llama_model_load: invalid model file 'models/gpt4all/ggml-gpt4all-j.bin' (too old, regenerate your model files or convert them with convert-unversioned-ggml-to-ggml.py!)
But running
python convert-unversioned-ggml-to-ggml.py models/gpt4all/ggml-gpt4all-j.bin models/llama/tokenizer.model
generates no temp file. Running the migration tool from the older ggml version to the latest, I get:
llama.cpp % python migrate-ggml-2023-03-30-pr613.py models/gpt4all/ggml-gpt4all-j.bin models/gpt4all/ggml-gpt4all-j-new.bin
Traceback (most recent call last):
File "/Users/loretoparisi/Documents/Projects/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
main()
File "/Users/loretoparisi/Documents/Projects/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 272, in main
tokens = read_tokens(fin, hparams)
File "/Users/loretoparisi/Documents/Projects/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 133, in read_tokens
word = fin.read(length)
ValueError: read length must be non-negative or -1
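For what it's worth, that ValueError usually means the reader has gone out of alignment with the file's header. A quick way to check is to peek at the header directly. The sketch below assumes the unversioned 'ggml' magic followed by seven little-endian int32 hyperparameter fields, which is the layout used by both the LLaMA and GPT-J example formats (the two print lines are just the two possible interpretations of the same bytes):

import struct, sys

# Peek at a ggml header: print the magic word and the next seven int32
# fields under both the LLaMA and the GPT-J interpretation.
with open(sys.argv[1], "rb") as f:
    (magic,) = struct.unpack("<I", f.read(4))
    print(f"magic: {magic:#010x}")  # 0x67676d6c is the unversioned 'ggml' magic
    fields = struct.unpack("<7i", f.read(28))
print("as LLaMA (n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype):", fields)
print("as GPT-J (n_vocab, n_ctx, n_embd, n_head, n_layer, n_rot, ftype):", fields)

If n_vocab comes out near 50400 rather than 32000, the file is GPT-J-shaped, and the LLaMA-oriented scripts will misread everything after the header.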
There is no convert-unversioned-ggml-to-ggml.py or migrate-ggml-2023-03-30-pr613.py script anymore. Try running convert.py from master.
@prusnak thank you, but I'm not sure how to run it, since the model is already in ggml format. I'm trying:
python convert.py models/gpt4all/ggml-gpt4all-j.bin --outtype q4_1 --outfile models/gpt4all/ggml-gpt4all-j-new.bin
Apparently this file cannot be read; in fact, --dump-single gives:
python convert.py models/gpt4all/ggml-gpt4all-j.bin --dump-single
Traceback (most recent call last):
File "/Users/loretoparisi/Documents/Projects/llama.cpp/convert.py", line 1145, in <module>
main()
File "/Users/loretoparisi/Documents/Projects/llama.cpp/convert.py", line 1116, in main
model_plus = lazy_load_file(args.model)
File "/Users/loretoparisi/Documents/Projects/llama.cpp/convert.py", line 850, in lazy_load_file
return lazy_load_ggml_file(fp, path)
File "/Users/loretoparisi/Documents/Projects/llama.cpp/convert.py", line 790, in lazy_load_ggml_file
text = must_read(fp, length)
File "/Users/loretoparisi/Documents/Projects/llama.cpp/convert.py", line 758, in must_read
ret = fp.read(length)
ValueError: read length must be non-negative or -1
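The traceback points at the vocabulary reader. In the ggml format the vocabulary is stored as length-prefixed strings, so if the header was parsed with the wrong field layout, the four bytes read as a token length land in the middle of other data and can come out negative. A minimal sketch of that parse loop (illustrative only, not convert.py's actual code):

import struct

def read_length_prefixed_tokens(f, n_vocab):
    # Each token is a little-endian int32 length followed by that many bytes.
    tokens = []
    for _ in range(n_vocab):
        (length,) = struct.unpack("<i", f.read(4))
        if length < 0:
            # A misaligned reader interprets model data as a length field,
            # which is exactly the ValueError in the tracebacks above.
            raise ValueError("read length must be non-negative or -1")
        tokens.append(f.read(length))
    return tokens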
Running into the same issue, for what it's worth. @prusnak do you have more details as to how to run this properly?
@comex I see similar issues. Is the new convert script supposed to work with the new gpt4all file (ggml-gpt4all-j.bin)?
$ python3 convert.py models/gpt4all-7B/ggml-gpt4all-j.bin
Loading model file models/gpt4all-7B/ggml-gpt4all-j.bin
Traceback (most recent call last):
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 1145, in <module>
main()
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 1125, in main
model_plus = load_some_model(args.model)
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 1052, in load_some_model
models_plus.append(lazy_load_file(path))
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 850, in lazy_load_file
return lazy_load_ggml_file(fp, path)
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 790, in lazy_load_ggml_file
text = must_read(fp, length)
File "/Users/stick/work/ggerganov/llama.cpp/convert.py", line 758, in must_read
ret = fp.read(length)
ValueError: read length must be non-negative or -1
The link to the file is in the README of this repository: https://github.com/nomic-ai/gpt4all
Is this related? In the updated README for GPT4All-J it also says:
Note this model is only compatible with the C++ bindings found here. It will not work with any existing llama.cpp bindings as we had to do a large fork of llama.cpp
That's a good find! If that is the case, we are done here and we'll just adjust the readme to say that gpt4all model files are no longer supported.
I'll wait for confirmation by @comex and then proceed with adjusting PR #980.
@prusnak @comex if that's the case, is there anything we can do here? I did notice the bindings they link to are GPL-licensed (probably because of the Qt dependency), so we can't touch any of the code in that repo, but I'm not sure how exactly the model files are incompatible with the current implementation here.
cc @ggerganov
Thank you guys. In my understanding, their Qt GUI is in fact GPL-licensed, while the model weights (which are not part of the UI project) are not. According to the authors here, the model's license is:
GPT4All-J: An Apache-2 Licensed GPT4All Model
Thank you guys. In my understanding, their Qt GUI is in fact GPL-licensed, while the model weights (which are not part of the UI project) are not.
Yep, the model isn't. But I think the cpp bindings in https://github.com/nomic-ai/gpt4all-chat are also GPL-licensed.
I just updated the license. The GUI is now MIT licensed.
Is this related? In the updated README for GPT4All-J it also says:
Note this model is only compatible with the C++ bindings found here. It will not work with any existing llama.cpp bindings as we had to do a large fork of llama.cpp
This isn't relevant to the new gpt4all-j model.
This wouldn't work anyway, as gpt4all-j is based on the GPT-J model, which is supported by ggml but not by llama.cpp (which is only meant for LLaMA models).
Perhaps it would make sense to add some heuristics to the loading script to detect when a user is trying to use it with an incompatible architecture.
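Something along those lines could work. A sketch (not actual llama.cpp code), using the standard vocabulary sizes of the two architectures (32000 for LLaMA, 50400 for GPT-J) as the tell:

import struct

def sniff_architecture(path):
    # Both formats start with the unversioned 'ggml' magic, and in both the
    # first hyperparameter field is n_vocab, so it makes a cheap fingerprint.
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
        if magic != 0x67676D6C:
            return "unknown"
        (n_vocab,) = struct.unpack("<i", f.read(4))
    if n_vocab == 32000:
        return "llama"
    if 50000 <= n_vocab <= 51000:
        return "gpt-j"  # supported by ggml, but not by llama.cpp
    return "unknown"

The loader could then refuse the file early with a clear message instead of failing deep inside the vocabulary reader.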