gpt4all
Replit Model
Describe your changes
- Script to convert the Hugging Face Replit model to ggml
- ggml Replit model backend + llmodel_c library integration
- Python bindings for the Replit model (these currently do not work due to a whitespace parsing issue)
Issue ticket number and link
Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] I have added thorough documentation for my code.
- [ ] I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
- [ ] If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.
Demo
How did you resolve the prompt template issue? I'm wondering if we want this to go in before or after Tuxifan's dlopen_backend change. Both will require some change to accommodate the other.
When is this PR going to merge? It would be nice to have a local version for that.
@jpzhangvincent there's a large PR consolidating ggml versions currently in the process of being merged into main. This will be going in after that :)
@manyoso @niansa ready for another review
Trying to test locally on top of main; it doesn't rebase onto main cleanly.
Sorry, the build errors aren't Windows-specific. main still seems to have gotten away from this again: there are conflicts (that didn't become actual merge conflicts) with the prompt() deduplication in https://github.com/nomic-ai/gpt4all/pull/822. Replit now needs to implement tokenize and tokenToString: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-backend/llmodel.h#L88-L89
@apage43 Fixed - how does it look now?
Edit: prefacing this with the fact that I'm not a C++ developer :'). Let me know if there are better practices or concerns I should be watching out for here; I've just been trying to match the other model backends as closely as possible.
I did run the Python bindings to make sure all the changes are working.
I tried to convert and run the model locally with the included script and got a failure to load. Is the conversion script out of sync?
```
replit_model_load: loading model from '/Users/aaron/Library/Application Support/nomic.ai/GPT4All//ggml-replit-code-v1-3b-f16.bin' - please wait ...
replit_model_load: n_vocab = 32768
replit_model_load: n_ctx   = 2048
replit_model_load: n_embd  = 2560
replit_model_load: n_head  = 32
replit_model_load: n_layer = 32
replit_model_load: ftype   = 1
replit_model_load: qntvr   = 0
replit_model_load: ggml ctx size = 5600.73 MB
replit_model_load: memory_size = 640.00 MB, n_mem = 65536
replit_model_load: unknown tensor 'transformer.blocks.0.norm_1.weight' in model file
replit_model_load: Replit ERROR: failed to load model from /Users/aaron/Library/Application Support/nomic.ai/GPT4All//ggml-replit-code-v1-3b-f16.bin
```
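For context on the "unknown tensor" error above, ggml-style loaders build a map of expected tensor names from the hyperparameters and reject any serialized name that isn't in it. This is a minimal illustrative sketch only, not the actual backend code, and the expected names below are made up for the demo:

```python
# Illustrative sketch: a strict name-matched loader check. The names in
# this set are hypothetical; the real expected names live in the C++
# Replit backend.
expected_tensors = {
    "transformer.wte.weight",     # hypothetical expected name
    "transformer.norm_f.weight",  # hypothetical expected name
}

def check_tensor(name: str) -> None:
    """Abort the load if a serialized tensor name isn't expected."""
    if name not in expected_tensors:
        raise ValueError(f"unknown tensor '{name}' in model file")

check_tensor("transformer.wte.weight")  # present in the map, loads fine
try:
    # A name from a checkpoint with a different layout fails exactly like
    # the log above.
    check_tensor("transformer.blocks.0.norm_1.weight")
except ValueError as err:
    print(err)
```

So a checkpoint whose tensor names don't match what the converter/backend was written against fails at the first mismatched name, which is why a format change upstream breaks the load outright.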
Hmm just ran conversion script again and is working fine for me. Which OS are you working on?
macOS. I checked out your branch and ran

```shell
python ./gpt4all-backend/scripts/convert_replit_hf_to_ggml.py ~/model/replit-code-v1-3b 1
```

where `~/model/replit-code-v1-3b` is from a clean `git clone https://huggingface.co/replit/replit-code-v1-3b`.
@rguo123 is there a chance that you have an old version of the Replit model? Seems like they updated it to be in the same format as MPT in the last few weeks or so: https://huggingface.co/replit/replit-code-v1-3b/discussions/17
@zanussbaum ah rip, that's probably it. Testing right now; I've added a comment specifying the Hugging Face model version to use with our convert script / model backend.
This likely requires a pretty sizable overhaul of the convert script and backend to get it working again. Would it be better if we could somehow consolidate it with the MPT backend and just have an option to swap out the tokenizer? I made a full feature request for this here: https://github.com/nomic-ai/gpt4all/issues/878.
Given the interest in a code model, I think it would be best to proceed with this PR @apage43 and address this in the near future, unless you see an easy consolidation plan?
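If consolidation does happen, one low-cost option might be a rename shim at conversion time that maps the updated checkpoint's tensor names onto the ones the existing converter expects. Purely a sketch: the `RENAMES` entries here are assumptions, and the real old/new names would need to be checked against both checkpoints before trusting any of this.

```python
import re

# Hypothetical rename table for illustration only. The actual mapping
# between the updated (MPT-style) checkpoint names and the names this
# converter expects would have to be read off both checkpoints.
RENAMES = {
    "norm_1.weight": "ln_1.weight",  # assumed old-format name
    "norm_2.weight": "ln_2.weight",  # assumed old-format name
}

def remap(name: str) -> str:
    """Rewrite a per-block tensor name using RENAMES; leave others untouched."""
    m = re.match(r"^(transformer\.blocks\.\d+\.)(.+)$", name)
    if not m:
        return name
    prefix, suffix = m.groups()
    return prefix + RENAMES.get(suffix, suffix)

print(remap("transformer.blocks.0.norm_1.weight"))
# -> transformer.blocks.0.ln_1.weight (under the assumed mapping)
```

A table like this would let one converter handle both layouts without forking the backend, though it doesn't help if the new checkpoint also changed tensor shapes rather than just names.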
Ah, nuts. Alright, I'm fetching the old version to test. I do think it'd be good to get a code model sooner rather than later, but I also think the "instruct" variant you linked will be a lot more suitable for use in the UI than the original, since the UI is pretty tied to the chat form; this one I expect will be most useful via the bindings.
I also don't want to have to keep code around to support two versions of the same model, so I hope that if we do manage to consolidate them, we can deprecate the old model file.