gpt4all
Replit Model
Describe your changes
- Script to convert the Hugging Face Replit model to ggml
- ggml Replit model backend + llmodel_c library integration
- Python bindings for the Replit model (these currently do not work due to a whitespace parsing issue)
Issue ticket number and link
Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] I have added thorough documentation for my code.
- [ ] I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
- [ ] If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.
Demo
How did you resolve the prompt template issue? I'm wondering if we want this to go in before or after Tuxifan's dlopen_backend change. Both will require some change to accommodate the other.
When is this PR going to merge? It would be nice to have a local version for that.
@jpzhangvincent there's a large PR consolidating ggml versions currently in the process of being merged into main. This will be going in after that :)
@manyoso @niansa ready for another review
Trying to test locally on top of main; it doesn't rebase onto main cleanly.
Sorry, the build errors aren't Windows-specific. main still seems to have gotten away from this again: there are conflicts (that didn't become actual merge conflicts) with the prompt() deduplication in https://github.com/nomic-ai/gpt4all/pull/822. Replit now needs to implement tokenize and tokenToString: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-backend/llmodel.h#L88-L89
@apage43 Fixed - how does it look now?
Edit: prefacing this with the fact that I'm not a C++ developer :'). Let me know if there are better practices or concerns I should be watching out for here; I've just been trying to match the other model backends as closely as possible.
I did run the Python bindings to make sure all the changes are working.
I tried to convert and run the model locally with the included script and got a failure to load. Is the conversion script out of sync?
```
replit_model_load: loading model from '/Users/aaron/Library/Application Support/nomic.ai/GPT4All//ggml-replit-code-v1-3b-f16.bin' - please wait ...
replit_model_load: n_vocab = 32768
replit_model_load: n_ctx   = 2048
replit_model_load: n_embd  = 2560
replit_model_load: n_head  = 32
replit_model_load: n_layer = 32
replit_model_load: ftype   = 1
replit_model_load: qntvr   = 0
replit_model_load: ggml ctx size = 5600.73 MB
replit_model_load: memory_size = 640.00 MB, n_mem = 65536
replit_model_load: unknown tensor 'transformer.blocks.0.norm_1.weight' in model file
replit_model_load: Replit ERROR: failed to load model from /Users/aaron/Library/Application Support/nomic.ai/GPT4All//ggml-replit-code-v1-3b-f16.bin
```
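For context on the "unknown tensor" error above, ggml-style loaders build a map of expected tensor names from the hyperparameters and reject any serialized name that isn't in it. This is a minimal illustrative sketch only, not the actual backend code, and the expected names below are made up for the demo:

```python
# Illustrative sketch: a strict name-matched loader check. The names in
# this set are hypothetical; the real expected names live in the C++
# Replit backend.
expected_tensors = {
    "transformer.wte.weight",     # hypothetical expected name
    "transformer.norm_f.weight",  # hypothetical expected name
}

def check_tensor(name: str) -> None:
    """Abort the load if a serialized tensor name isn't expected."""
    if name not in expected_tensors:
        raise ValueError(f"unknown tensor '{name}' in model file")

check_tensor("transformer.wte.weight")  # present in the map, loads fine
try:
    # A name from a checkpoint with a different layout fails exactly like
    # the log above.
    check_tensor("transformer.blocks.0.norm_1.weight")
except ValueError as err:
    print(err)
```

So a checkpoint whose tensor names don't match what the converter/backend was written against fails at the first mismatched name, which is why a format change upstream breaks the load outright.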
Hmm just ran conversion script again and is working fine for me. Which OS are you working on?
macOS. I checked out your branch and ran

```shell
python ./gpt4all-backend/scripts/convert_replit_hf_to_ggml.py ~/model/replit-code-v1-3b 1
```

where `~/model/replit-code-v1-3b` is from a clean `git clone https://huggingface.co/replit/replit-code-v1-3b`.
@rguo123 is there a chance that you have an old version of the Replit model? Seems like they updated it to be in the same format as MPT in the last few weeks or so: https://huggingface.co/replit/replit-code-v1-3b/discussions/17
@zanussbaum ah rip, that's probably it. Testing right now; I've added a comment specifying the Hugging Face model version to use with our convert script / model backend.
This likely requires a pretty sizable overhaul of the convert script and backend to get it working again. Would it be better if we could somehow consolidate it with the MPT backend and just have an option to swap out the tokenizer? I made a full feature request for this here: https://github.com/nomic-ai/gpt4all/issues/878.
Given the interest in a code model, I think it would be best to proceed with this PR @apage43 and address this in the near future, unless you see an easy consolidation plan?
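If consolidation does happen, one low-cost option might be a rename shim at conversion time that maps the updated checkpoint's tensor names onto the ones the existing converter expects. Purely a sketch: the `RENAMES` entries here are assumptions, and the real old/new names would need to be checked against both checkpoints before trusting any of this.

```python
import re

# Hypothetical rename table for illustration only. The actual mapping
# between the updated (MPT-style) checkpoint names and the names this
# converter expects would have to be read off both checkpoints.
RENAMES = {
    "norm_1.weight": "ln_1.weight",  # assumed old-format name
    "norm_2.weight": "ln_2.weight",  # assumed old-format name
}

def remap(name: str) -> str:
    """Rewrite a per-block tensor name using RENAMES; leave others untouched."""
    m = re.match(r"^(transformer\.blocks\.\d+\.)(.+)$", name)
    if not m:
        return name
    prefix, suffix = m.groups()
    return prefix + RENAMES.get(suffix, suffix)

print(remap("transformer.blocks.0.norm_1.weight"))
# -> transformer.blocks.0.ln_1.weight (under the assumed mapping)
```

A table like this would let one converter handle both layouts without forking the backend, though it doesn't help if the new checkpoint also changed tensor shapes rather than just names.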
Ah, nuts. Alright, I'm fetching the old version to test. I do think it'd be good to get a code model sooner rather than later, but I also think the "instruct" variant you linked will be a lot more suitable for use in the UI than the original, since the UI is pretty tied to the chat form; this one I expect will be most useful via the bindings.
I also don't want to have to keep code around to support two versions of the same model, so I hope that if we do manage to consolidate them, we can deprecate the old model file.