
How to convert it to the new ggml format

Open FosterG4 opened this issue 1 year ago • 7 comments

  1. First, get the gpt4all model.
  2. Install pyllamacpp (how to install).
  3. Download the llama_tokenizer (Get).
  4. Convert it to the new ggml format.
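
The steps above can be sketched as a small Python helper. This is only an illustration: `convert_gpt4all` is a hypothetical wrapper around the `pyllamacpp-convert-gpt4all` CLI (it is not part of pyllamacpp), and the paths are placeholders for your own files.

```python
import os
import shutil
import subprocess

def convert_gpt4all(model_path, tokenizer_path, out_path):
    """Hypothetical wrapper: validate the inputs, then run the converter CLI."""
    for p in (model_path, tokenizer_path):
        if not os.path.exists(p):
            raise FileNotFoundError(p)
    # The CLI ships with pyllamacpp (step 2: pip install pyllamacpp)
    if shutil.which("pyllamacpp-convert-gpt4all") is None:
        raise RuntimeError("pyllamacpp is not installed (pip install pyllamacpp)")
    subprocess.run(
        ["pyllamacpp-convert-gpt4all", model_path, tokenizer_path, out_path],
        check=True,
    )
```

Checking the paths up front gives a clearer error than letting the converter fail partway through.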

This is the one that has already been converted: here

Convert it with this simple command:

pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin

Now you can use the converted model from Python:

from pyllamacpp.model import Model

# Called with each new piece of text as it is generated
def new_text_callback(text: str):
    print(text, end="")

model = Model(ggml_model='./path/to/gpt4all-converted.bin', n_ctx=512)
# Pass the callback so output is streamed as it is generated
generated_text = model.generate("Once upon a time, ", n_predict=55,
                                new_text_callback=new_text_callback)
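
The callback pattern used above is independent of pyllamacpp: the generator invokes a function on each chunk as it is produced, while the full text is also assembled. A minimal pure-Python sketch (no model required; all names here are illustrative):

```python
def generate_with_callback(chunks, callback):
    """Call `callback` on each chunk as it 'arrives', then return the full text."""
    pieces = []
    for chunk in chunks:
        callback(chunk)        # stream each piece immediately (e.g. print it)
        pieces.append(chunk)
    return "".join(pieces)     # the complete output is still returned at the end

streamed = []
full = generate_with_callback(["Once ", "upon ", "a ", "time"], streamed.append)
# streamed now holds each piece in order; full is the joined text
```

This is why defining `new_text_callback` alone is not enough: it has to be passed to `generate`, or it is never called.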


FosterG4 avatar Apr 04 '23 09:04 FosterG4

@FosterG4 Can you please share the converted model?

zubairahmed-ai avatar Apr 04 '23 19:04 zubairahmed-ai

Seconded.

ifrit98 avatar Apr 05 '23 00:04 ifrit98

Guys, you can follow this guide: https://blog.ouseful.info/2023/04/04/running-gpt4all-on-a-mac-using-python-langchain-in-a-jupyter-notebook. I've tested it on my MacBook Pro and it works perfectly.

SunixLiu avatar Apr 05 '23 01:04 SunixLiu

@FosterG4 Can you please share the converted model?

Yes, you can download it from here: https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin

FosterG4 avatar Apr 06 '23 08:04 FosterG4

I know llama.cpp is designed with CPUs in mind, but is there a Python library to run quantized GGML models on Colab with a GPU for faster results?

chigkim avatar Apr 07 '23 01:04 chigkim

quantized GGML

You can try this: https://bellard.org/ts_server/

FosterG4 avatar Apr 11 '23 14:04 FosterG4

Thank you @FosterG4! But it seems the output quality is not as good as the original gpt4all-lora-quantized.bin.

panpan0000 avatar May 04 '23 11:05 panpan0000