how to convert it to the new ggml format
- First, get the gpt4all model.
- Install pyllamacpp (`pip install pyllamacpp`).
- Download the llama tokenizer (see the sketch after this list for one way to fetch it).
- Convert it to the new ggml format.
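If you don't already have the LLaMA tokenizer locally, one way to fetch it is with `huggingface_hub`. This is a minimal sketch, not the only way: the repo id below is an assumption, and any repository hosting LLaMA's `tokenizer.model` will do.

```python
# hedged sketch: download LLaMA's SentencePiece tokenizer via huggingface_hub
from huggingface_hub import hf_hub_download

tokenizer_path = hf_hub_download(
    repo_id="decapoda-research/llama-7b-hf",  # assumed repo; substitute your own
    filename="tokenizer.model",
)
print(tokenizer_path)  # pass this path to pyllamacpp-convert-gpt4all below
```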
This is the one that has been converted: here. It was produced with this simple command:
pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin
Now you can use it from Python like this:
from pyllamacpp.model import Model

# callback that receives each new token as it is generated
def new_text_callback(text: str):
    print(text, end="", flush=True)

model = Model(ggml_model='./path/to/gpt4all-converted.bin', n_ctx=512)
# pass the callback so tokens stream to stdout; generate() also returns the full text
generated_text = model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)
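If you want to reuse the loaded model across several prompts, a small REPL-style loop works. A sketch, assuming the same converted-model path as above:

```python
# minimal sketch: load the model once, then prompt it repeatedly
from pyllamacpp.model import Model

model = Model(ggml_model='./path/to/gpt4all-converted.bin', n_ctx=512)

while True:
    prompt = input("You: ")
    if not prompt:  # empty line exits
        break
    model.generate(prompt, n_predict=55,
                   new_text_callback=lambda t: print(t, end="", flush=True))
    print()
```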
@FosterG4 Can you please share the converted model?
Seconded.
Guys, you can follow this: https://blog.ouseful.info/2023/04/04/running-gpt4all-on-a-mac-using-python-langchain-in-a-jupyter-notebook. I've tested it on my MacBook Pro and it works perfectly.
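For reference, the core of that post amounts to wiring the converted model into LangChain's GPT4All wrapper. A minimal sketch, assuming the converted-model path from above:

```python
# hedged sketch of the LangChain route described in the blog post above
from langchain import LLMChain, PromptTemplate
from langchain.llms import GPT4All

template = """Question: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm = GPT4All(model="./path/to/gpt4all-converted.bin")  # assumed local path
chain = LLMChain(prompt=prompt, llm=llm)
print(chain.run("What is a quantized model?"))
```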
> @FosterG4 Can you please share the converted model?
Yes, you can download it from here: https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin
I know llama.cpp is designed with the CPU in mind, but is there a Python library to run quantized GGML models on Colab with a GPU for faster results?
> quantized GGML models on Colab with GPU

You can try this: https://bellard.org/ts_server/
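Another option worth checking is llama-cpp-python: when llama.cpp is built with cuBLAS, its Python bindings can offload layers to the GPU via `n_gpu_layers`. A hedged sketch, assuming a cuBLAS build and the converted-model path from above:

```python
# hedged sketch: llama-cpp-python with GPU offload; requires a cuBLAS build,
# e.g. CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="./path/to/gpt4all-converted.bin",  # assumed path
    n_gpu_layers=32,  # number of layers to offload to the GPU
)
out = llm("Once upon a time, ", max_tokens=55)
print(out["choices"][0]["text"])
```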
Thank you @FosterG4! But the results don't seem as good as with the original gpt4all-lora-quantized.bin.