
How to convert it to the new ggml format

Open FosterG4 opened this issue 1 year ago • 7 comments

  1. First, get the gpt4all model.
  2. Install pyllamacpp (how to install).
  3. Download the llama_tokenizer (Get).
  4. Convert it to the new ggml format.
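
The steps above can be sketched as a small Python helper. This is only an illustration: `convert_gpt4all` is a hypothetical wrapper around the `pyllamacpp-convert-gpt4all` CLI (it is not part of pyllamacpp), and the paths are placeholders for your own files.

```python
import os
import shutil
import subprocess

def convert_gpt4all(model_path, tokenizer_path, out_path):
    """Hypothetical wrapper: validate the inputs, then run the converter CLI."""
    for p in (model_path, tokenizer_path):
        if not os.path.exists(p):
            raise FileNotFoundError(p)
    # The CLI ships with pyllamacpp (step 2: pip install pyllamacpp)
    if shutil.which("pyllamacpp-convert-gpt4all") is None:
        raise RuntimeError("pyllamacpp is not installed (pip install pyllamacpp)")
    subprocess.run(
        ["pyllamacpp-convert-gpt4all", model_path, tokenizer_path, out_path],
        check=True,
    )
```

Checking the paths up front gives a clearer error than letting the converter fail partway through.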

This is the one that has already been converted: here

Convert it with this simple command:

pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin

Now you can use the converted model from Python:

from pyllamacpp.model import Model

# Called with each new piece of text as it is generated
def new_text_callback(text: str):
    print(text, end="")

model = Model(ggml_model='./path/to/gpt4all-converted.bin', n_ctx=512)
# Pass the callback so output is streamed as it is generated
generated_text = model.generate("Once upon a time, ", n_predict=55,
                                new_text_callback=new_text_callback)
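
The callback pattern used above is independent of pyllamacpp: the generator invokes a function on each chunk as it is produced, while the full text is also assembled. A minimal pure-Python sketch (no model required; all names here are illustrative):

```python
def generate_with_callback(chunks, callback):
    """Call `callback` on each chunk as it 'arrives', then return the full text."""
    pieces = []
    for chunk in chunks:
        callback(chunk)        # stream each piece immediately (e.g. print it)
        pieces.append(chunk)
    return "".join(pieces)     # the complete output is still returned at the end

streamed = []
full = generate_with_callback(["Once ", "upon ", "a ", "time"], streamed.append)
# streamed now holds each piece in order; full is the joined text
```

This is why defining `new_text_callback` alone is not enough: it has to be passed to `generate`, or it is never called.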


FosterG4 avatar Apr 04 '23 09:04 FosterG4

@FosterG4 Can you please share the converted model?

zubairahmed-ai avatar Apr 04 '23 19:04 zubairahmed-ai

Seconded.

ifrit98 avatar Apr 05 '23 00:04 ifrit98

Guys, you can follow this guide: https://blog.ouseful.info/2023/04/04/running-gpt4all-on-a-mac-using-python-langchain-in-a-jupyter-notebook. I've tested it on my MacBook Pro and it works perfectly.

SunixLiu avatar Apr 05 '23 01:04 SunixLiu

@FosterG4 Can you please share the converted model?

Yes, you can download it from here: https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin

FosterG4 avatar Apr 06 '23 08:04 FosterG4

I know llama.cpp is designed with CPUs in mind, but is there a Python library to run quantized GGML models on Colab with a GPU for faster results?

chigkim avatar Apr 07 '23 01:04 chigkim

quantized GGML

You can try this: https://bellard.org/ts_server/

FosterG4 avatar Apr 11 '23 14:04 FosterG4

Thank you @FosterG4! But it seems the output quality is not as good as the original gpt4all-lora-quantized.bin.

panpan0000 avatar May 04 '23 11:05 panpan0000