
Latest python binding producing random/garbage output for example on Apple M1 Pro

Open mc-borscht opened this issue 1 year ago • 4 comments

Describe the bug

Following installation, chat_completion produces garbage responses on an Apple M1 Pro with Python 3.11.2.

To Reproduce

Steps to reproduce the behavior:

  1. pip3 install gpt4all
  2. Run following sample from https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/README.md
from gpt4all import GPT4All
gptj = GPT4All("ggml-gpt4all-j-v1.3-groovy")
messages = [{"role": "user", "content": "Name 3 colors"}]
response = gptj.chat_completion(messages)

Expected behavior

Ungarbled content from response

Output

>>> gptj = gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy")
Found model file.
gptj_model_load: loading model from '/Users/gbJamMil/.cache/gpt4all/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 5401.45 MB
gptj_model_load: kv self size  =  896.00 MB
gptj_model_load:  done
gptj_model_load: model size =   123.05 MB / num tensors = 1
>>> messages = [{"role": "user", "content": "Name 3 colors"}]
>>> gptj.chat_completion(messages)
### Instruction:
            The prompt below is a question to answer, a task to complete, or a conversation
            to respond to; decide which and write an appropriate response.

### Prompt:
Name 3 colors
### Response:
#+D988F"G/',8:>&!1>GA8AD,D772,"3)*?'!;2+34""F7H37*#33F,"1=--#3"2A'4)%1?*A<A;<373-D,:"<,3&E8E9->12/EDG*A#35'2.6&=9A),<,*"?84#5F'C
{'model': 'ggml-gpt4all-j-v1.3-groovy', 'usage': {'prompt_tokens': 239, 'completion_tokens': 128, 'total_tokens': 367}, 'choices': [{'message': {'role': 'assistant', 'content': '#+D988F"G/\',8:>&!1>GA8AD,D772,"3)*?\'!;2+34""F7H37*#33F,"1=--#3"2A\'4)%1?*A<A;<373-D,:"<,3&E8E9->12/EDG*A#35\'2.6&=9A),<,*"?84#5F\'C'}}]}

Desktop (please complete the following information):

  • OS: macOS 13.0 / Darwin 22.1.0
  • Browser: Firefox

mc-borscht avatar May 15 '23 13:05 mc-borscht

The first time I ran it, the download failed, leaving a corrupted .bin file in my ~/.cache/gpt4all/ folder. When I ran it again, it didn't try to download the model again; it seemed to attempt to generate responses using the corrupted .bin file.

I simply removed the bin file and ran it again, forcing it to re-download the model.
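For anyone who prefers to script that workaround, here is a minimal sketch. It assumes the cache directory and file name shown in the log output above (~/.cache/gpt4all/ggml-gpt4all-j-v1.3-groovy.bin); `remove_cached_model` is a hypothetical helper written for this comment, not part of the gpt4all package.

```python
from pathlib import Path


def remove_cached_model(model_filename: str, cache_dir: Path) -> bool:
    """Delete a cached model file so the next run has to download it again.

    Hypothetical helper, not part of the gpt4all package.
    Returns True if a file was removed, False if there was nothing to remove.
    """
    model_path = cache_dir / model_filename
    if model_path.is_file():
        model_path.unlink()
        return True
    return False


# Example call (assumed default cache location, taken from the log above):
# remove_cached_model("ggml-gpt4all-j-v1.3-groovy.bin",
#                     Path.home() / ".cache" / "gpt4all")
```

After the file is gone, constructing GPT4All again should trigger a fresh download, as described above.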

mortenp1337 avatar May 15 '23 18:05 mortenp1337

@manyoso we should check the md5 hashes prior to allowing model execution

AndriyMulyar avatar May 16 '23 10:05 AndriyMulyar

> @manyoso we should check the md5 hashes prior to allowing model execution

Better yet, check the SHA-256 at the very least. MD5 is past its "best before" date.
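A rough sketch of what such a check could look like in the Python binding, using only the standard library. The function names and the idea of passing in a published digest are assumptions for illustration; this thread does not say where the expected hashes would come from.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, block_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def model_is_intact(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with an expected value (hypothetical input)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A loader could call `model_is_intact` before handing the file to the backend and refuse to run, or re-download, when it returns False.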

imaami avatar May 17 '23 21:05 imaami

I'd agree, MD5 isn't ideal, but for file checksums it's still fine.

For integrity checks, a state-of-the-art crypto hash isn't really the top priority. It's more important for it to run fast, especially with big files like in this case. Also, I'm not sure if it's a good idea to checksum the whole file every time it's loaded. Maybe a "chunking" approach would be better, as recommended e.g. here: https://stackoverflow.com/questions/1177607/what-is-the-fastest-way-to-create-a-checksum-for-large-files-in-c-sharp/1177654#1177654
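To make the speed question concrete, here is a minimal sketch of one possible reading of "chunking": feed the file to the hash in fixed-size chunks so memory use stays flat, with the algorithm left as a parameter so MD5 and SHA-256 can be timed against each other on the same model file. It does not reproduce the linked answer; the function names and the 1 MiB default chunk size are assumptions.

```python
import hashlib
import time


def hash_file_chunked(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def time_hash(path, algorithm):
    """Return the seconds taken to hash the file with the given algorithm."""
    start = time.perf_counter()
    hash_file_chunked(path, algorithm)
    return time.perf_counter() - start
```

Running `time_hash(path, "md5")` and `time_hash(path, "sha256")` on a multi-gigabyte model would show whether the difference matters in practice on a given machine.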

By the way, how is it done in the Qt chat application? Isn't there a way to expose that logic to the bindings? It's probably best to avoid duplicating that.

cosmic-snow avatar May 25 '23 17:05 cosmic-snow

Stale, please open a new, updated issue if this is still relevant to you. You're encouraged to open new issues or even PRs if there's anything you need!

niansa avatar Aug 11 '23 12:08 niansa