Gibberish output on Quadro (Maxwell *and* Turing?)

Open MMirabito opened this issue 1 year ago • 6 comments

Bug Report

Does anyone know why GPT4All sometimes responds in gibberish? The behavior is inconsistent.

Steps to Reproduce

  1. Start GPT4All and select the model “Nous Hermes Mistral DPO”.
  2. Enter the prompt “Sample Java program”.
  3. The response is legible.
  4. Enter the prompt “Sample R program”.
  5. The response is gibberish.

Expected Behavior

The response should be legible. Currently it is gibberish.

Your Environment

  • GPT4All version: GPT4All V2.7.3

  • Operating System: Win 10 Enterprise

  • Chat model used (if applicable): Nous Hermes Mistral DPO but it does not matter

  • CPU: Intel(R) Xeon(R) W-10885M CPU @ 2.40GHz, 2400 MHz, 8 Core(s), 16 Logical Processor(s)

  • 128 GB RAM

  • GPU: Quadro RTX 3000, 6 GB RAM

  • See screenshot 2024-03-20_11-15-34

MMirabito • Mar 20 '24 15:03

Unfortunately, I cannot reproduce the issue on my Tesla P40: Screenshot from 2024-03-22 12-39-32

Have you tried a simple question/answer chat on CPU (e.g. asking the model to do basic math), and compared it to the same session on GPU?
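One way to run that comparison outside the chat window is sketched below. It assumes the gpt4all Python bindings are installed (the report above concerns the desktop application, so this is a stand-in, not the same code path), and the model file name is a placeholder for whichever model you have locally. Outputs will not match word for word, since sampling is not pinned; the point is only to see whether one device produces legible text and the other does not.

```python
# Hypothetical model file name; substitute whichever model you have installed.
MODEL_FILE = "Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
PROMPT = "What is 12 times 12?"


def generate_on(device: str, prompt: str, model_file: str = MODEL_FILE) -> str:
    """Load the model on the given device ("cpu" or "gpu") and answer one prompt."""
    # Imported lazily so the formatting helper below works without gpt4all installed.
    from gpt4all import GPT4All

    model = GPT4All(model_file, device=device)
    return model.generate(prompt, max_tokens=64)


def side_by_side(outputs: dict[str, str]) -> str:
    """Format one labelled line per device so the answers can be compared by eye."""
    return "\n".join(f"[{device}] {text.strip()}" for device, text in outputs.items())


# Usage (may download the model on first run if it is not already present):
# print(side_by_side({d: generate_on(d, PROMPT) for d in ("cpu", "gpu")}))
```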

cebtenzzre • Mar 22 '24 16:03

Hi @cebtenzzre, sorry for my late reply, I just noticed your post.

Perhaps I am an edge case. This is the new prompt, as you suggested: “do this math 12 X 12 then take the result and multiply by 2 then divide by 2”

  • I am seeing the same issue: it just spits out nonsense for 10 minutes or so.
  • I think it is running on the GPU, but I am not sure. In the application settings, I set 8 CPU threads.
  • Restarting the application does not help. I need to do a hard reboot. Then it behaves for a little while before going sideways again.
  • After a reboot, I get anywhere from 9 to 18 tokens per second. It eventually drops to 2 to 4 tokens per second.
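For reference, the prompt above has a single correct answer, which makes it easy to tell a legible response from a broken one. The snippet below is only an illustrative check: the helper names are made up for this example, and matching on the number alone is a crude test.

```python
import re


def expected_answer() -> int:
    """12 x 12 = 144, times 2 = 288, divided by 2 = 144."""
    return 12 * 12 * 2 // 2


def mentions_expected(response: str) -> bool:
    """Return True if any integer in the response equals the expected answer."""
    numbers = {int(n) for n in re.findall(r"\d+", response)}
    return expected_answer() in numbers


print(expected_answer())
```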

Thanks in advance for any ideas, max

See screenshots:

2024-03-31_06-04-32

2024-03-31_05-58-39

2024-03-31_06-03-02

MMirabito • Mar 31 '24 10:03

On the settings page, try setting the device to CPU. I don't believe anyone at Nomic has tried our Vulkan backend on a Quadro card.

Screenshot from 2024-04-01 15-48-29

cebtenzzre • Apr 01 '24 19:04

Hi @cebtenzzre, setting it to CPU works better, thank you for the suggestion. Token speed during the test was between 5 and 6.3 tokens per second: not super fast, but workable. The CPU maxed out at 100%.

Take a look at my prompts (many typos, but it is able to figure out what I am asking for).

Thanks again, max

2024-04-02_07-40-56 2024-04-02_07-45-37 2024-04-02_07-42-00

MMirabito • Apr 02 '24 11:04

Reopening as the underlying issue has not been fixed.

cebtenzzre • Apr 02 '24 15:04

I too have a Quadro card, an M4000. The same gibberish output happens at random. In my case it is just random line noise, with Unicode and other symbols a lot of the time. I have found that switching between a 4 Q and an 8 Q model will 'reset' the GPU so that you can get back to using it. It is also not a context window thing, since it does not happen when the context window resets. It almost always happens within 1 to 3 prompts.

I have also found that turning off 'save chats to disk' helps in this regard.

CurtiusSimplus • Apr 06 '24 01:04

Bug Report

Response is gibberish when using longer prompts with GPU (NVIDIA Quadro K6000). Happens with any model. No problems when using CPU.

Steps to Reproduce

  • Select "Vulkan: Quadro K6000" (CUDA is installed but not selectable)

  • Type "You have access to the comprehensive wiki of a company, which includes information about the company’s history, products, services, policies, and more. Your role is to assist the employees by answering their questions related to the company. You should provide accurate, concise, and helpful responses based on the information available in the company’s wiki."

  • Response (is always changing but stays gibberish): IIIinI#IiIinIinII#IIII#IIIIIIIIIIIIIIIIII#IIIIIIII#inII#III##IIIIIIIIIinIIIIIiIIIIIIIII#IIIIIIII#IIIIinIinIIIIIIIIIIIIIIIIIIIIIIIIIIIIin#inIIIIIIIIIiII#III#IIII#IIII#IIIinIIIIIIIIIIIII#IIIinIIIIIIIIIIIIIIIIIIIIinIIIIiIIIII#IIIIII#I#I#II#IIII#IIIIIIIIIIin##II#inIIIinI##Iin
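Output like the response above can be flagged automatically when testing many prompts. The following is a rough heuristic sketch, not something GPT4All provides: the function name and both thresholds are assumptions chosen for illustration, and legitimate output containing long URLs or repeated symbols could trip it.

```python
from collections import Counter


def looks_degenerate(text: str, max_token_len: int = 40, max_char_share: float = 0.4) -> bool:
    """Flag text with a very long unbroken token or one dominant character."""
    stripped = text.strip()
    if not stripped:
        return False  # empty output is a different failure, not gibberish
    # Longest run of characters without whitespace.
    longest_token = max(len(tok) for tok in stripped.split())
    # Share of the single most frequent non-whitespace character.
    chars = [c for c in stripped if not c.isspace()]
    top_share = Counter(chars).most_common(1)[0][1] / len(chars)
    return longest_token > max_token_len or top_share > max_char_share


# Prefix of the response quoted above.
sample = "IIIinI#IiIinIinII#IIII#IIIIIIIIIIIIIIIIII#IIIIIIII#inII#III##"
print(looks_degenerate(sample))
print(looks_degenerate("The capital of France is Paris."))
```

Combining two signals is a deliberate choice: the missing whitespace catches line noise of mixed symbols, while the dominant-character share catches repetition even when spaces are present.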

  • GPT4All version: 2.8.0

  • Operating System: Windows Server 2022

  • Chat model used (if applicable): any

LLMuser • May 31 '24 14:05