Vulkan: Meta-Llama-3.1-8b-128k slow generation.
> [!NOTE]
> Until this is fixed, the workaround is to use the CPU or CUDA backend instead.
Bug Report
When using release 3.1.1 with the Vulkan backend, generation with Meta-Llama-3.1-8b-128k is extremely slow (1.5 t/s). This is not a problem on the CPU backend.
Steps to Reproduce
- Using GPT4All 3.1.1 with Vulkan
- Chat with Meta-Llama-3.1-8b-128k
- Generation speed is immediately slow (1.5 t/s)
Expected Behavior
Running the same model with llama.cpp directly reports over 60 t/s. With GPT4All releases before 3.1.1, I could get about 30 t/s.
Your Environment
- GPT4All version: 3.1.1 (release)
- Operating System: Windows
- Chat model used (if applicable): Meta-Llama-3.1-8b-128k (Vulkan backend)