gpt4all

Vulkan: Meta-Llama-3.1-8b-128k slow generation.

Open 3Simplex opened this issue 6 months ago • 12 comments

[!NOTE] Until this is fixed, the workaround is to use CPU or CUDA instead.

Bug Report

Vulkan: Meta-Llama-3.1-8b-128k slow generation.

With release 3.1.1 on Vulkan, Meta-Llama-3.1-8b-128k generates extremely slowly (1.5t/s). This is not a problem on CPU.

Steps to Reproduce

1. Use GPT4All 3.1.1 with Vulkan.
2. Chat with Meta-Llama-3.1-8b-128k.
3. Generation is immediately slow (1.5t/s).
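
For anyone who wants to compare backends outside the chat UI, here is a minimal sketch of timing a streamed generation in tokens per second. It assumes the gpt4all Python bindings are installed; the model file name and the `run_gpt4all_benchmark` helper are hypothetical and only for illustration, and the `"kompute"` device string is assumed to select the Vulkan backend. The measured time includes prompt processing, so it will not match the chat UI's t/s readout exactly.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_throughput(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Consume a token stream and return (token_count, tokens_per_second)."""
    start = clock()
    count = 0
    for _ in tokens:
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval (e.g. an empty stream on a coarse clock).
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def run_gpt4all_benchmark(model_file: str, device: str, prompt: str) -> Tuple[int, float]:
    """Hypothetical helper: time one streamed generation on the given device.

    Assumption: `model_file` names a model that the gpt4all bindings can find
    or download, and `device` is a string the installed version accepts
    (for example "cpu", "cuda", or "kompute").
    """
    from gpt4all import GPT4All  # imported here so the timing helper has no dependency

    model = GPT4All(model_file, device=device)
    stream = model.generate(prompt, max_tokens=200, streaming=True)
    return measure_throughput(stream)
```

Calling `run_gpt4all_benchmark("Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf", "kompute", "Tell me a story.")` and then repeating with `"cpu"` would give two comparable numbers; the file name here is an assumption and should be replaced with the model actually installed.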

Expected Behavior

Using the model with llama.cpp directly reports over 60t/s. Using the model with GPT4All before 3.1.1, I could get about 30t/s.

Your Environment

GPT4All version: 3.1.1 (release or web_beta)
Operating System: Windows
Chat model used (if applicable): Vulkan & Meta-Llama-3.1-8b-128k

Jul 29 '24 16:07 3Simplex

Labels

bug, chat, vulkan
