Eval bug: Q4_K_M with vulkan generates garbage/repetitive output
Name and Version
version: 6933 (fcfce040e) built with cc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
Vulkan
Hardware
i5-1135G7
Models
ggml-org/gemma-3-1b-it-Q4_K_M.gguf
Problem description & steps to reproduce
When I run llama-cli with the Vulkan backend, every prompt produces garbage output. The generated text is unrelated to the prompt, and often a short sequence is repeated endlessly.
Note:
- The same model works fine on CPU, so this looks like a Vulkan backend problem.
- Q8_0 and f16 models work fine with the Vulkan backend, so the problem appears to be specific to Q4.
- The output seems to depend on the previous Vulkan run, so it may be an initialization problem.
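The isolation steps in the note above can be scripted so that each combination is run the same way. This is a minimal sketch, not part of the original report. It assumes llama-cli is on PATH, that `-ngl 0` keeps all layers on the CPU while `-ngl 99` offloads all of them, and that the Q8_0 and f16 file names are hypothetical local files (only the Q4_K_M file is named in this report).

```python
from itertools import product

# Hypothetical local file names; only the Q4_K_M model is named in the report.
MODELS = {
    "Q4_K_M": "gemma-3-1b-it-Q4_K_M.gguf",
    "Q8_0": "gemma-3-1b-it-Q8_0.gguf",
    "f16": "gemma-3-1b-it-f16.gguf",
}

# -ngl sets how many layers are offloaded to the GPU backend.
OFFLOAD = {"cpu": "0", "vulkan": "99"}


def build_matrix(prompt="go", n_predict=64):
    """Return {(quant, backend): argv} for every combination to test."""
    matrix = {}
    for (quant, path), (backend, ngl) in product(MODELS.items(), OFFLOAD.items()):
        matrix[(quant, backend)] = [
            "llama-cli", "-m", path, "-ngl", ngl,
            "-p", prompt, "-n", str(n_predict),
        ]
    return matrix


def format_matrix(matrix):
    """Render one shell-style command line per combination."""
    return [f"{quant}/{backend}: {' '.join(argv)}"
            for (quant, backend), argv in matrix.items()]
```

Printing `format_matrix(build_matrix())` gives six command lines to run by hand. Depending on the build, llama-cli may start in interactive mode, so the flags may need adjusting.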
First Bad Commit
No response
Relevant log output
Examples of generated output (truncated):
> go
부드럽게 바꾸기 100% 화이트보드
(필요한 경우)
1. 이름
2. 연락처
3. 이메일
4. 프로필 사진
> go
hubulong 2023 + 2024 to a complex system. This would involve many layers and complex interactions. The project would aim to create a truly dynamic system. The complexity is a core element of the project, ensuring the system isn't static.
> go
April 1st 2024 2024 to 2025 [2025] to 2026 - 2027 - 2028 2029 - 2030 2031 - 2032 to 2033 - 2034 - 2035 - 2036 - 2037 - 2038 - 2039 - 2040 - 2041 - 2042 - 2043 - 2044 - 2045 - 2046 - 2047 - 2048 - 2049 - 2050 - 2051 - 2052 - 2053 - 2054 - 2055 - 2056 - 2057 - 2058 - 2059 - 2060 - 2061 - 2062 - 2063 - 2064 - 2065 - 2066 - 2067 - 2068 - 2069
> go
부활 13th 5th 4th 3rd 2nd 1st 0th 9th 8th 7th 6th 5th 4th 3rd 2nd 1st 0th 9th 8th 7th 6th 5th 4th 3rd 2nd 1st 0th 9th 8th 7th 6th 5th 4th 3rd 2nd 1st 0th 9th 8th 7th
> go
부드러운 2780809699
and 36300356589
and 29314271987
and 54752771472
and 38856176395
and 23827691788
> go
’er-ish’s’ to ‘er’s’ + ‘er’s’ and ‘er’s’ to ‘er’s’ + ‘er’s’ and so on to 9 + 1 to 9 + 1.
This is a complex expression and it's not a standard programming concept. It's a method of generating a long string of characters.
Let's try a simple example:
Input: "hello"
Output: "helloworld"
Input: "world"
Output: "helloworld"
Input: "hello"
Output: "helloworld"
> go
부드러운 1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1
> go
부침과 같은 것과 같은 것과 같은 것
와 같은 것.
**설명:**
이것은 매우 복잡하고 난해한 질문이며, 그에 대한 답을 얻는 것은 매우 어려운 일입니다. 특히 복잡한 수학적 개념과 개념에 대한 이해를 필요로 합니다.
**이 질문의 핵심은 다음과 같습니다.**
> go
ensical 92/112/112-2023-9.
9.26. 3.23. 14.21. 13.53. 17.17. 16.20. 13.93. 16.61. 15.11. 17.30. 14.73. 14.13. 14.12. 14.13. 14.13. 14.13. 14.13. 14.13. 14.13. 14.13. 14.13. 14.13. 14.13.
> go
부드럽게 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95% + 95%
Which driver version do you have installed? With Ubuntu 22.04, it could be quite outdated. I think upgrading it (for example with a PPA) and seeing if the issue persists would be a good first step to figure this out.
Vulkan Instance Version: 1.3.239. I'm not sure this is the version you were looking for. The system is Debian 12.
No, the actual driver version is the most relevant one. You can get it by running vulkaninfo --summary; please post the output here.
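The driver version can also be pulled out of the summary programmatically. The following is a rough sketch that assumes the usual `vulkaninfo --summary` layout, where each device starts with a `GPUn:` line followed by `key = value` lines. The function names are made up for illustration.

```python
import re
import subprocess


def parse_devices(summary: str):
    """Extract per-GPU key/value fields from `vulkaninfo --summary` text."""
    devices = []
    current = None
    for raw in summary.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(GPU\d+):", line)
        if header:
            # A new device block starts here.
            current = {"id": header.group(1)}
            devices.append(current)
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return devices


def driver_versions(summary: str):
    """Map each device name to its driverInfo string (e.g. the Mesa version)."""
    return {d.get("deviceName", d["id"]): d.get("driverInfo", "")
            for d in parse_devices(summary)}


def read_summary():
    """Run vulkaninfo (assumed to be installed) and return its stdout."""
    result = subprocess.run(["vulkaninfo", "--summary"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `driver_versions(read_summary())` on a machine with vulkaninfo installed should list the Mesa version next to each device name.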
========== VULKANINFO
Vulkan Instance Version: 1.3.239
Instance Extensions: count = 20
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
Instance Layers: count = 3
VK_LAYER_INTEL_nullhw INTEL NULL HW 1.1.73 version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.3.211 version 1
VK_LAYER_MESA_overlay Mesa Overlay layer 1.3.211 version 1
Devices:
GPU0:
    apiVersion = 1.3.230
    driverVersion = 22.3.6
    vendorID = 0x8086
    deviceID = 0x9a49
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = Intel(R) Xe Graphics (TGL GT2)
    driverID = DRIVER_ID_INTEL_OPEN_SOURCE_MESA
    driverName = Intel open-source Mesa driver
    driverInfo = Mesa 22.3.6
    conformanceVersion = 1.3.0.0
    deviceUUID = ff258cf4-4865-a82b-e58f-77bffa8e3040
    driverUUID = da807cc5-e5c9-2add-5541-8357feabd0cc
GPU1:
    apiVersion = 1.3.230
    driverVersion = 0.0.1
    vendorID = 0x10005
    deviceID = 0x0000
    deviceType = PHYSICAL_DEVICE_TYPE_CPU
    deviceName = llvmpipe (LLVM 15.0.6, 256 bits)
    driverID = DRIVER_ID_MESA_LLVMPIPE
    driverName = llvmpipe
    driverInfo = Mesa 22.3.6 (LLVM 15.0.6)
    conformanceVersion = 1.3.1.1
    deviceUUID = 6d657361-3232-2e33-2e36-000000000000
    driverUUID = 6c6c766d-7069-7065-5555-494400000000
That is in fact quite old. Please try upgrading the Mesa driver, for example with https://launchpad.net/~kisak/+archive/ubuntu/turtle
I just made a test with an updated version. The result is unchanged.
Details about the new version:
========== VULKANINFO
Vulkan Instance Version: 1.4.309
Instance Extensions: count = 24
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_EXT_headless_surface : extension revision 1
VK_EXT_surface_maintenance1 : extension revision 1
VK_EXT_swapchain_colorspace : extension revision 5
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Instance Layers: count = 3
VK_LAYER_INTEL_nullhw INTEL NULL HW 1.1.73 version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.4.303 version 1
VK_LAYER_MESA_overlay Mesa Overlay layer 1.4.303 version 1
Devices:
GPU0:
    apiVersion = 1.4.305
    driverVersion = 25.0.7
    vendorID = 0x8086
    deviceID = 0x9a49
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = Intel(R) Iris(R) Xe Graphics (TGL GT2)
    driverID = DRIVER_ID_INTEL_OPEN_SOURCE_MESA
    driverName = Intel open-source Mesa driver
    driverInfo = Mesa 25.0.7-2
    conformanceVersion = 1.4.0.0
    deviceUUID = 8680499a-0100-0000-0002-000000000000
    driverUUID = 380ec810-5d64-cbbd-0034-d8c5a796bd6c
GPU1:
    apiVersion = 1.4.305
    driverVersion = 0.0.1
    vendorID = 0x10005
    deviceID = 0x0000
    deviceType = PHYSICAL_DEVICE_TYPE_CPU
    deviceName = llvmpipe (LLVM 19.1.7, 256 bits)
    driverID = DRIVER_ID_MESA_LLVMPIPE
    driverName = llvmpipe
    driverInfo = Mesa 25.0.7-2 (LLVM 19.1.7)
    conformanceVersion = 1.3.1.1
    deviceUUID = 6d657361-3235-2e30-2e37-2d3200000000
    driverUUID = 6c6c766d-7069-7065-5555-494400000000
Thank you for testing that. Can you check if #17108 resolves it? If not, it's most likely the same Intel problem as discussed in #17106.
With GGML_VK_DISABLE_F16 set, the problem is gone.
Version 7003 does not solve the problem.
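For anyone who wants to compare runs with and without the workaround, here is a small sketch of launching llama-cli with the variable set only for the child process. The value `1` is an assumption (the thread only says the variable is set), the helper names are hypothetical, and the llama-cli flags may need adjusting for a given build.

```python
import os
import subprocess


def vulkan_env(disable_f16=True, base=None):
    """Copy the environment, setting or clearing GGML_VK_DISABLE_F16."""
    env = dict(os.environ if base is None else base)
    if disable_f16:
        # Assumed value; the report only states that the variable is set.
        env["GGML_VK_DISABLE_F16"] = "1"
    else:
        env.pop("GGML_VK_DISABLE_F16", None)
    return env


def run_llama(model_path, prompt="go", disable_f16=True):
    """Run llama-cli once (assumed to be on PATH) and return its stdout."""
    argv = ["llama-cli", "-m", model_path, "-ngl", "99",
            "-p", prompt, "-n", "64"]
    result = subprocess.run(
        argv,
        env=vulkan_env(disable_f16),
        stdin=subprocess.DEVNULL,  # do not wait on interactive input
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Running `run_llama(path, disable_f16=False)` and `run_llama(path, disable_f16=True)` back to back gives a direct before/after comparison without changing the shell environment.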
I can reproduce this, though instead of garbage I just get a stream of unused tokens (unused24 if fully on GPU, otherwise random ones). The problem goes away with -ngl 0, at the expense of performance. The environment variable suggested by eiffel31 also works perfectly.
I just tested version 7108, and the wrong behavior is still there.
Yeah, we haven't done anything about that yet. I can take another look, but I didn't find consistent behaviour last time. It may be a driver bug.