[BUG]: App crashes with CUDA error in ggml-cuda.cu:1503
Description
CUDA error: operation failed due to a previous error during capture
  current device: 0, in function ggml_cuda_op_mul_mat at D:\a\LLamaSharp\LLamaSharp\ggml\src\ggml-cuda.cu:1503
  cudaGetLastError()
GGML_ASSERT: D:\a\LLamaSharp\LLamaSharp\ggml\src\ggml-cuda.cu:101: !"CUDA error"
From Event Viewer:
<Data Name="ExceptionCode">c0000409</Data>
<Data Name="FaultingOffset">000000000007f6fe</Data>
Reproduction Steps
I'm not sure what causes it. I simply run https://github.com/sangyuxiaowu/LLamaWorker with Meta-Llama-3-8B-Instruct-Q8_0.gguf, and after some requests the app crashes with the error above.
Environment & Configuration
- Operating system: Windows 11 23H2
- .NET runtime version: 8.0
- LLamaSharp version: 0.14
- CUDA version (if you are using cuda backend): 12.5.1
- CPU & GPU device: Ryzen 5800 + RTX 3090
Known Workarounds
No response