whisper.cpp
ggml_backend_cuda_buffer_types[].context allocated but not freed
In the ggml_backend_cuda_buffer_type function you have a static array, ggml_backend_cuda_buffer_types[GGML_CUDA_MAX_DEVICES], whose "context" field you fill with allocated memory that is never freed.
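To illustrate the pattern I mean, here is a minimal standalone sketch. It is not the actual ggml code: every name in it (kMaxDevices, buffer_type, buffer_type_context, get_buffer_type, free_buffer_types, g_live_contexts) is a hypothetical stand-in, and the cleanup function is only one possible way to release the contexts.

```cpp
#include <string>

// Hypothetical stand-in for GGML_CUDA_MAX_DEVICES.
constexpr int kMaxDevices = 4;

// Hypothetical per-device context, heap-allocated on first use.
struct buffer_type_context {
    int         device;
    std::string name;
};

// Hypothetical simplified buffer type holding an owning raw pointer.
struct buffer_type {
    buffer_type_context * context = nullptr;
};

static buffer_type g_buffer_types[kMaxDevices];
static bool        g_initialized   = false;
static int         g_live_contexts = 0; // counts contexts not yet freed

// Mirrors the reported pattern: the static array is filled lazily
// with allocated contexts the first time any device is requested.
buffer_type * get_buffer_type(int device) {
    if (device < 0 || device >= kMaxDevices) {
        return nullptr;
    }
    if (!g_initialized) {
        for (int i = 0; i < kMaxDevices; ++i) {
            g_buffer_types[i].context =
                new buffer_type_context{i, "CUDA" + std::to_string(i)};
            ++g_live_contexts;
        }
        g_initialized = true;
    }
    return &g_buffer_types[device];
}

// One possible fix (an assumption, not an existing ggml function):
// an explicit cleanup that releases every context and resets the
// array so that it can be lazily re-initialized later.
void free_buffer_types() {
    for (int i = 0; i < kMaxDevices; ++i) {
        if (g_buffer_types[i].context != nullptr) {
            delete g_buffer_types[i].context;
            g_buffer_types[i].context = nullptr;
            --g_live_contexts;
        }
    }
    g_initialized = false;
}
```

Without something like free_buffer_types() being called before exit, the contexts stay allocated until the process ends, which is what leak checkers report. Storing the contexts by value or in a std::unique_ptr would be another way to avoid the manual delete.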