
rwkv_clone_context thread-safety when using cuBLAS

Open eduardsui opened this issue 1 year ago • 0 comments

Hello,

I'm trying to use rwkv.cpp from two different threads. To do this, I load the model once and create two context clones (via rwkv_clone_context), one per thread. Everything works fine when only one thread at a time runs rwkv_eval, but when both threads call it simultaneously, I get this error:

GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)
GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)

It seems that the pool's alloc/free calls from the two contexts get interleaved ("out of order"), which trips the assertion. Any idea how to solve this?
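
To make the suspicion concrete, here is a minimal sketch of how a stack-style pool fails when two users interleave their calls. It assumes, based only on the asserted expression, that the pool is LIFO and shared by both contexts. `StackPool`, `free_block`, and the two helper functions are hypothetical names for illustration and are not ggml or rwkv.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a stack-style (LIFO) buffer pool.
// This is NOT ggml's implementation, only an illustration of the invariant.
struct StackPool {
    std::size_t used = 0;              // bytes currently handed out
    std::size_t capacity = 1 << 20;    // arbitrary size for the sketch

    // Hands out the block at the current top of the stack.
    std::size_t alloc(std::size_t size) {
        assert(used + size <= capacity);
        std::size_t offset = used;
        used += size;
        return offset;
    }

    // Only the most recently allocated block may be freed.
    // Mirrors the shape of the check `ptr == pool_addr + pool_used`.
    bool free_block(std::size_t offset, std::size_t size) {
        if (offset != used - size) {
            return false;              // out-of-order free
        }
        used -= size;
        return true;
    }
};

// Replays the suspected interleaving deterministically on one thread:
// context A allocates, context B allocates, then A frees first.
bool interleaved_free_succeeds() {
    StackPool pool;
    std::size_t a = pool.alloc(256);   // context A
    std::size_t b = pool.alloc(256);   // context B, now on top of A
    (void)b;
    return pool.free_block(a, 256);    // A is not the top block
}

// Possible workaround (assumption): hold one mutex around each whole
// alloc/compute/free step so the two threads never interleave.
int count_failures_serialized(int iterations) {
    StackPool pool;
    std::mutex eval_mutex;
    int failures = 0;                  // only modified while holding eval_mutex
    auto worker = [&] {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> lock(eval_mutex);
            std::size_t offset = pool.alloc(256);
            if (!pool.free_block(offset, 256)) {
                ++failures;
            }
        }
    };
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return failures;
}
```

In this sketch `interleaved_free_succeeds()` returns false, while `count_failures_serialized(n)` returns 0. If the real pool behaves this way, the equivalent in my program would be one shared mutex around every rwkv_eval call, but that serializes evaluation and gives up the parallelism I was after.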

Thanks!

eduardsui, Sep 14 '24 08:09