Hui Chen

38 comments by Hui Chen

This feature is not supported yet; it would need a pull request.

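This is most likely caused by the service shutting down (or crashing) without calling engine.Close() to close the engine cleanly: https://github.com/huichen/wukong/blob/master/engine/engine.go#L371 If the service cannot be shut down cleanly, consider disabling persistent storage.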

If you have found a bug, please submit a pull request.

I'm getting the following error: `cuBLAS error 14 at ggml-cuda.cu:759` when running `./main -m ~/llms/ggml-vic13b-q5_1.bin -p "hello" -ngl 1`. I followed your instructions for building the binaries. Hardware: 4 x A40-48G cards.

> @huichen Can you do another test? The problem may have been caused by me only using one cuBLAS handle instead of one per GPU. It works now, both with...

Same here. I don't think either is normal. Both were working yesterday. Can someone TAL?

> Not sure about the error, but does setting threads to 1 improve your performance? When offloaded to the gpu the cpu is actually blocking more than helping. Threads to...