llama.cpp
Fixed WSL CUDA OOM error
In WSL, CUDA has a limit on pinned (page-locked) memory, so allocating a large amount of pinned memory fails. However, this can be worked around in the program by falling back to ordinary pageable memory when the pinned allocation fails.
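A minimal sketch of that fallback, assuming the error also needs to be cleared as discussed below (the function name `host_malloc_with_fallback` is hypothetical, not the actual llama.cpp symbol):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Try to allocate pinned host memory; if WSL's pinned-memory limit makes
// that fail, clear the CUDA error state and fall back to pageable memory.
void * host_malloc_with_fallback(size_t size) {
    void * ptr = nullptr;
    cudaError_t err = cudaMallocHost(&ptr, size);
    if (err != cudaSuccess) {
        // Clear the error so later CUDA calls don't see the stale failure;
        // forgetting to do this was the oversight mentioned below.
        (void) cudaGetLastError();
        fprintf(stderr, "warning: pinned allocation of %zu bytes failed, "
                        "falling back to pageable memory\n", size);
        ptr = malloc(size);
    }
    return ptr;
}
```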
I just opened a large PR for multi-GPU support: https://github.com/ggerganov/llama.cpp/pull/1607
This is good; this was the intended behavior when the host malloc fails, and not clearing the error was an oversight.