llama.cpp
[User] How to specify which CUDA device to use programmatically
Say I have four Nvidia cards and I want to run four models, one on each card, in a single program. The SDK doesn't seem to provide a parameter to specify which CUDA device to run a model on?
Fixed by https://github.com/ggerganov/llama.cpp/pull/1607.
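A minimal sketch of what this enables, assuming a post-#1607 `llama.h` where the per-model GPU controls (`main_gpu`, `split_mode`, `n_gpu_layers`) live in the model params; exact field placement and function signatures have shifted across llama.cpp versions, and the model paths below are placeholders:

```c
#include "llama.h"

int main(void) {
    llama_backend_init();

    // Placeholder model paths; one model per CUDA device.
    const char *paths[4] = {"m0.gguf", "m1.gguf", "m2.gguf", "m3.gguf"};
    struct llama_model *models[4];

    for (int dev = 0; dev < 4; ++dev) {
        struct llama_model_params mp = llama_model_default_params();
        mp.n_gpu_layers = 99;                    // offload all layers to the GPU
        mp.split_mode   = LLAMA_SPLIT_MODE_NONE; // keep the whole model on one GPU
        mp.main_gpu     = dev;                   // pin this model to device `dev`
        models[dev] = llama_load_model_from_file(paths[dev], mp);
    }

    // ... create a context per model and run inference ...

    for (int dev = 0; dev < 4; ++dev) {
        llama_free_model(models[dev]);
    }
    llama_backend_end();
    return 0;
}
```

With `split_mode` set to `LLAMA_SPLIT_MODE_NONE`, `main_gpu` selects the single device a model lives on; the `tensor_split` array serves the complementary case of spreading one model across several GPUs.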
Looks awesome!
This issue was closed because it has been inactive for 14 days since being marked as stale.