ytliew82
Results: 2 comments of ytliew82
Tested with Gemma-3 4B and 12B; both hit the same error about not fitting into the device buffer. Currently running CPU-only inference as a workaround, and limiting the -ngl argument to fit into...
I am facing a similar issue while running ollama with deepseek-coder-v2 16b and olmoe 7b, both of which are mixture-of-experts (MoE) language models: The number of work-items in each dimension of...