Diego Devesa
The problem is that some backends cannot use dummy buffers; they need a buffer object in addition to an offset. For these tensors, the dummy buffer will never be useful....
I have already explained how I intend to implement support for external tensors. ggml-backend is a work in progress and frankly, it is quite annoying to be ignored about changes...
Probably in the way that I suggested earlier, but nothing is set in stone. I will get back to this while implementing support for mmap for llama.cpp. As for the...
Do you get any errors with `compute-sanitizer`? Run `compute-sanitizer ./main -m ..`
`-ngl -1` is effectively the same as `-ngl 0`.
What driver version are you using? Run `nvidia-smi`.
It should work with any value. You could try running the `eval-callback` example with CPU and with full offload and try to see what's the first operation that produces significantly...
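One way to do that comparison, as a rough sketch only: it assumes the per-operation values from the two runs have already been extracted into lists of `(name, value)` pairs in execution order. The function name `first_divergence`, the tolerances, and the sample tensor names are all hypothetical, and the script does not parse the actual `eval-callback` output.

```python
import math


def first_divergence(cpu, gpu, rel_tol=1e-2, abs_tol=1e-3):
    """Return (index, name, cpu_value, gpu_value) for the first operation
    whose values differ beyond the tolerances, or None if none do.

    `cpu` and `gpu` are lists of (name, value) pairs in execution order.
    The data layout and the tolerances are assumptions for illustration.
    """
    for i, ((name_c, val_c), (name_g, val_g)) in enumerate(zip(cpu, gpu)):
        if name_c != name_g:
            raise ValueError(f"operation order differs at {i}: {name_c} vs {name_g}")
        both_nan = math.isnan(val_c) and math.isnan(val_g)
        # math.isclose is False when only one side is NaN, so a NaN that
        # appears in a single run is reported as a divergence.
        if not both_nan and not math.isclose(val_c, val_g, rel_tol=rel_tol, abs_tol=abs_tol):
            return i, name_c, val_c, val_g
    return None


# Made-up values, only to show the call shape.
cpu_run = [("norm-0", 1.0), ("Qcur-0", 2.5), ("kq-0", 10.0)]
gpu_run = [("norm-0", 1.0001), ("Qcur-0", 2.5), ("kq-0", 14.0)]
print(first_divergence(cpu_run, gpu_run))
```

Small differences between CPU and GPU results are expected, so the tolerances probably need tuning per model and data type; the point is only to locate the first operation where the two runs clearly part ways.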
Sorry, eval-callback was broken and the numbers are useless. Please try again with #7184 or after it is merged.
Can you share the full command line that you used to generate the eval-callback logs? What f16 model did you use? It seems to break down at the end of...
My guess is that this is a hardware failure of some sort. Are you using a custom build for these V100s that might not provide enough power or cooling?