Diego Devesa comments

Results: 361 comments of Diego Devesa

Plugin interface for backends

The problem is that some backends cannot use dummy buffers; they need a buffer object in addition to an offset. For these tensors, the dummy buffer will never be useful....

Plugin interface for backends

I have already explained how I intend to implement support for external tensors. ggml-backend is a work in progress and frankly, it is quite annoying to be ignored about changes...

Plugin interface for backends

Probably in the way that I suggested earlier, but nothing is set in stone. I will get back to this while implementing support for mmap for llama.cpp. As for the...

Significantly different results (and WRONG) inference when GPU is enabled.

Do you get any errors with `compute-sanitizer`? Run `compute-sanitizer ./main -m ..`

Significantly different results (and WRONG) inference when GPU is enabled.

`-ngl -1` is effectively the same as `-ngl 0`.

Significantly different results (and WRONG) inference when GPU is enabled.

What driver version are you using? Run `nvidia-smi`.

Significantly different results (and WRONG) inference when GPU is enabled.

It should work with any value. You could try running the `eval-callback` example with CPU and with full offload and try to see what's the first operation that produces significantly...
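The comparison suggested above can be sketched roughly as follows. This is only an illustration: it assumes the per-operation values from the CPU run and the full-offload run have already been parsed into lists of `(op_name, values)` pairs. The function name `first_divergence`, the tolerances, and the data layout are hypothetical and do not reflect the actual `eval-callback` output format.

```python
import math

def first_divergence(cpu_ops, gpu_ops, rel_tol=1e-2, abs_tol=1e-3):
    """Return (index, op_name) of the first op whose values differ beyond
    the tolerances, or None if every compared op matches.

    cpu_ops / gpu_ops: lists of (op_name, list_of_floats) in execution order.
    This layout is an assumption made for the sketch.
    """
    for i, ((name_a, vals_a), (name_b, vals_b)) in enumerate(zip(cpu_ops, gpu_ops)):
        # A different op name or shape already means the two runs diverged.
        if name_a != name_b or len(vals_a) != len(vals_b):
            return i, name_a
        for a, b in zip(vals_a, vals_b):
            # A NaN on only one side counts as a mismatch.
            if math.isnan(a) != math.isnan(b):
                return i, name_a
            if not math.isnan(a) and not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                return i, name_a
    return None
```

The tolerances are deliberately loose, since small differences between CPU and GPU results are expected; only a significant mismatch should be reported.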

Significantly different results (and WRONG) inference when GPU is enabled.

Sorry, eval-callback was broken and the numbers are useless. Please try again with #7184 or after it is merged.

Significantly different results (and WRONG) inference when GPU is enabled.

Can you share the full command line that you used to generate the eval-callback logs? What f16 model did you use? It seems to break down at the end of...

Significantly different results (and WRONG) inference when GPU is enabled.

My guess is that this is a hardware failure of some sort. Are you using a custom build for these V100s that might not provide enough power or cooling?