ggml Plugin interface for backends

Open iboB opened this issue 8 months ago • 22 comments

This implements a proposal to add a plugin interface to backends.

There is a demo project which uses the changes from this PR here: https://github.com/iboB/pytorch-ggml-plugin

Notable changes

add set_tensor_external_data to the backend interface

On CPU and Metal this is as simple as tensor->data = external_data_ptr. CUDA, and potentially other backends, need more steps because they also have to create the tensor's extra data.
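A minimal sketch of the idea in C, not the code from this PR: every demo_* type and function below is a hypothetical stand-in for the real ggml structures, and the fixed-size pool of extras is only an assumption made to keep the example short. It shows a host-style backend that just stores the pointer, and a device-style backend that also has to attach per-tensor bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the real ggml types. */
struct demo_tensor {
    void *data;   /* pointer to the tensor's memory */
    void *extra;  /* backend-specific bookkeeping, NULL if unused */
};

/* Assumed shape of the per-tensor record a device backend might need. */
struct demo_device_extra {
    void *device_ptr;
    int   device_id;
};

struct demo_backend;
typedef void (*demo_set_external_fn)(struct demo_backend *backend,
                                     struct demo_tensor *tensor,
                                     void *external_data_ptr);

#define DEMO_MAX_EXTRAS 8

struct demo_backend {
    const char *name;
    int device_id;
    demo_set_external_fn set_tensor_external_data;
    struct demo_device_extra extras[DEMO_MAX_EXTRAS]; /* toy pool of extras */
    size_t n_extras;
};

/* Host-style backend: storing the pointer is all that is needed. */
static void demo_host_set_external(struct demo_backend *backend,
                                   struct demo_tensor *tensor,
                                   void *external_data_ptr) {
    (void)backend;
    tensor->data = external_data_ptr;
}

/* Device-style backend: also create and attach the tensor's extra data. */
static void demo_device_set_external(struct demo_backend *backend,
                                     struct demo_tensor *tensor,
                                     void *external_data_ptr) {
    assert(backend->n_extras < DEMO_MAX_EXTRAS);
    struct demo_device_extra *extra = &backend->extras[backend->n_extras++];
    extra->device_ptr = external_data_ptr;
    extra->device_id  = backend->device_id;
    tensor->data  = external_data_ptr;
    tensor->extra = extra;
}
```

A caller would go through the function pointer, for example backend->set_tensor_external_data(backend, tensor, ptr), so it never needs to know which of the two variants it is talking to.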

add ggml_backend_cuda_init_plugin

This allows us to initialize the CUDA backend with an external device id, cuBLAS instance, and CUDA stream.

add GGML_PLUGIN config to CMakeLists.txt

This is explicitly not a CMake option. It should be set to ON from the outside when a plugin project calls add_subdirectory on ggml. It forces a static library build and adds -fPIC.
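As a rough illustration, a plugin's own CMakeLists.txt might look like the fragment below. The directory third_party/ggml, the target name my_plugin, and the source file plugin.cpp are all assumptions made for the example; only GGML_PLUGIN and the add_subdirectory call come from the description above.

```cmake
# Hypothetical plugin project; paths and target names are placeholders.
add_library(my_plugin SHARED plugin.cpp)

# Set from the outside, before ggml's CMakeLists.txt is processed.
set(GGML_PLUGIN ON)
add_subdirectory(third_party/ggml)

# ggml is built as a static library with -fPIC, so it can be linked
# into the shared plugin library.
target_link_libraries(my_plugin PRIVATE ggml)
```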

Questions

Tensors whose pointer is set externally with set_tensor_external_data have no buffer. As a result, ggml_get_backend and the associated ops tensor_get/set will simply crash when called on them.

We could leave it as it is and document that these ops are not supported for external tensors and that such tensors have null buffers. Alternatively, we could give them a dummy buffer with the correct backend (which may be tricky if one wants to allocate memory with it).
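To make the first option concrete, here is a small sketch in C of what "not supported, but fails cleanly" could look like. The demo_* names are hypothetical and do not exist in ggml; the sketch assumes an external tensor can be recognized by a null buffer together with a non-null data pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT the real ggml types. */
struct demo_buffer {
    const char *backend_name;
};

struct demo_xtensor {
    void  *data;
    size_t nbytes;
    struct demo_buffer *buffer; /* NULL for externally provided data */
};

/* Assumed convention: data set, but no owning buffer. */
static bool demo_tensor_is_external(const struct demo_xtensor *t) {
    return t->buffer == NULL && t->data != NULL;
}

/* Copy bytes out of a tensor; refuse instead of dereferencing a null buffer. */
static bool demo_tensor_get(const struct demo_xtensor *t, void *dst,
                            size_t offset, size_t size) {
    if (t->buffer == NULL) {
        return false; /* external tensor: operation not supported */
    }
    if (offset > t->nbytes || size > t->nbytes - offset) {
        return false; /* requested range is out of bounds */
    }
    memcpy(dst, (const char *)t->data + offset, size);
    return true;
}
```

The second option, a dummy buffer, would instead keep buffer non-null so that calls like this keep working, at the cost of deciding what that buffer should do when asked to allocate.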

What about something else?

Oct 11 '23 06:10 iboB
