ggml
Plugin interface for backends
This implements a proposal to add a plugin interface to backends.
There is a demo project which uses the changes from this PR here: https://github.com/iboB/pytorch-ggml-plugin
Notable changes
add set_tensor_external_data to the backend interface
This is as simple as tensor->data = external_data_ptr on CPU and Metal, but requires more steps on CUDA and potentially other backends (creating tensor extra data).
add ggml_backend_cuda_init_plugin
This allows us to initialize the CUDA backend with an external device id, cuBLAS instance, and CUDA stream.
add GGML_PLUGIN config to CMakeLists.txt
This is explicitly not a CMake option: it should be set to ON from the outside when one calls add_subdirectory on ggml for a plugin. It forces a static library build and adds -fPIC.
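To make the first change above concrete, here is a minimal sketch of what a set_tensor_external_data hook on a backend interface could look like for the CPU case. All types and names prefixed with demo_ are hypothetical stand-ins written for this illustration; they are not the real ggml structs or the signatures used in this PR.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tensor; NOT the real ggml_tensor. */
struct demo_tensor {
    void *data;   /* pointer to the tensor's storage */
    void *extra;  /* backend-specific per-tensor data (unused on CPU) */
};

struct demo_backend;

/* Hypothetical backend interface carrying the proposed hook. */
struct demo_backend_i {
    void (*set_tensor_external_data)(struct demo_backend *backend,
                                     struct demo_tensor *tensor,
                                     void *external_data_ptr);
};

struct demo_backend {
    struct demo_backend_i iface;
};

/* CPU-style implementation: alias the caller's memory, no copy and
 * no ownership transfer. A CUDA-style backend would also have to
 * create the tensor's extra data at this point. */
static void demo_cpu_set_tensor_external_data(struct demo_backend *backend,
                                              struct demo_tensor *tensor,
                                              void *external_data_ptr) {
    (void)backend; /* the CPU case needs no backend state */
    tensor->data = external_data_ptr;
}

/* Build a backend whose interface points at the CPU implementation. */
static struct demo_backend demo_cpu_backend(void) {
    struct demo_backend b = { { demo_cpu_set_tensor_external_data } };
    return b;
}
```

In this sketch the tensor only borrows the memory, so the host application (for example the framework that owns the buffer) must keep it alive for as long as the tensor is in use.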
Questions
Tensors whose pointer is set externally with set_tensor_external_data have no buffer. Thus ggml_get_backend and associated ops tensor_get/set will simply crash when called on them.
We could leave it as is, documenting that these ops are not supported for external tensors and that such tensors have null buffers, or we could give them a dummy buffer with the correct backend (which may be tricky if one wants to allocate memory with it).
What about something else?
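As one illustration of the "leave it as is" option, a buffer-dependent op could at least detect an external tensor and report failure instead of dereferencing a null buffer. The sketch below uses hypothetical sketch_ types and functions that only mirror the situation described above; it is not ggml code and not part of this PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real ggml types. */
struct sketch_buffer {
    int backend_id; /* which backend owns the buffer */
};

struct sketch_tensor {
    void *data;                   /* storage pointer */
    struct sketch_buffer *buffer; /* NULL for externally set tensors */
    size_t nbytes;                /* size of the storage in bytes */
};

/* A tensor is "external" here when it has data but no owning buffer. */
static bool sketch_tensor_is_external(const struct sketch_tensor *t) {
    return t->buffer == NULL && t->data != NULL;
}

/* Guarded read: returns false for tensors without a buffer and for
 * out-of-range requests instead of crashing. */
static bool sketch_tensor_get_checked(const struct sketch_tensor *t,
                                      void *dst, size_t offset, size_t size) {
    if (t->buffer == NULL) {
        return false; /* unsupported for external tensors */
    }
    if (offset > t->nbytes || size > t->nbytes - offset) {
        return false; /* request exceeds the tensor's storage */
    }
    memcpy(dst, (const char *)t->data + offset, size);
    return true;
}
```

Whether a check like this belongs in the library or is left to callers is exactly the open question; the sketch only shows that the null-buffer case is cheap to detect.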