text-embeddings-inference
C API / C Wrapper API
Feature request
Hi team. While the dockerized container is really helpful for general use cases, for use cases requiring low latency it adds unnecessary I/O and network time. It would be really helpful if you exposed a C API (perhaps as a wrapper over the Rust core) and provided the corresponding dylibs, as done by ONNX, PyTorch, LightGBM, etc.
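For illustration only, here is a minimal sketch of what such an FFI surface could look like, assuming the Rust core is compiled as a `cdylib` (`crate-type = ["cdylib"]` in Cargo.toml). The names (`TeiHandle`, `tei_create`, `tei_embed`, `tei_free`) are hypothetical and not part of the current project, and the function bodies are placeholders rather than real inference code.

```rust
// Hypothetical C-compatible FFI sketch; not the actual text-embeddings-inference API.
use std::ffi::{c_char, CStr};
use std::os::raw::c_float;

/// Opaque handle that would wrap the loaded model/tokenizer state.
pub struct TeiHandle {
    _private: (), // placeholder for internal Rust state (model, tokenizer, thread pool)
}

/// Load a model from a local path and return an opaque handle (NULL on failure).
#[no_mangle]
pub extern "C" fn tei_create(model_path: *const c_char) -> *mut TeiHandle {
    if model_path.is_null() {
        return std::ptr::null_mut();
    }
    let _path = unsafe { CStr::from_ptr(model_path) }.to_string_lossy();
    // A real implementation would initialize the backend for `_path` here.
    Box::into_raw(Box::new(TeiHandle { _private: () }))
}

/// Embed a single UTF-8 string into a caller-provided float buffer.
/// Returns the number of floats written, or -1 on error.
#[no_mangle]
pub extern "C" fn tei_embed(
    handle: *mut TeiHandle,
    text: *const c_char,
    out: *mut c_float,
    out_len: usize,
) -> isize {
    if handle.is_null() || text.is_null() || out.is_null() {
        return -1;
    }
    let _text = unsafe { CStr::from_ptr(text) }.to_string_lossy();
    // A real implementation would tokenize, run inference, and copy the embedding
    // vector into `out` (up to `out_len` floats), returning its length.
    let _ = out_len;
    0 // placeholder: no model is wired up in this sketch
}

/// Release a handle created by `tei_create`.
#[no_mangle]
pub extern "C" fn tei_free(handle: *mut TeiHandle) {
    if !handle.is_null() {
        unsafe { drop(Box::from_raw(handle)) };
    }
}
```

With a corresponding C header, a consumer could then `dlopen`/link the dylib and call these entry points directly, avoiding any network hop, much like the C APIs shipped by ONNX Runtime or LightGBM.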
Motivation
My use case involves computing text embeddings on limited CPU resources (no GPU) in the lowest possible time. Any network-based solution (even one hosted on the same machine) causes:
- Extensive I/O
- Lack of control over the inference service.
Your contribution
I have a limited understanding of Rust, so at present, no.