text-embeddings-inference
C API / C Wrapper API
Feature request
Hi team. While the dockerized container is really helpful for general use cases, for use cases requiring low latency it adds unnecessary I/O and network time. It would be really helpful if you exposed a C API (perhaps as a wrapper over the Rust core) and provided the corresponding dylibs, as done by ONNX, PyTorch, LightGBM, etc.
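For illustration only, here is a minimal sketch of what such an FFI surface could look like, assuming the Rust core is compiled as a `cdylib` (`crate-type = ["cdylib"]` in Cargo.toml). The names (`TeiHandle`, `tei_create`, `tei_embed`, `tei_free`) are hypothetical and not part of the current project, and the function bodies are placeholders rather than real inference code.

```rust
// Hypothetical C-compatible FFI sketch; not the actual text-embeddings-inference API.
use std::ffi::{c_char, CStr};
use std::os::raw::c_float;

/// Opaque handle that would wrap the loaded model/tokenizer state.
pub struct TeiHandle {
    _private: (), // placeholder for internal Rust state (model, tokenizer, thread pool)
}

/// Load a model from a local path and return an opaque handle (NULL on failure).
#[no_mangle]
pub extern "C" fn tei_create(model_path: *const c_char) -> *mut TeiHandle {
    if model_path.is_null() {
        return std::ptr::null_mut();
    }
    let _path = unsafe { CStr::from_ptr(model_path) }.to_string_lossy();
    // A real implementation would initialize the backend for `_path` here.
    Box::into_raw(Box::new(TeiHandle { _private: () }))
}

/// Embed a single UTF-8 string into a caller-provided float buffer.
/// Returns the number of floats written, or -1 on error.
#[no_mangle]
pub extern "C" fn tei_embed(
    handle: *mut TeiHandle,
    text: *const c_char,
    out: *mut c_float,
    out_len: usize,
) -> isize {
    if handle.is_null() || text.is_null() || out.is_null() {
        return -1;
    }
    let _text = unsafe { CStr::from_ptr(text) }.to_string_lossy();
    // A real implementation would tokenize, run inference, and copy the embedding
    // vector into `out` (up to `out_len` floats), returning its length.
    let _ = out_len;
    0 // placeholder: no model is wired up in this sketch
}

/// Release a handle created by `tei_create`.
#[no_mangle]
pub extern "C" fn tei_free(handle: *mut TeiHandle) {
    if !handle.is_null() {
        unsafe { drop(Box::from_raw(handle)) };
    }
}
```

With a corresponding C header, a consumer could then `dlopen`/link the dylib and call these entry points directly, avoiding any network hop, much like the C APIs shipped by ONNX Runtime or LightGBM.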
Motivation
My use case involves computing text embeddings on limited CPU resources (no GPU) in the lowest possible time. Any network-based solution (even one hosted on the same machine) causes:
- Extensive I/O
- Lack of control over the inference service.
Your contribution
I have a limited understanding of Rust, so at present, no.