text-embeddings-inference

C API / C Wrapper API

Open 0110G opened this issue 6 months ago • 0 comments

Feature request

Hi Team. While the Dockerized container is really helpful for general use cases, it adds unnecessary I/O and network time for use cases that require low latency. It would be really helpful if you exposed a C API (perhaps as a wrapper over the Rust code) and provided the corresponding dylibs, as ONNX, PyTorch, LightGBM, etc. do.
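
To make the request concrete, here is a minimal sketch of what such a wrapper could look like. Everything in it is an assumption for illustration only: `tei_embed`, `TEI_DIM`, and `embed_text` are hypothetical names that do not exist in text-embeddings-inference, and `embed_text` is a toy placeholder standing in for the real model forward pass. It uses the `#[unsafe(no_mangle)]` attribute form, which needs Rust 1.82 or newer.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical fixed embedding width; a real model defines its own.
pub const TEI_DIM: usize = 4;

/// Placeholder for the real model forward pass (NOT actual TEI code):
/// it just sums the input bytes into TEI_DIM buckets.
fn embed_text(text: &str) -> [f32; TEI_DIM] {
    let mut out = [0.0f32; TEI_DIM];
    for (i, b) in text.bytes().enumerate() {
        out[i % TEI_DIM] += f32::from(b);
    }
    out
}

/// Hypothetical C entry point. Writes TEI_DIM floats into `out`.
/// Returns 0 on success, -1 on a null pointer or too-small buffer,
/// and -2 if `text` is not valid UTF-8.
///
/// # Safety
/// `text` must be a valid NUL-terminated string and `out` must point
/// to at least `out_len` writable floats.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn tei_embed(
    text: *const c_char,
    out: *mut f32,
    out_len: usize,
) -> i32 {
    if text.is_null() || out.is_null() || out_len < TEI_DIM {
        return -1;
    }
    // Borrow the caller's C string without copying it.
    let text = match unsafe { CStr::from_ptr(text) }.to_str() {
        Ok(s) => s,
        Err(_) => return -2,
    };
    let emb = embed_text(text);
    // Copy the result into the caller-owned buffer.
    unsafe { std::ptr::copy_nonoverlapping(emb.as_ptr(), out, TEI_DIM) };
    0
}
```

Built with `crate-type = ["cdylib"]` in `Cargo.toml`, a function like this would be callable from C as `int32_t tei_embed(const char *text, float *out, size_t out_len);`, with no HTTP server or container in the request path.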

Motivation

My use case involves getting text embeddings on a limited CPU budget (no GPU) in the lowest possible time. Any network-based solution (even one hosted on the same machine) causes:

  1. Extensive I/O.
  2. Lack of control over the inference service.

Your contribution

I have a limited understanding of Rust, so none at present.

0110G, Aug 06 '24 09:08