text-embeddings-inference
                        Support tokenized input
Feature request
The OpenAI API /embeddings endpoint accepts input as either text (a list of strings) or tokenized input (a list of integer token IDs, or a list of such lists). text-embeddings-inference should also accept lists of integers (token IDs) as input when creating embeddings.
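For reference, a minimal sketch of the two request shapes the OpenAI embeddings contract allows. The endpoint URL, port, and token IDs below are placeholders for illustration, not TEI's actual configuration:
import requests

# Hypothetical TEI base URL exposing an OpenAI-compatible embeddings route;
# the URL, port, and route are placeholders.
url = "http://127.0.0.1:8080/v1/embeddings"

# Text input: a list of strings (already supported by TEI today).
text_request = {"model": "jina-v2-base", "input": ["This is a test query."]}

# Tokenized input: a list of token-ID lists, as the OpenAI API accepts.
# The IDs below are illustrative only. This is what the feature request
# asks TEI to support as well.
token_request = {"model": "jina-v2-base", "input": [[101, 2023, 2003, 1037, 3231, 102]]}

for payload in (text_request, token_request):
    response = requests.post(url, json=payload)
    print(response.status_code, response.json())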
Motivation
Without this feature, tools like langchain-openai do not work out of the box. The OpenAIEmbeddings class tokenizes text client-side and sends token IDs to the embedding endpoint, so the following code throws errors against TEI:
from langchain_openai import OpenAIEmbeddings

base_url = "127.0.0.1"  # base URL of the TEI server's OpenAI-compatible API
api_key = "sk-xxx"

embeddings = OpenAIEmbeddings(  # default class tokenizes the input before sending it
    model="jina-v2-base",
    base_url=base_url,
    openai_api_key=api_key,
)

text = "This is a test query."
query_result = embeddings.embed_query(text)  # fails: TEI rejects the tokenized input
print(query_result)
A workaround is to force LangChain not to tokenize, although it would be cleaner if TEI supported tokens as input:
from typing import Iterable, List, Tuple, Union

from langchain_openai import OpenAIEmbeddings


class CustomOpenAIEmbeddings(OpenAIEmbeddings):
    """OpenAIEmbeddings variant that skips client-side tokenization."""

    def _tokenize(
        self, texts: List[str], chunk_size: int
    ) -> Tuple[Iterable[int], List[Union[List[int], str]], List[int]]:
        # Return the raw strings instead of token IDs so the request body
        # contains text, which TEI accepts.
        _iter = range(0, len(texts), chunk_size)
        tokens = texts
        indices = list(range(len(texts)))
        return _iter, tokens, indices


base_url = "127.0.0.1"  # base URL of the TEI server's OpenAI-compatible API
api_key = "sk-xxx"

embeddings = CustomOpenAIEmbeddings(  # custom class that doesn't tokenize the input
    model="jina-v2-base",
    base_url=base_url,
    openai_api_key=api_key,
)

text = "This is a test query."
query_result = embeddings.embed_query(text)
print(query_result)
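Overriding _tokenize this way keeps langchain-openai's batching logic intact (the iterator and indices are unchanged) while passing the raw strings through, so the request body sent to TEI contains text rather than token IDs and the existing endpoint can handle it.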
Your contribution
I'm not familiar with Rust, so I don't feel comfortable implementing this myself, but I'm happy to help test and document the new functionality.