[Feature Request]: is there a plan to support batch processing in OpenCLIPEmbeddingFunction?

Open gyula-coder opened this issue 1 year ago • 0 comments

Describe the problem

I want to extract embeddings for one hundred thousand images with OpenCLIPEmbeddingFunction, but the images can only be encoded one at a time because of the loop in the code below:

def __call__(self, input: Union[Documents, Images]) -> Embeddings:
    embeddings: Embeddings = []
    for item in input:
        if is_image(item):
            embeddings.append(self._encode_image(cast(Image, item)))
        elif is_document(item):
            embeddings.append(self._encode_text(cast(Document, item)))
    return embeddings

So, is there a plan to support batch processing in OpenCLIPEmbeddingFunction?

Describe the proposed solution

Batch processing should be used by default when embeddings are computed for a collection, so that many images are encoded per model call instead of one.
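
To make the request concrete, here is a rough sketch of the batching logic I have in mind. It is not Chroma's or OpenCLIP's actual API: `embed_in_batches`, `encode_images`, `encode_texts`, and `batch_size` are hypothetical names, and the two encoders are assumed to be callables that take a list of items and return one vector per item. Unlike the current `__call__`, this sketch also assumes every non-image item is text. The idea is to group inputs by type, encode each group in fixed-size batches, and write the results back in the original order.

```python
from typing import Any, Callable, List, Optional, Sequence

Vector = List[float]


def embed_in_batches(
    items: Sequence[Any],
    is_image: Callable[[Any], bool],
    encode_images: Callable[[List[Any]], List[Vector]],  # hypothetical batch encoder
    encode_texts: Callable[[List[Any]], List[Vector]],   # hypothetical batch encoder
    batch_size: int = 64,
) -> List[Optional[Vector]]:
    """Encode items batch by batch while preserving the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Remember where each item came from so results can be put back in place.
    image_idx = [i for i, item in enumerate(items) if is_image(item)]
    # Assumption: anything that is not an image is treated as text.
    text_idx = [i for i, item in enumerate(items) if not is_image(item)]

    out: List[Optional[Vector]] = [None] * len(items)
    for idx_group, encode in ((image_idx, encode_images), (text_idx, encode_texts)):
        for start in range(0, len(idx_group), batch_size):
            batch_idx = idx_group[start:start + batch_size]
            # One encoder call per batch instead of one call per item.
            vectors = encode([items[i] for i in batch_idx])
            for i, vec in zip(batch_idx, vectors):
                out[i] = vec
    return out
```

In the real embedding function, the batch encoders would presumably stack the preprocessed images (or tokenized texts) into a single tensor and run one forward pass per batch. I have not verified how that fits with the existing `_encode_image` and `_encode_text` helpers.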

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

Jul 24 '24 03:07 gyula-coder