infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
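Infinity serves embeddings over an OpenAI-compatible `/embeddings` endpoint; a minimal client sketch follows. The base URL, port `7997`, and model name are illustrative assumptions, and the response parsing assumes the OpenAI embeddings response shape (`data[i].embedding`, ordered by `data[i].index`):

```python
import json
from urllib import request


def extract_embeddings(body):
    """Pull embedding vectors out of an OpenAI-style /embeddings response,
    ordered by each item's index."""
    return [item["embedding"]
            for item in sorted(body["data"], key=lambda d: d["index"])]


def embed(texts, base_url="http://localhost:7997",
          model="BAAI/bge-small-en-v1.5"):
    """POST a batch of texts to a running Infinity server.

    base_url, port, and model are placeholders; substitute whatever
    you passed to `v2 --model-id ... --port ...` when starting the server.
    """
    payload = json.dumps({"model": model, "input": texts}).encode()
    req = request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return extract_embeddings(json.load(resp))
```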
### Model description Hi, many thanks for your great work bringing infinity-emb to life; it solved a ton of problems for me and saved me a lot of time! However, I tried...
## Request I am suggesting/requesting a documentation section noting that disabling the vector disk cache can bring a significant performance boost when throughput is particularly high (if I am correct). ## Context...
### System Info When running the Infinity CPU docker image with the optimum engine and an ONNX model, memory usage temporarily spikes very high. For example, with the model...
### System Info
Command: docker compose up
OS Version: linux, ubuntu
Model: intfloat/multilingual-e5-large-instruct
docker compose file:
```
services:
  infinity:
    image: michaelf34/infinity:latest-cpu
    command:
      - v2
      - --engine
      - optimum
      - --model-id...
```
### Feature request Hello, First of all, thank you for developing infinity, an excellent package dedicated to inference for embedding models. I am opening this issue to request support once...
## Description This PR integrates the OpenVINO backend into Infinity's Optimum Embedder class via the [optimum-intel](https://github.com/huggingface/optimum-intel/tree/main) library. ## Related Issue If applicable, link the issue this...
Hi, in my setup I am embedding images in bulk (1000 images/request) with 1 T4 and 40 CPUs on Modal. With the normal embedding call, embedding 1000 images takes **55s**...
### Model description Hi there :) I have been using the `mixedbread-ai/mxbai-rerank-base-v1` served with Infinity via Runpod for some time now. However, mixedbread has released a v2 version: https://huggingface.co/mixedbread-ai/mxbai-rerank-base-v2, for...
When I run
```bash
port=3000
model1=Salesforce/SFR-Embedding-Code-2B_R
volume=$PWD/data

docker run -it --gpus device=0 \
  -v $volume:/app/.cache \
  -p $port:$port \
  michaelf34/infinity:latest \
  v2 \
  --model-id $model1 \
  --port $port \
  --model-warmup...
```