infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
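As a rough illustration of what calling such an embeddings API could look like, here is a minimal Python sketch. It assumes a server running locally on port 7997 (the `INFINITY_PORT` value that appears in one of the issue reports on this page) and an OpenAI-style `POST /embeddings` route with a `data[i].embedding` response shape. The route, the response shape, and the helper names `build_embedding_request` and `embed` are assumptions made for this example, not details confirmed by this page.

```python
import json
import urllib.request

# Assumption: a local server on port 7997 exposing an OpenAI-style
# POST /embeddings route. Adjust both to match your deployment.
BASE_URL = "http://localhost:7997"


def build_embedding_request(texts, model, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, model, base_url=BASE_URL, timeout=30.0):
    """Send the request and return one embedding vector per input text."""
    request = build_embedding_request(texts, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed OpenAI-style response: vectors live under data[i].embedding,
    # and each item carries the index of the input it belongs to.
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

With a server running, `embed(["some text"], model="<model id>")` would return a list containing one vector, where the model id is whichever model the server was started with. Building the request separately from sending it keeps the payload easy to inspect without network access.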
## Related Issue ## Checklist - [ ] I have read the [CONTRIBUTING](https://github.com/michaelfeil/infinity/tree/main?tab=readme-ov-file#contribute-and-develop) guidelines. - [ ] I have added tests to cover my changes. - [ ] I have...
Automated changes by [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action
### System Info image: michaelf34/infinity:0.0.76 configMap: - HF_HOME=/mnt/llm-models - INFINITY_LOG_LEVEL=debug - INFINITY_ANONYMOUS_USAGE_STATS=0 - INFINITY_MODEL_ID="vidore/colpali-v1.2-merged" - INFINITY_BATCH_SIZE=64 - INFINITY_PORT=7997 - INFINITY_DEVICE=cpu command: - infinity_emb - v2 ### Information - [x] Docker...
## Related Issue ## Checklist - [x] I have read the [CONTRIBUTING](https://github.com/michaelfeil/infinity/tree/main?tab=readme-ov-file#contribute-and-develop) guidelines. - [ ] I have added tests to cover my changes. - [ ] I have updated...
## Description Hi there! First off, thank you for `infinity`; it's a fantastic project that makes it quick to set up an OpenAI-compatible API. I was integrating it with Typesense, which recently...
### Feature request `Linq-AI-Research/Linq-Embed-Mistral` is the top open-source model on the MTEB leaderboard (https://huggingface.co/spaces/mteb/leaderboard). ### Motivation https://huggingface.co/Linq-AI-Research/Linq-Embed-Mistral ### Your contribution I am trying to run it with Docker, but it is not working yet.
### Feature request How can we use embedding models trained with SetFit for classification in Infinity? ### Motivation https://github.com/huggingface/setfit is a good library that allows fine-tuning of embedding models...
### System Info Testing https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base ``` INFO 2025-02-11 20:36:37,724 infinity_emb INFO: select_model.py:64 model=`Alibaba-NLP/gte-reranker-modernbert-base` selected, using engine=`torch` and device=`cuda` You are attempting to use Flash Attention 2.0 with a model not...
### System Info 0.0.74 ### Information - [X] Docker + cli - [ ] pip + cli - [ ] pip + usage of Python interface ### Tasks - [X]...
Hello, first of all, nice work! I have been trying to understand the shapes of the ColBERT models. As far as I have seen, _**colbert-ir/colbertv2.0**_ has a dimension of...