# infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
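For reference, a minimal sketch of querying a running Infinity server from Python, assuming the OpenAI-compatible `/embeddings` route and the default port 7997; adjust the URL and model id to match your deployment.

```python
import requests

resp = requests.post(
    "http://localhost:7997/embeddings",
    json={
        "model": "BAAI/bge-small-en-v1.5",  # a model served by this instance
        "input": ["Infinity serves embeddings over REST."],
    },
    timeout=30,
)
resp.raise_for_status()
# The response follows the OpenAI embeddings schema: data[i].embedding
vector = resp.json()["data"][0]["embedding"]
print(len(vector))
```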
### System Info
RunPod template
### Information
- [ ] Docker + cli
- [ ] pip + cli
- [ ] pip + usage of Python interface
### Tasks...
## Related Issue
## Checklist
- [ ] I have read the [CONTRIBUTING](https://github.com/michaelfeil/infinity/tree/main?tab=readme-ov-file#contribute-and-develop) guidelines.
- [ ] I have added tests to cover my changes.
- [ ] I have...
### Feature request
There have been discussions about getting decent performance from ColBERT-style models used as rerankers (e.g. https://www.answer.ai/posts/2024-09-16-rerankers.html), and it would be useful if the rerank endpoint could...
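For context, ColBERT-style reranking scores a query against a document with late interaction (MaxSim over token embeddings) rather than a single cross-encoder logit. A minimal sketch of that scoring, assuming L2-normalised token embeddings as inputs:

```python
import numpy as np

def maxsim_score(q_tokens: np.ndarray, d_tokens: np.ndarray) -> float:
    """Late-interaction (MaxSim) score.

    q_tokens: (num_query_tokens, dim), d_tokens: (num_doc_tokens, dim),
    each row an L2-normalised token embedding.
    """
    sim = q_tokens @ d_tokens.T          # (num_query_tokens, num_doc_tokens) cosine similarities
    return float(sim.max(axis=1).sum())  # max over doc tokens, summed over query tokens
```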
### System Info
infinity onnx image, latest
### Information
- [ ] Docker + cli
- [ ] pip + cli
- [ ] pip + usage of Python interface...
### Feature request
Models like https://huggingface.co/BAAI/bge-m3 and https://huggingface.co/jinaai/jina-embeddings-v3 can take extra kwargs as input to the `encode` function, such as `task=...` for Jina v3 or `return_dense=False/True` for bge-m3. It would...
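For reference, this is roughly how the underlying models accept such kwargs when called directly (a sketch assuming the FlagEmbedding package is installed); the request is for Infinity to forward these per-request parameters.

```python
from FlagEmbedding import BGEM3FlagModel

# BGE-M3 exposes model-specific flags on encode(); jina-embeddings-v3
# similarly accepts a task=... argument in its custom encode.
model = BGEM3FlagModel("BAAI/bge-m3")
out = model.encode(
    ["What does Infinity serve?"],
    return_dense=True,    # dense sentence embedding
    return_sparse=False,  # skip lexical weights here
)
print(out["dense_vecs"].shape)
```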
### Feature request
Is there a way to receive the embeddings back in BQ (binary quantization) format? Right now, I receive the full-precision embedding and quantize it in the client, but...
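A sketch of the client-side binary quantization described above, assuming a float32 embedding returned by the API: threshold each dimension at zero and pack the bits, shrinking `dim` float32 values to `dim / 8` bytes.

```python
import numpy as np

embedding = np.random.randn(1024).astype(np.float32)  # stand-in for an API response vector
binary = np.packbits(embedding > 0)                    # 1024 floats -> 128 uint8 bytes
print(binary.shape, binary.dtype)                      # (128,) uint8
```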
### System Info
latest, any platform
### Information
- [ ] Docker + cli
- [ ] pip + cli
- [ ] pip + usage of Python interface
###...
### Model description
I have a custom SentenceTransformer model that is a custom class (and also quite nested), so at the top level the modules.json file looks like ``` [...
@tjtanaa FYI, continued by merging your branch into this and main.
There is a need to add a contribution.md file, so that anyone who wants to contribute has an idea of what steps to follow. I want to work...