text-embeddings-inference
A blazing fast inference solution for text embeddings models
### Model description jina-embeddings-v2-base-code is a multilingual embedding model that supports English and 30 widely used programming languages. Like the other models in the jina-embeddings-v2 series, it supports a sequence length of 8192. jina-embeddings-v2-base-code is based...
### System Info When starting the container with Docker as shown below, I get an error: ``` docker run --gpus all -p 8912:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model ``` I can run...
### System Info Thanks a lot for contributing such a great embedding framework. However, I've encountered a problem in using it and would like to ask for help! I set...
### System Info Tested TEI versions: - v1.2.0 (official Docker) - v1.2.3 (official Docker) - [cc1c510](https://github.com/huggingface/text-embeddings-inference/commit/cc1c510e8d8af8447c01e6b14c417473cf2dfda9) (current main, built on Ubuntu 23.10, cargo 1.75.0) As it already fails during model...
This is necessary to load models whose tokenizers were created with a version released after the breaking change in https://github.com/huggingface/tokenizers/pull/1476 (i.e. >= v0.19.0). Fixes #265 ## Before submitting -...
### Model description >We introduce gte-v1.5 series, upgraded gte embeddings that support the context length of up to 8192, while further enhancing model performance. The models are built upon the...
Updates the name of the token in the README from `HUGGING_FACE_HUB_TOKEN` to `HF_API_TOKEN` to avoid confusion. The former is not used.
### Model description [Improving Text Embeddings with Large Language Models](https://arxiv.org/pdf/2401.00368.pdf). Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei, arXiv 2024 This model has 32 layers and...
### Model description Here is the model description > gte-Qwen1.5-7B-instruct is the latest addition to the gte embedding family. This model has been engineered starting from the [Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B) LLM, drawing...
This PR adds the option value '*' to the `--cors-allow-origin` CLI option so that browser-based apps can use the embedding server directly. This is useful for local deployments of the embeddings inference...