The request to the /v1/embeddings endpoint returned a 404 status code.
Model description
When I loaded the embedding model and tested the request, it returned a 404 status code. Is this because Infinity does not support requests in the form of /v1/embeddings?
curl http://127.0.0.1:8000/v1/embeddings \
-X POST \
-H 'Content-Type: application/json' \
-d '{
"model": "bge-large-zh-v1.5",
"embedding_format": "float",
"input": "What is Deep Learning"
}'
Open source status
- [ ] The model implementation is available on transformers
- [ ] The model weights are available on huggingface-hub
- [ ] I verified that the model is currently not running in the latest version
pip install infinity_emb[all] --upgrade
Provide useful links for the implementation
No response
The 404 status code indicates that the endpoint /v1/embeddings does not exist. The correct endpoint is /embeddings as defined in the FastAPI server implementation. Update your request URL to http://127.0.0.1:8000/embeddings.
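For reference, the corrected request would look like the sketch below. The payload is copied from the original report; this assumes the server is running locally on port 8000 with no URL prefix configured:

```shell
# Same payload as the original request, but pointed at the
# default /embeddings route instead of /v1/embeddings.
curl http://127.0.0.1:8000/embeddings \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "bge-large-zh-v1.5",
    "embedding_format": "float",
    "input": "What is Deep Learning"
  }'
```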
References
/libs/infinity_emb/infinity_emb/engine.py /libs/infinity_emb/infinity_emb/fastapi_schemas/convert.py /libs/infinity_emb/infinity_emb/infinity_server.py /libs/infinity_emb/tests/end_to_end/test_api_with_dummymodel.py /libs/infinity_emb/tests/end_to_end/test_optimum_embedding.py /docs /docs/assets/openapi.json /docs/docs/index.md
About Greptile
This response provides a starting point for your research, not a precise solution.
Help us improve! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
@monkdharma Please use the url-prefix feature: `v2 --url-prefix "/v1"`
Just verified: setting `v2 --url-prefix /v1` works as intended.
Used command to start server:
infinity_emb v2 --url-prefix /v1 --port 8000
Result from curl command from above:
curl http://127.0.0.1:8000/v1/embeddings -X POST -H 'Content-Type: application/json' -d '{
"model": "bge-large-zh-v1.5",
"embedding_format": "float",
"input": "What is Deep Learning"
}'
{"object":"embedding","data":[{"object":"embedding","embedding":[-0.005...