infinity icon indicating copy to clipboard operation
infinity copied to clipboard

The request to the /v1/embeddings endpoint returned a 404 status code.

Open monkdharma opened this issue 1 year ago โ€ข 2 comments

Model description

When I loaded the embedding model and tested the request, it returned a 404 status code. Is this because Infinity does not support requests in the form of /v1/embeddings?

curl http://127.0.0.1:8000/v1/embeddings \
    -X POST \
    -H 'Content-Type: application/json' \
   -d '{
  "model": "bge-large-zh-v1.5",
  "embedding_format": "float",
  "input": "What is Deep Learning"
  }'

Open source status

  • [ ] The model implementation is available on transformers
  • [ ] The model weights are available on huggingface-hub
  • [ ] I verified that the model is currently not running in the lastest version pip install infinity_emb[all] --upgrade

Provide useful links for the implementation

No response

monkdharma avatar Jul 15 '24 04:07 monkdharma

The 404 status code indicates that the endpoint /v1/embeddings does not exist. The correct endpoint is /embeddings as defined in the FastAPI server implementation. Update your request URL to http://127.0.0.1:8000/embeddings.

References

/libs/infinity_emb/infinity_emb/engine.py /libs/infinity_emb/infinity_emb/fastapi_schemas/convert.py /libs/infinity_emb/infinity_emb/infinity_server.py /libs/infinity_emb/tests/end_to_end/test_api_with_dummymodel.py /libs/infinity_emb/tests/end_to_end/test_optimum_embedding.py /docs /docs/assets/openapi.json /docs/docs/index.md

About Greptile

This response provides a starting point for your research, not a precise solution.

Help us improve! Please leave a ๐Ÿ‘ if this is helpful and ๐Ÿ‘Ž if it is irrelevant.

Ask Greptile ยท Edit Issue Bot Settings

greptile-apps[bot] avatar Jul 15 '24 04:07 greptile-apps[bot]

@monkdharma Please use the url-prefix feature for `v2 --url-prefix "v1"

michaelfeil avatar Sep 23 '24 17:09 michaelfeil

Just verified setting v2 --url-prefix /v1 works as intended.

Used command to start server:

infinity_emb v2 --url-prefix /v1 --port 8000

Result from curl command from above:

curl http://127.0.0.1:8000/v1/embeddings     -X POST     -H 'Content-Type: application/json'    -d '{
  "model": "bge-large-zh-v1.5",
  "embedding_format": "float",
  "input": "What is Deep Learning"
  }'
{"object":"embedding","data":[{"object":"embedding","embedding":[-0.005...

wirthual avatar Oct 13 '24 18:10 wirthual