infinity
infinity copied to clipboard
OpenAI API compatibility
Currently the endpoint for embedding models is /embeddings on the infinity server, but I believe the correct open AI format would be /v1/embeddings ? correct me if Im wrong.
Currently the endpoint for embedding models is /embeddings on the infinity server, but I believe the correct open AI format would be /v1/embeddings ? correct me if Im wrong.
--url-prefix /v1
Some people will run it in a canary deployment or behind a load balancer for the /v1 prefix. If you run Infinity outside of k8s, just use --url-prefix /v1. You can see all options available in --help command.