Sam Stoelinga comments

Results 223 comments of


                                            Sam Stoelinga

issue with parsing model from json when using multiple / in the path

I can reproduce with a simple curl command as well: ``` curl -v http://localhost:8000/openai//v1/completions \ 130 ↵ -H "Content-Type: application/json" \ -d '{"model": "qwen2-500m-cpu", "prompt": "Who was the first president...

issue with parsing model from json when using multiple / in the path

`-L` with curl doesn't work either and returns the 400 error. I will rewrite the integration test to use `-L` as well. Good catch all! Full output: ``` curl -v...

issue with parsing model from json when using multiple / in the path

The root cause seems to be a POST gets auto redirected to GET when using 301: https://datatracker.ietf.org/doc/html/rfc7231#section-6.4.2 We should use a HTTP 307 or 308 to keep it as a...

issue with parsing model from json when using multiple / in the path

It's an issue with curl as well. The weird thing is that my integration test doesn't reproduce it, but I can very much reproduce the 400 error in my local...

issue with parsing model from json when using multiple / in the path

I was able to reproduce in automated testing as well: https://github.com/substratusai/kubeai/actions/runs/11127386385/job/30919572183?pr=259#step:6:593

Ability to provide chat templates to vLLM

In the past I only had to use custom chat templates for specific models.

prepend v1 to OpenAI compatible APIs

Seems the response isn't exactly following the OpenAI response. This is from OpenAI docs: ![image](https://github.com/user-attachments/assets/607def22-7262-4411-b079-7fc2ed12ad69) And this is what Infinity returns: ``` { "object": "embedding", "data": [ { "object": "embedding",...