Results 69 comments of drbh

Hi @Hojun-Son, I just ran the same command and was able to start a server. It may be a latent networking issue with downloading the model. Also, please make sure...

This PR correctly adds the response type `json_schema` as `{"response_format": {"type": "..."` and requires a value of the expected schema. This change highlights an existing issue with `json_object`, which currently...

This PR has been updated to handle `response_format.type == "json_schema"` formats as shown in the example below.

```python
# model id: meta-llama/Meta-Llama-3.1-8B-Instruct
import requests
import json

# simple person JSON...
```
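The snippet above is cut off by the comment preview; as a rough sketch of what such a request payload could look like, the following builds a `json_schema` response format by hand. The `value` key holding the schema and the exact payload shape are assumptions based on the comment, not a verified API contract.

```python
import json

# Hypothetical JSON schema describing a simple "person" object.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Assumed payload shape for a TGI chat request; the "value" key carrying the
# schema is an assumption, not confirmed against the actual API.
payload = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Describe a person as JSON."}],
    "response_format": {"type": "json_schema", "value": person_schema},
}

# Serialize only; a real client would POST this body with `requests`.
print(json.dumps(payload, indent=2))
```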

Optimistically merging this PR: all tests pass, comments have been addressed, the image has been tested/deployed in production, and it fixes a bug when starting TGI with qwen2-vl. Will...

Hi @jondot, this appears to be an issue with `async-openai`. The error thrown is [JSONDeserialize](https://github.com/64bit/async-openai/blob/6d70a33cd2ccbb011b8a920826eb114325f3703c/async-openai/src/error.rs#L14), which is defined in the client library. Additionally, for reference, I just tested the...

Thanks for the quick response @jondot. I apologize, but I'm not sure I fully understand the issue or how to reproduce it. Would you be able to share an example of a request you're...

Hi @AHEADer, thanks for opening this issue. I just attempted to reproduce this on a machine with L4s. With a single L4, I was unable to run `Qwen/Qwen2-VL-7B-Instruct` with a 20K...

Hi @AHEADer apologies for my mistake above, I misread the L40S as an L4 🤦‍♂️. Fortunately, I believe this issue has actually been resolved in a recent PR after `v3.0.1`....

Hi @martinigoyanes, thank you for reporting this issue. As you noted, there is a difference between the existing function calling and Llama's newer ``-related functions. Currently, when `tools` are...
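For context on what "`tools` are passed" refers to, here is a minimal sketch of a chat request that carries a tool definition in the OpenAI-style format. The function name and parameter schema are hypothetical, invented for illustration only.

```python
import json

# Hypothetical tool definition; "get_weather" and its parameters are made up
# for illustration and do not come from the original discussion.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",
}

# Serialize only; a real client would POST this to the chat endpoint.
print(json.dumps(payload, indent=2))
```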

Hi @lhoestq, I just took a look at how this may be implemented in TGI, and it seems that `outlines` does not support arbitrary JSON, so we would need to...
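One conceivable way to bridge arbitrary JSON to a schema-constrained backend is to infer a schema from a sample document. The sketch below is a naive illustration of that idea, not how TGI or `outlines` actually handles it.

```python
def infer_schema(value):
    """Naively infer a JSON-schema fragment from a sample Python value."""
    # Check bool before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        # Assume homogeneous lists; empty lists get an unconstrained item schema.
        return {"type": "array", "items": infer_schema(value[0]) if value else {}}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": list(value),
        }
    return {}

# Hypothetical sample document, for illustration only.
sample = {"name": "Ada", "age": 36, "tags": ["math"]}
schema = infer_schema(sample)
```

Real-world inference would need to handle nulls, mixed-type arrays, and optional fields, which is part of why supporting truly arbitrary JSON is harder than it first appears.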