Adapt the response_format closer to OpenAI's format
Feature request
During testing for https://github.com/BerriAI/litellm/pull/7747 , I found that the differences between OpenAI and TGI are more fundamental than just the optionality of the provided schema. The response_format options themselves are quite different between OpenAI and TGI, despite aiming for similar functionality.
To clarify, here's a breakdown of the response formats:
OpenAI offers these response_format types:
- `text`: Plain text response.
- `json_object`: Free-form JSON object.
- `json_schema`: JSON response validated against a schema.

For example, `json_schema` in OpenAI includes a nested `json_schema` attribute:

```json
response_format: {
  "type": "json_schema",
  "json_schema": {
    "name": "some_name",
    "strict": true,
    "schema": ... // the actual json schema
  }
}
```
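As a concrete illustration, this is roughly how that format is used from the OpenAI Python client (a minimal sketch; the model name and the schema are placeholders of my own, not taken from either project):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder schema, only for illustration
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    # OpenAI-style: the schema is nested under the "json_schema" key
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "weather_report",
            "strict": True,
            "schema": weather_schema,
        },
    },
)
print(response.choices[0].message.content)
```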
TGI currently provides these options:
- `regex`: Response matching a regular expression.
- `json`, `json_object`: These are the same and are treated similarly to the `grammar` parameter. They both require a `value` field to contain the JSON schema.

For example, TGI's `json_object` (and `json`) format requires a `value` field:

```json
response_format: {
  "type": "json_object",
  "value": ... // the actual json schema
}
```
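For comparison, the same constraint sent to a TGI chat endpoint currently has to look roughly like this (a sketch using `requests` against a local TGI instance; the URL, model name, and schema are placeholders):

```python
import requests

# Placeholder schema, only for illustration
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
}

payload = {
    "model": "tgi",  # placeholder; TGI serves whichever model it was launched with
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    # TGI-style: the schema goes directly into the "value" field
    "response_format": {"type": "json_object", "value": weather_schema},
}

resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```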
The key difference and point of incompatibility is:
TGI does not directly offer a `json_schema` response_format type like OpenAI does. Instead, it uses `json_object` (or `json`) with a mandatory `value` field for the schema definition. This structural difference causes incompatibility when OpenAI-style `response_format` requests are sent to TGI.
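Until then, anyone targeting both backends has to translate between the two shapes. A minimal sketch of that client-side mapping (the function name and error handling are my own, not part of either API) could look like this:

```python
def openai_to_tgi_response_format(response_format: dict) -> dict | None:
    """Map an OpenAI-style response_format dict onto TGI's current format."""
    fmt_type = response_format.get("type")
    if fmt_type == "text":
        # Plain text needs no grammar constraint on the TGI side.
        return None
    if fmt_type == "json_schema":
        # OpenAI nests the actual schema under json_schema.schema;
        # TGI expects it directly under "value".
        schema = response_format["json_schema"]["schema"]
        return {"type": "json_object", "value": schema}
    if fmt_type == "json_object":
        # OpenAI's free-form JSON mode carries no schema, but TGI's
        # json_object requires one, so there is no direct equivalent.
        raise ValueError("TGI's json_object requires a schema in 'value'")
    raise ValueError(f"Unsupported response_format type: {fmt_type}")
```

Conceptually, the router-side change suggested below would amount to doing this same unwrapping inside TGI instead of in every client.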
Suggested Improvement:
To enhance compatibility with the OpenAI API, TGI should ideally add support for the `json_schema` response_format type, mirroring OpenAI's structure. Implementing this would likely involve a relatively minor modification to TGI's router logic to recognize the `json_schema` type and handle it in a manner consistent with OpenAI.
From a user perspective, having `json_schema` support would make working with both OpenAI and TGI much smoother. It seems like a worthwhile compatibility improvement on its own. If XGrammar is also on the roadmap, I wonder if there might be efficiencies in addressing both areas at the same time?
Motivation
This addition would significantly improve interoperability and reduce the need for users to adapt their code when switching between the OpenAI and TGI APIs.
Your contribution
While I am not a Rust expert, I am willing to help with this enhancement by attempting a pull request.