add json generation support for openai
Some sample input/output of the JSON generation looks like the following:
In [3]: class User(BaseModel):
   ...:     name: str
   ...:     last_name: str
   ...:     id: int
   ...:

In [4]: generator = generate.json(model, User)
   ...: result = generator(
   ...:     "Create a user profile with the fields name, last_name and id"
   ...: )

In [5]: result
Out[5]: User(name='John', last_name='Doe', id=12345)
I think this would result in a lot of user complaints about validation errors. Unlike the local models that outlines supports, OpenAI doesn't guarantee the output will match the specified schema.
I think it's fine as long as we return an informative error which states the problem is on OpenAI's side.
I was thinking we should use OpenAI's function calling for this feature.
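Roughly, that would mean forcing a single tool call whose parameters are the Pydantic model's JSON Schema and reading the arguments back. A minimal sketch against the openai SDK (the prompt and model name are illustrative, not from this PR):

import json
import openai
from pydantic import BaseModel

class User(BaseModel):
    name: str
    last_name: str
    id: int

client = openai.OpenAI()
# Force the model to call our single "function"; its parameters are the schema.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Create a user profile."}],
    tools=[{
        "type": "function",
        "function": {"name": "User", "parameters": User.model_json_schema()},
    }],
    tool_choice={"type": "function", "function": {"name": "User"}},
)
# The structured output comes back as the tool call's JSON arguments.
args = resp.choices[0].message.tool_calls[0].function.arguments
user = User(**json.loads(args))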
@lapp0 @rlouf Thanks for the feedback. We can change to use function calling (tool_choice). But I have a question about OpenAI; as you can see from the documentation:
response_format (object, Optional)
An object specifying the format that the model must output. Compatible with [GPT-4 Turbo](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo) and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106.
Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
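For concreteness, a minimal sketch of JSON mode as the quoted docs describe it (model name and prompts are illustrative):

import openai

client = openai.OpenAI()
# JSON mode guarantees syntactically valid JSON, but the instruction to
# produce JSON must still appear in a system or user message.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Reply in JSON."},
        {"role": "user", "content": "Create a user profile with name, last_name and id."},
    ],
)
print(resp.choices[0].message.content)  # valid JSON, but not checked against a schema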
If we set response_format to 'json_object', the response should be a valid JSON string. Did you encounter cases where the API returned an invalid JSON string?
Yes, it happens, but we can return a meaningful error when it does.
@rlouf Thanks for the comments. I will change it to function calling and raise the appropriate warning when an invalid JSON response is received.
@rlouf To use function calling, we may need to add tools and tool_choice to OpenAIConfig. Please confirm whether that's appropriate, thank you.
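Something along these lines, assuming OpenAIConfig stays a frozen dataclass (a sketch only; existing fields elided, names mirror the API parameters):

from dataclasses import dataclass
from typing import Any, Optional

@dataclass(frozen=True)
class OpenAIConfig:
    model: str = ""
    # ... existing sampling fields elided ...
    tools: Optional[list[dict[str, Any]]] = None   # tool definitions, passed through to the API
    tool_choice: Optional[dict[str, Any]] = None   # e.g. {"type": "function", "function": {"name": ...}}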
@rlouf I have changed it to function calling. Would you please help review? Thank you.
Hi! In light of the recent OpenAI Structured Outputs support, could this PR be changed to a "pass-through" solution that uses OpenAI's APIs?
@acatovic Absolutely YES. But I think we should consider compatibility issues. According to the source code at https://github.com/openai/openai-python/blob/main/src/openai/lib/_tools.py#L39, the SDK's native structured output support uses function calling internally; it is just a wrapper around function calling, and the feature only exists in openai >= v1.40.0. I am afraid most users are not on these versions yet.
@rlouf Would you please give some suggestions about what we should do next? Thank you.
Let's use OpenAI's structured output support, which should be easy! We can always raise an exception if the version of the SDK that is installed is < 1.40 and invite users to upgrade to the latest version.
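A sketch of what that guard could look like (assuming packaging is available for the version comparison):

import openai
from packaging.version import Version

# Fail early with an actionable message when the installed SDK predates
# Structured Outputs support.
if Version(openai.__version__) < Version("1.40.0"):
    raise ImportError(
        "OpenAI Structured Outputs require openai>=1.40.0; "
        "please upgrade with `pip install -U openai`."
    )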
@rlouf Bad news. We encountered the following errors:
File /usr/local/lib/python3.12/dataclasses.py:1320, in asdict(obj, dict_factory)
1318 if not _is_dataclass_instance(obj):
1319 raise TypeError("asdict() should be called on dataclass instances")
-> 1320 return _asdict_inner(obj, dict_factory)
File /usr/local/lib/python3.12/dataclasses.py:1330, in _asdict_inner(obj, dict_factory)
1326 elif _is_dataclass_instance(obj):
1327 # fast path for the common case
1328 if dict_factory is dict:
1329 return {
-> 1330 f.name: _asdict_inner(getattr(obj, f.name), dict)
1331 for f in fields(obj)
1332 }
1333 else:
1334 result = []
File /usr/local/lib/python3.12/dataclasses.py:1364, in _asdict_inner(obj, dict_factory)
1359 return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
1360 elif isinstance(obj, (list, tuple)):
1361 # Assume we can create an object of this type by passing in a
1362 # generator (which is not true for namedtuples, handled
1363 # above).
-> 1364 return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
1365 elif isinstance(obj, dict):
1366 if hasattr(type(obj), 'default_factory'):
1367 # obj is a defaultdict, which has a different constructor from
1368 # dict as it requires the default_factory as its first arg.
File /usr/local/lib/python3.12/dataclasses.py:1364, in <genexpr>(.0)
1359 return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
1360 elif isinstance(obj, (list, tuple)):
1361 # Assume we can create an object of this type by passing in a
1362 # generator (which is not true for namedtuples, handled
1363 # above).
-> 1364 return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
1365 elif isinstance(obj, dict):
1366 if hasattr(type(obj), 'default_factory'):
1367 # obj is a defaultdict, which has a different constructor from
1368 # dict as it requires the default_factory as its first arg.
File /usr/local/lib/python3.12/dataclasses.py:1373, in _asdict_inner(obj, dict_factory)
1371 result[_asdict_inner(k, dict_factory)] = _asdict_inner(v, dict_factory)
1372 return result
-> 1373 return type(obj)((_asdict_inner(k, dict_factory),
1374 _asdict_inner(v, dict_factory))
1375 for k, v in obj.items())
1376 else:
1377 return copy.deepcopy(obj)
File /usr/local/lib/python3.12/dataclasses.py:1374, in <genexpr>(.0)
1371 result[_asdict_inner(k, dict_factory)] = _asdict_inner(v, dict_factory)
1372 return result
1373 return type(obj)((_asdict_inner(k, dict_factory),
-> 1374 _asdict_inner(v, dict_factory))
1375 for k, v in obj.items())
1376 else:
1377 return copy.deepcopy(obj)
File /usr/local/lib/python3.12/dataclasses.py:1373, in _asdict_inner(obj, dict_factory)
1371 result[_asdict_inner(k, dict_factory)] = _asdict_inner(v, dict_factory)
1372 return result
-> 1373 return type(obj)((_asdict_inner(k, dict_factory),
1374 _asdict_inner(v, dict_factory))
1375 for k, v in obj.items())
1376 else:
1377 return copy.deepcopy(obj)
TypeError: PydanticFunctionTool.__init__() missing 1 required positional argument: 'model'
and
File ~/test/outlines/outlines/caching.py:104, in cache.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
101 return await cached_function(*args, **kwargs)
103 cache_key = wrapper.__cache_key__(*args, **kwargs)
--> 104 result = wrapper.__memory__.get(cache_key, default=ENOVAL, retry=True)
106 if result is ENOVAL:
107 result = await cached_function(*args, **kwargs)
File /usr/local/lib/python3.12/site-packages/diskcache/core.py:1149, in Cache.get(self, key, default, read, expire_time, tag, retry)
1123 def get(
1124 self,
1125 key,
(...)
1130 retry=False,
1131 ):
1132 """Retrieve value from cache. If `key` is missing, return `default`.
1133
1134 Raises :exc:`Timeout` error when database timeout occurs and `retry` is
(...)
1147
1148 """
-> 1149 db_key, raw = self._disk.put(key)
1150 update_column = EVICTION_POLICY[self.eviction_policy]['get']
1151 select = (
1152 'SELECT rowid, expire_time, tag, mode, filename, value'
1153 ' FROM Cache WHERE key = ? AND raw = ?'
1154 ' AND (expire_time IS NULL OR expire_time > ?)'
1155 )
File ~/test/outlines/outlines/caching.py:20, in CloudpickleDisk.put(self, key)
19 def put(self, key):
---> 20 data = cloudpickle.dumps(key)
21 return super().put(data)
File /usr/local/lib/python3.12/site-packages/cloudpickle/cloudpickle.py:1479, in dumps(obj, protocol, buffer_callback)
1477 with io.BytesIO() as file:
1478 cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1479 cp.dump(obj)
1480 return file.getvalue()
File /usr/local/lib/python3.12/site-packages/cloudpickle/cloudpickle.py:1245, in Pickler.dump(self, obj)
1243 def dump(self, obj):
1244 try:
-> 1245 return super().dump(obj)
1246 except RuntimeError as e:
1247 if len(e.args) > 0 and "recursion" in e.args[0]:
TypeError: cannot pickle 'sqlite3.Connection' object
It seems there are conflicts between PydanticFunctionTool and both outlines.base.vectorize and outlines.caching.cache.
BTW, the native structured output feature only exists in the beta module, e.g. client.beta.chat.completions.parse.
Seems like the error "cannot pickle 'sqlite3.Connection' object" relates to https://github.com/pydantic/pydantic/issues/8232
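If that diagnosis is right, a minimal illustration of the failure mode (it only reproduces inside IPython, whose interactive namespace holds the shell's sqlite3 history connection):

import cloudpickle
from pydantic import BaseModel

# Run this in an IPython cell. pydantic v2 keeps a snapshot of the defining
# scope on the class (__pydantic_parent_namespace__); under IPython that
# snapshot can reach the shell's sqlite3 history connection, so serializing
# the class, as our cache key does, fails.
class User(BaseModel):
    name: str

cloudpickle.dumps(User)  # TypeError: cannot pickle 'sqlite3.Connection' object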
Since OpenAI uses JSON Schema under the hood, we can always use the to_strict_json_schema function directly to convert the Pydantic model into a JSON Schema that is compatible with the OpenAI API, send this schema to the API, and parse the response. That's a little more "manual", but not overwhelmingly so.
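Concretely, that manual route might look like this (an untested sketch; note that to_strict_json_schema is a private helper inside the SDK):

import openai
from openai.lib._pydantic import to_strict_json_schema
from pydantic import BaseModel

class User(BaseModel):
    name: str
    last_name: str
    id: int

client = openai.OpenAI()
# Send the strict JSON Schema directly instead of going through beta .parse().
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Create a user profile."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": User.__name__,
            "strict": True,
            "schema": to_strict_json_schema(User),
        },
    },
)
user = User.model_validate_json(resp.choices[0].message.content)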
@rlouf We can get the JSON Schema using the function to_strict_json_schema, but the problem is that we cannot use the new method client.beta.chat.completions.parse, as you can see from the following exceptions. There are some conflicts between OpenAI's native structured output and our code that still need to be solved.
File /usr/local/lib/python3.12/site-packages/openai/resources/beta/chat/completions.py:330, in AsyncCompletions.parse(self, messages, model, response_format, frequency_penalty, function_call, functions, logit_bias, logprobs, max_tokens, n, parallel_tool_calls, presence_penalty, seed, service_tier, stop, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
320 _validate_input_tools(tools)
322 extra_headers = {
323 "X-Stainless-Helper-Method": "beta.chat.completions.parse",
324 **(extra_headers or {}),
325 }
327 raw_completion = await self._client.chat.completions.create(
328 messages=messages,
329 model=model,
--> 330 response_format=_type_to_response_format(response_format),
331 frequency_penalty=frequency_penalty,
332 function_call=function_call,
333 functions=functions,
334 logit_bias=logit_bias,
335 logprobs=logprobs,
336 max_tokens=max_tokens,
337 n=n,
338 parallel_tool_calls=parallel_tool_calls,
339 presence_penalty=presence_penalty,
340 seed=seed,
341 service_tier=service_tier,
342 stop=stop,
343 stream_options=stream_options,
344 temperature=temperature,
345 tool_choice=tool_choice,
346 tools=tools,
347 top_logprobs=top_logprobs,
348 top_p=top_p,
349 user=user,
350 extra_headers=extra_headers,
351 extra_query=extra_query,
352 extra_body=extra_body,
353 timeout=timeout,
354 )
355 return _parse_chat_completion(
356 response_format=response_format,
357 chat_completion=raw_completion,
358 input_tools=tools,
359 )
File /usr/local/lib/python3.12/site-packages/openai/lib/_parsing/_completions.py:244, in type_to_response_format_param(response_format)
239 # type checkers don't narrow the negation of a `TypeGuard` as it isn't
240 # a safe default behaviour but we know that at this point the `response_format`
241 # can only be a `type`
242 response_format = cast(type, response_format)
--> 244 if not is_basemodel_type(response_format):
245 raise TypeError(f"Unsupported response_format type - {response_format}")
247 return {
248 "type": "json_schema",
249 "json_schema": {
(...)
253 },
254 }
File /usr/local/lib/python3.12/site-packages/openai/lib/_parsing/_completions.py:220, in is_basemodel_type(typ)
219 def is_basemodel_type(typ: type) -> TypeGuard[type[pydantic.BaseModel]]:
--> 220 return issubclass(typ, pydantic.BaseModel)
File <frozen abc>:123, in __subclasscheck__(cls, subclass)
TypeError: issubclass() arg 1 must be a class
It seems the problem lies in the following function, combined with the default None response_format in our code:
https://github.com/openai/openai-python/blob/b143c164678ad7579161448846ce67be62e7f21f/src/openai/lib/_parsing/_completions.py#L230
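A hypothetical guard on our side would be to only hand response_format to the beta helper when it actually is a BaseModel subclass, since type_to_response_format_param calls issubclass() on whatever it receives:

import inspect
import pydantic

def can_use_beta_parse(response_format) -> bool:
    # issubclass() raises TypeError on non-classes (e.g. our default None),
    # so check that we actually hold a class first.
    return inspect.isclass(response_format) and issubclass(
        response_format, pydantic.BaseModel
    )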
@rlouf We can use the following logic to bypass the conflicts and get the expected result:
import openai

if not hasattr(openai, "pydantic_function_tool"):
    raise NotImplementedError(
        "The OpenAI library does not support Native Structured Outputs, "
        "please upgrade to the latest version."
    )

# Construct the tools for the API call manually, to bypass the conflicts
# between cloudpickle and IPython.
tools = [
    {
        "type": "function",
        "function": {
            "name": model.__name__,
            "strict": True,
            "parameters": openai.lib._pydantic.to_strict_json_schema(model),
        },
    }
]
response_format = openai.NOT_GIVEN
config = replace(
    self.config, max_tokens=max_tokens, tools=tools, response_format=response_format
)
response, prompt_tokens, completion_tokens = generate_structured_output(
    prompt, system_prompt, self.client, config
)
# Convert the response to a pydantic object; remove this once the bug between
# pydantic, cloudpickle and IPython is fixed.
response = openai._compat.model_parse_json(model, response)
and the output looks like
In [1]: from pydantic import BaseModel
   ...: import outlines
   ...: from outlines import models, generate
   ...:
   ...: class User(BaseModel):
   ...:     name: str
   ...:     last_name: str
   ...:     id: int
   ...:

In [2]: model = models.openai('gpt-4o-mini',
   ...:     api_key="sk-proj-xxxxxxxxxxxxxxxxxxxxx",
   ...: )

In [3]: generator = generate.json(model, User)
   ...: result = generator(
   ...:     "Create a user profile with the fields name, last_name and id"
   ...: )

In [4]: result
Out[4]: User(name='John', last_name='Doe', id=1)
But I don't think this is an elegant solution, and we will need to refactor it later (once the bugs in the dependencies are solved). @rlouf Would you please give some comments? Should we adopt this as a temporary solution and refactor it later? Thank you.