
Better error message for fireworks invalid response_format


🚀 Describe the new functionality needed

For this request:

    response = client.inference.chat_completion(
        model_id=MODEL_ID,
        messages=[
            {"role": "user", "content": "Hello World"},
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "Plan",
                "description": f"A plan to complete the task of creating a codebase that will {PROGRAM_OBJECTIVE}.",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "steps": {"type": "array", "items": {"type": "string"}},
                    },
                    "required": ["steps"],
                    "additionalProperties": False,
                },
            },
        },
        tools=TOOLS,
    )

We get a 500 error with no useful message for the client:

Traceback (most recent call last):
  File "/Users/aidand/dev/auto-llama/hello7.py", line 74, in <module>
    response = client.inference.chat_completion(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/resources/inference.py", line 218, in chat_completion
    self._post(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1263, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 955, in request
    return self._request(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1043, in _request
    return self._retry_request(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1092, in _retry_request
    return self._request(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1043, in _request
    return self._retry_request(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1092, in _retry_request
    return self._request(
  File "/Users/aidand/dev/auto-llama/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
llama_stack_client.InternalServerError: Error code: 500 - {'detail': 'Internal server error: An unexpected error occurred.'}
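
The generic 500 hides the actual cause. As the Fireworks error further down shows, the provider appears to receive the whole OpenAI-style json_schema wrapper (name/description/strict) as the schema itself and rejects it. For reference, here is a sketch of the request shape that seems to be expected, assuming json_schema takes the bare JSON schema rather than the OpenAI wrapper (untested):

    # Sketch only (untested): assumes llama-stack's json_schema field expects the
    # bare JSON schema rather than the OpenAI-style wrapper object.
    response = client.inference.chat_completion(
        model_id=MODEL_ID,
        messages=[{"role": "user", "content": "Hello World"}],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "type": "object",
                "properties": {
                    "steps": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["steps"],
                "additionalProperties": False,
            },
        },
        tools=TOOLS,
    )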

💡 Why is this needed? What if we don't build it?

  • Users adopting llama-stack will typically be porting existing apps over from OpenAI, so they will run into errors like this one.
  • A clear error message makes that migration much less painful.

One idea is to simply propagate the 400 we get back from Fireworks:

fireworks.client.error.InvalidRequestError: {'error': {'object': 'error', 'type': 'invalid_request_error', 'message': "JSON Schema not supported: could not understand the instance `{'name': 'Plan', 'description': 'A plan to complete the task of creating a codebase that will a web server that has an API endpoint that translates text from English to French..', 'strict': True, 'schema': {'type': 'object', 'properties': {'steps': {'type': 'array', 'items': {'type': 'string'}}}, 'required': ['steps'], 'additionalProperties': False}}`."}}
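
A minimal sketch of what that could look like in the Fireworks inference provider. The class name and self._run_fireworks_completion are illustrative, not the actual adapter code, and this assumes the stack's error translation turns a ValueError into a 400 for the client:

    # Illustrative sketch, not the actual provider implementation.
    from fireworks.client.error import InvalidRequestError

    class FireworksInferenceAdapter:
        async def chat_completion(self, request):
            try:
                # Hypothetical call into the Fireworks SDK.
                return await self._run_fireworks_completion(request)
            except InvalidRequestError as e:
                # Re-raise as a client error so the stack can return a 400 with
                # the provider's message instead of a generic 500.
                raise ValueError(f"Fireworks rejected the request: {e}") from e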

aidando73 · Dec 18, 2024