
Add Structured Outputs support to Assistants stream() and create_and_poll() Functions

Open sciencetor2 opened this issue 1 year ago • 1 comments

Confirm this is a feature request for the Python library and not the underlying OpenAI API.

  • [X] This is a feature request for the Python library

Describe the feature or improvement you're requesting

Currently, client.beta.threads.runs.create_and_poll() and client.beta.threads.runs.stream() do not accept a Pydantic model as their response_format. They only accept the older {"type": "json_object"} value.

Additional context

from typing import Optional

from pydantic import BaseModel

class Meal(BaseModel):
    meal: str
    slug: str
    recipe_id: str
    calories_per_serving: int
    protein_per_serving: int
    fat_per_serving: int
    carbs_per_serving: int
    servings: int

class Meals(BaseModel):
    breakfast: Optional[Meal]
    lunch: Optional[Meal]
    dinner: Optional[Meal]

class DayLog(BaseModel):
    date: str  # You can change this to 'date' type if needed
    total_calories: int
    total_carbs: int
    total_fat: int
    total_protein: int
    meals: Meals

class WeekLog(BaseModel):
    Monday: DayLog
    Tuesday: DayLog
    Wednesday: DayLog
    Thursday: DayLog
    Friday: DayLog
    Saturday: DayLog
    Sunday: DayLog

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "my prompt for structured data"},
    ],
    response_format=WeekLog,
)

Currently the above works without issue, but the below throws a TypeError:

assistant = client.beta.assistants.create(
    name="Meal Planner Nutritionist",
    instructions="some instructions",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4o-2024-08-06",
)
thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="my prompt for structured data",
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="repeat instructions",
    response_format=WeekLog,  # passing a Pydantic model class here raises the TypeError
)

And the below works, but isn't usable for my purposes:

assistant = client.beta.assistants.create(
    name="Meal Planner Nutritionist",
    instructions="some instructions",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4o-2024-08-06",
)
thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="my prompt for structured data",
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="repeat instructions",
    response_format={"type": "json_object"},  # accepted, but only guarantees valid JSON, not a schema
)
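
Until the run helpers accept a model class, one possible stopgap is to keep {"type": "json_object"} and validate the returned text on the client side. The sketch below is only an illustration under stated assumptions: parse_week_log and DAYS are hypothetical names, the check covers only the top-level day keys of WeekLog, and with Pydantic v2 available WeekLog.model_validate_json(text) would be the fuller equivalent.

```python
import json

# Top-level keys expected by the WeekLog model above.
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def parse_week_log(text: str) -> dict:
    """Parse assistant output and check that every day of the week is present."""
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [day for day in DAYS if day not in data]
    if missing:
        raise ValueError(f"missing days: {', '.join(missing)}")
    return data
```

This only checks what json_object mode leaves unchecked at the top level; nested fields such as meals would still need their own validation.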

sciencetor2 avatar Sep 09 '24 14:09 sciencetor2

I think I can handle this issue. I will open a PR as soon as possible.

AnneMayor avatar Mar 08 '25 14:03 AnneMayor

Any updates?

For consistency, if we create/update an assistant with a fully specified JSON schema like so:

response_format=json_schema

Then shouldn't we also apply the same conditions when running the thread?

    current_run = await async_openai_client.beta.threads.runs.create_and_poll(
        thread_id=azure_thread_id,
        assistant_id=assistant_id,
        response_format={"type": "json_object"}  # using json_schema here fails
    )
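
For reference, the json_schema form mentioned here wraps a schema in the shape sketched below. This is an illustration, not the library's own helper: to_json_schema_format is a hypothetical name, and the schema dict is assumed to come from somewhere like Pydantic's WeekLog.model_json_schema(). Whether create_and_poll() accepts it may depend on the library version and the backend.

```python
from typing import Any, Dict


def to_json_schema_format(name: str, schema: Dict[str, Any], strict: bool = True) -> Dict[str, Any]:
    """Wrap a JSON Schema dict in the `json_schema` response_format shape."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }


# Tiny hand-written schema standing in for WeekLog.model_json_schema().
example_schema = {
    "type": "object",
    "properties": {"total_calories": {"type": "integer"}},
    "required": ["total_calories"],
    "additionalProperties": False,
}
response_format = to_json_schema_format("day_log", example_schema)
```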

afogarty85 avatar Apr 04 '25 18:04 afogarty85

@afogarty85 I think it makes sense to support both JSON Schema and Pydantic models for extensibility. What do you think?

AnneMayor avatar Apr 05 '25 07:04 AnneMayor

@sciencetor2 I think this issue may already have been resolved by the GitHub bot. I found that create_and_poll() and stream() both call _transform(), which in turn calls _transform_recursive(). I checked that the bot edited _transform_recursive() so that it serializes Pydantic models to JSON. The code is at lines 198-199 of _transform.py. I am attaching it below; please review it and let me know whether you think this issue has already been resolved.

    if isinstance(data, pydantic.BaseModel):
        return model_dump(data, exclude_unset=True, mode="json")

FYI, the bot committed this code in November 2024. The commit ID is d21cd6c0.
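
One point that may be worth checking during review: the quoted branch uses isinstance, which matches model instances, while the original report passes the model class itself (response_format=WeekLog). The sketch below illustrates that difference with a stand-in BaseModel class (an assumption, so the example runs without Pydantic installed). It does not reproduce the library's _transform logic, and describe is a hypothetical helper.

```python
class BaseModel:
    """Stand-in for pydantic.BaseModel, used only for this illustration."""


class WeekLog(BaseModel):
    pass


def describe(value: object) -> str:
    """Report whether `value` is a model instance, a model class, or neither."""
    if isinstance(value, BaseModel):
        return "instance"  # the case an isinstance check on BaseModel matches
    if isinstance(value, type) and issubclass(value, BaseModel):
        return "class"  # the case of response_format=WeekLog
    return "other"


print(describe(WeekLog()))                # instance
print(describe(WeekLog))                  # class
print(describe({"type": "json_object"}))  # other
```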

AnneMayor avatar Apr 05 '25 09:04 AnneMayor

Hi Anne, I'll pull down the latest version and let you know in the next day or so. Thanks!

sciencetor2 avatar Apr 06 '25 17:04 sciencetor2