
Facing issue while working with Groq OpenAI-compatible endpoint

Open • gilada-shubham opened this issue 9 months ago • 3 comments

What happened?

When trying structured output with the Groq OpenAI-compatible API, I get this error:

Traceback (most recent call last):
  File "/Users/shubhamgilada/Developer/github/autogen_test/test_qroq.py", line 46, in <module>
    asyncio.run(_runner())
  File "/Users/shubhamgilada/miniconda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/shubhamgilada/miniconda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Users/shubhamgilada/Developer/github/autogen_test/test_qroq.py", line 36, in _runner
    response = await model_client.create(messages=messages)
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/autogen_ext/models/openai/_openai_client.py", line 622, in create
    result: Union[ParsedChatCompletion[BaseModel], ChatCompletion] = await future
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 437, in parse
    return await self._post(
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 1767, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 1461, in request
    return await self._request(
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 1562, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'response_format.type' : value is not one of the allowed values ['text','json_object']", 'type': 'invalid_request_error'}}
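
For context, the failing call goes through the OpenAI SDK's beta.chat.completions.parse helper (visible in the traceback), which serializes a Pydantic response_format into a json_schema response format; the 400 above says Groq's endpoint only allows text and json_object. A minimal sketch of the mismatch (the payload shapes below are illustrative, not captured from a real request):

# What parse() sends for a Pydantic response_format (illustrative shape):
structured_output_payload = {
    "response_format": {
        "type": "json_schema",  # rejected by Groq: not in ['text', 'json_object']
        "json_schema": {
            "name": "AgentResponse",
            "schema": {},  # JSON Schema derived from the Pydantic model (elided)
            "strict": True,
        },
    }
}

# What Groq's OpenAI-compatible endpoint accepts:
json_mode_payload = {"response_format": {"type": "json_object"}}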

Code to reproduce:

import os
import asyncio
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from typing import Literal
from pydantic import BaseModel
from autogen_core.models import ModelFamily

api_key = os.environ.get("GROQ_API_KEY")


class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


model_client = OpenAIChatCompletionClient(
    base_url="https://api.groq.com/openai/v1",
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    api_key=api_key,
    response_format=AgentResponse,  # Pydantic model for structured output
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "structured_output": True,
        "family": ModelFamily.UNKNOWN,
    },
)
messages = [
    UserMessage(content="I am happy.", source="user"),
]


async def _runner():
    response = await model_client.create(messages=messages)
    assert isinstance(response.content, str)
    parsed_response = AgentResponse.model_validate_json(response.content)
    print(parsed_response.thoughts)
    print(parsed_response.response)

    # Close the connection to the model client.
    await model_client.close()


asyncio.run(_runner())

Which packages was the bug in?

Python Extensions (autogen-ext)

AutoGen library version.

Python 0.5.1

Other library version.

No response

Model used

meta-llama/llama-4-scout-17b-16e-instruct

Model provider

Other (please specify below)

Other model provider

groq

Python version

3.10

.NET version

None

Operating system

macOS

gilada-shubham commented Apr 09 '25 14:04

@gilada-shubham,

A good place to start troubleshooting is to verify whether the Groq endpoint you are using actually supports structured output through the OpenAI library. There is a chance that it is not supported.

Can you confirm whether the model provider and the model itself (Llama 4) support structured outputs using the Pydantic format? For example, does the official OpenAI structured output code below work correctly when using the OpenAI client directly against your Groq endpoint?

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

victordibia commented Apr 10 '25 02:04

@victordibia Thank you for the response

import os
from pydantic import BaseModel
from openai import OpenAI

api_key = os.environ.get("GROQ_API_KEY")
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=api_key,
)


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


completion = client.beta.chat.completions.parse(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
print(event)

I hit the same issue:

Traceback (most recent call last):
  File "/Users/shubhamgilada/Developer/github/autogen_test/test_groq.py", line 20, in <module>
    completion = client.beta.chat.completions.parse(
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 158, in parse
    return self._post(
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 1242, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 919, in request
    return self._request(
  File "/Users/shubhamgilada/miniconda/lib/python3.10/site-packages/openai/_base_client.py", line 1023, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'response_format.type' : value is not one of the allowed values ['text','json_object']", 'type': 'invalid_request_error'}}

I think they support json_output but not structured_output:

import os
from pydantic import BaseModel
from openai import OpenAI
api_key = os.environ.get("GROQ_API_KEY")
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=api_key
)


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


completion = client.beta.chat.completions.parse(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[
        {"role": "system", "content": "Extract the event information in JSON"},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format={ "type": "json_object" }
)

event = completion.choices[0].message
print(event)

This works. Is there any way to use json_output instead?

gilada-shubham commented Apr 10 '25 04:04

I was facing the same problem: passing a Pydantic class directly wasn't working as expected. Instead, I found a workaround using structured JSON output and prompting the model with the expected schema.

Here's a minimal working example that uses a custom system_message to instruct the model to return the output in a specific JSON format:

import asyncio
from pydantic import BaseModel
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from api_keys import GROQ_API_KEY

async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="llama-3.3-70b-versatile",
        model_info={
            "vision": False,
            "function_calling": True,
            "json_output": True,
            "family": "unknown",
            "structured_output": True
        },
        base_url="https://api.groq.com/openai/v1",
        api_key=GROQ_API_KEY,
        response_format={"type": "json_object"},  # Important for structured JSON
    )

    system_message = """
    You are a quiz generator. When given a topic, generate an MCQ quiz with the specified number of questions.
    Return the quiz as JSON with the following format:
    {
        "questions": [
            {
                "question": "The question text",
                "options": ["Option A", "Option B", "Option C", "Option D"],
                "correct_answer": 0,  // Index of the correct answer (0 for Option A, etc.)
                "explanation": "Explanation of why this answer is correct"
            }
        ],
        "total_questions": 3,
        "difficulty": "easy"
    }
    """

    agent = AssistantAgent(
        "quiz_generator",
        model_client=model_client,
        system_message=system_message,
    )

    quiz_request = "Generate a 3-question quiz about Python programming, difficulty level: medium"
    await Console(agent.run_stream(task=quiz_request))

asyncio.run(main())

Output

---------- user ----------
Generate a 3-question quiz about Python programming, difficulty level: medium
---------- quiz_generator ----------
{
   "questions":[
      {
         "question":"What is the purpose of the 'finally' block in a Python try-except statement?",
         "options":[
            "To handle exceptions",
            "To execute code regardless of exceptions",
            "To skip exceptions",
            "To restart the program"
         ],
         "correct_answer":1,
         "explanation":"The 'finally' block is used to execute code regardless of whether an exception occurred or not."
      },
      {
         "question":"How do you create a copy of a list in Python?",
         "options":[
            "list.copy()",
            "list.clone()",
            "list.duplicate()",
            "list.repeat()"
         ],
         "correct_answer":0,
         "explanation":"The correct way to create a copy of a list in Python is by using the list.copy() method or slicing the list with list[:]."
      },
      {
         "question":"What is the difference between 'is' and '==' operators in Python?",
         "options":[
            "'is' checks for equality and '==' checks for identity",
            "'is' checks for identity and '==' checks for equality",
            "'is' checks for type and '==' checks for value",
            "'is' checks for value and '==' checks for type"
         ],
         "correct_answer":1,
         "explanation":"'is' checks if both variables point to the same object in memory (identity), while '==' checks if the values of the variables are equal."
      }
   ],
   "total_questions":3,
   "difficulty":"medium"
}

This approach avoids passing the Pydantic class directly and instead lets the model follow a well-defined schema. Worked well for my use case. Hope this helps!
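
If you also want the type safety of a Pydantic model with this workaround, one option is to validate the agent's final JSON message back into a model after the run. A minimal sketch, where QuizQuestion and Quiz are illustrative models mirroring the schema in the system message above (not verified against Groq):

# Inside main(), after creating the agent as above:
from pydantic import BaseModel

class QuizQuestion(BaseModel):
    question: str
    options: list[str]
    correct_answer: int
    explanation: str

class Quiz(BaseModel):
    questions: list[QuizQuestion]
    total_questions: int
    difficulty: str

# Use run() instead of run_stream() to get the final TaskResult directly.
result = await agent.run(task=quiz_request)
# model_validate_json raises a ValidationError if the JSON doesn't match the schema.
quiz = Quiz.model_validate_json(result.messages[-1].content)
print(quiz.total_questions, quiz.difficulty)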

saswattulo commented Apr 19 '25 11:04