Structured outputs issues with open source models
- [✓] This is actually a bug report.
- [✓] I am not getting good LLM Results
- [✓] I have tried asking for help in the community on discord or discussions and have not received a response.
- [✓] I have tried searching the documentation and have not found an answer.
What Model are you using?
- [ ] gpt-3.5-turbo
- [ ] gpt-4-turbo
- [ ] gpt-4
- [✓] Other (please specify)
Describe the bug I was trying to design an agent that needs to receive structured outputs from an LLM. I wanted to use open-source models because I eventually want to self-host the models I use. I ran my code with models from OpenRouter, since they host a large collection of open-source models available for free, but I am unable to get structured outputs from the models that claim to support them. I have tested a variety of the available models, such as:
- Llama 4 Maverick
- Qwen3 235B
- Kimi K2, etc.
I have tried changing the output mode in instructor by switching between the TOOLS, JSON, and OPENROUTER_STRUCTURED_OUTPUTS modes.
I have compiled a full experiment report with all the models and modes I have used in this sheet: https://docs.google.com/spreadsheets/d/1FtgdizEXzVdIW9htZo0T1gzU_0qA9v-We0_eUyKPhZk/edit?usp=sharing.
I have a couple of questions:
- The models I have tested claim to support structured outputs in one way or another, but I am not able to get structured outputs through instructor. What might be the reason?
- A few models run as expected in one mode but not in others. Is there proper documentation of the best settings for each model type? I have seen this table: https://python.useinstructor.com/modes-comparison/#mode-compatibility-table, but it focuses mostly on the closed-source OpenAI, Anthropic, and Google models, with no information about Llama, Qwen, Kimi, GLM, DeepSeek, etc. Could we have such tables for them?
To Reproduce Here is the minimal code I used to run the structured-output tests linked in the sheet: https://github.com/aritroCoder/code_snippets/blob/main/instructor_test.py. Just add an OpenRouter key and change the model and mode to try it yourself.
Expected behavior Models that claim to support structured outputs should run and return a structured output.
If I am missing something or am wrong somewhere, please feel free to correct me. I am still exploring this area.
Thank you