# llm-structured-output-benchmarks

Benchmark various LLM structured-output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc., on tasks like multi-label classification, named entity recognition, sy...

5 issues in llm-structured-output-benchmarks, sorted by recently updated

Hi there, interesting benchmark. Any chance of adding Pydantic-ai? I would be curious to see how well it performs compared to the others.
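For context, a minimal sketch of what a Pydantic-ai entry might look like, assuming the `Agent`/`result_type` API from early pydantic-ai releases (newer versions rename `result_type` to `output_type` and `result.data` to `result.output`); the model name and schema are placeholders, not part of this benchmark's code:

```python
# Hypothetical sketch of a Pydantic-ai structured-output call; model name and
# schema are placeholders, not part of this benchmark's code.
from pydantic import BaseModel
from pydantic_ai import Agent


class Classification(BaseModel):
    labels: list[str]


agent = Agent("openai:gpt-4o-mini", result_type=Classification)

result = agent.run_sync("Classify: 'The battery died after two days.'")
print(result.data.labels)  # parsed, schema-validated labels
```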

Add a framework that generates mock responses using `polyfactory`. Related to #1. Summary by Sourcery: This pull request adds a new framework, PolyfactoryFramework, which generates mock responses using the...
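A rough idea of how polyfactory can fabricate schema-conforming mock responses without calling any LLM (sketch only; `ClassificationResponse` is a placeholder model, not the benchmark's actual task schema):

```python
# Sketch: using polyfactory to fabricate instances of a response schema
# without calling any LLM. ClassificationResponse is a placeholder model.
from pydantic import BaseModel
from polyfactory.factories.pydantic_factory import ModelFactory


class ClassificationResponse(BaseModel):
    labels: list[str]


class ClassificationResponseFactory(ModelFactory[ClassificationResponse]):
    __model__ = ClassificationResponse


# Every call returns a new, randomly populated but schema-valid instance.
mock = ClassificationResponseFactory.build()
print(mock.labels)
```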

Summary by Sourcery: Add the FormatronFramework to the project, enabling new tasks like multilabel classification and synthetic data generation with specific model configurations. Update the configuration file to include...

In order to have an NER model that is simpler for internal regex/CFG representations, add an NER variant that requires all fields and does not include a default value. In...
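As a rough illustration of the requested variant (field and model names here are hypothetical, not the benchmark's actual schema), the difference is whether fields carry defaults:

```python
# Sketch of the two NER schema shapes being discussed; names are hypothetical.
# The "strict" variant has no optional fields and no defaults, which keeps the
# equivalent regex/CFG grammar simpler.
from pydantic import BaseModel


class Entity(BaseModel):
    text: str
    label: str


class NerResponseWithDefaults(BaseModel):
    # Has a default: a constrained-decoding grammar must also encode
    # "this key may be absent", which complicates regex/CFG generation.
    entities: list[Entity] = []


class NerResponseStrict(BaseModel):
    # Required, no default: every response must emit the key, so the
    # grammar is a fixed structure.
    entities: list[Entity]
```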

Hi, it's nice to come across a cross-library/model benchmark like this! When looking at evaluations for structured output libraries, I feel like "valid response" is such a low bar when...
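One way to go beyond validity (sketch only, with a hypothetical single-label task and gold labels) is to score the parsed content against references, e.g. exact-match accuracy alongside the parse-success rate:

```python
# Sketch: scoring parsed outputs against gold labels instead of only checking
# that they parse. Task shape and data are hypothetical.
from pydantic import BaseModel, ValidationError


class Prediction(BaseModel):
    label: str


def score(raw_outputs: list[str], gold_labels: list[str]) -> dict[str, float]:
    valid = correct = 0
    for raw, gold in zip(raw_outputs, gold_labels):
        try:
            pred = Prediction.model_validate_json(raw)
        except ValidationError:
            continue  # invalid responses count against both metrics
        valid += 1
        correct += pred.label == gold
    n = len(gold_labels)
    return {"validity": valid / n, "accuracy": correct / n}


print(score(['{"label": "positive"}', "not json"], ["positive", "negative"]))
```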