llm-structured-output-benchmarks
Add NER model variant with required fields
To provide an NER model that is simpler to turn into an internal regex/CFG representation, add an NER variant in which every field is required and no field has a default value.
In particular, this makes it possible to evaluate a version of NER for Outlines and provides an additional point of comparison for other libraries.
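A minimal sketch of the difference between the two variants, using only stdlib dataclasses so it runs without dependencies. The class names, the entity fields (`person`, `organization`, `location`) and the `required_fields` helper are all hypothetical; the actual models in `data_sources/data_models.py` may be defined differently.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class NEROptionalSketch:
    # Hypothetical entity types; each may be omitted and falls back to [].
    person: list[str] = field(default_factory=list)
    organization: list[str] = field(default_factory=list)
    location: list[str] = field(default_factory=list)


@dataclass
class NERRequiredSketch:
    # Same hypothetical fields, but none has a default, so all must be given.
    person: list[str]
    organization: list[str]
    location: list[str]


def required_fields(model) -> list[str]:
    """Return the names of fields that have neither a default nor a factory."""
    return [
        f.name
        for f in fields(model)
        if f.default is MISSING and f.default_factory is MISSING
    ]


print(required_fields(NEROptionalSketch))
print(required_fields(NERRequiredSketch))
```

With defaults, a constrained decoder has to accept outputs that omit any subset of the keys. With every field required, there is exactly one key set to match, which is one plausible reason the required-fields variant is easier to express as a regex or grammar.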
Summary by Sourcery
Add a new NER model variant 'ner_required_fields' to support simpler internal representations by requiring all fields without default values. Update configuration and experiment framework to accommodate this new task.
New Features:
- Introduce a new NER model variant called 'ner_required_fields' that requires all fields and does not include default values, facilitating simpler internal regex/CFG representations.
Enhancements:
- Update the configuration to include the 'ner_required_fields' task with specific parameters for model initialization and execution.
- Modify the experiment framework to support the 'ner_required_fields' task, allowing it to be executed alongside existing tasks.
Reviewer's Guide by Sourcery
This pull request introduces a new NER (Named Entity Recognition) model variant with required fields. The changes primarily affect the configuration, base framework, data models, and main execution file. The new variant aims to simplify internal regex/CFG representations and provide an additional point of comparison for other libraries.
File-Level Changes
| Change | Details | Files |
|---|---|---|
| Added a new NER task variant with required fields | | config.yaml, frameworks/base.py, data_sources/data_models.py |
| Updated task lists and conditional logic to include the new NER variant | | frameworks/base.py, main.py |
| Adjusted metric calculation to support the new NER variant | | frameworks/base.py, main.py |
I have no attachment to the name NERRequiredFields; you can treat it as a placeholder.
Some example results (just 1 run instead of 10, on an RTX A5000; I'm not sure what's going on with the LMFormatEnforcer latency):
| Framework | Reliability |
|---|---|
| Outlines | 1.00 |
| Formatron | 0.99 |
| LMFormatEnforcer | 0.98 |

| Framework | Latency_p95 (s) |
|---|---|
| Formatron | 16.950 |
| Outlines | 31.033 |
| LMFormatEnforcer | 45.598 |

| Framework | micro_precision | micro_recall | micro_f1 |
|---|---|---|---|
| Outlines | 0.656250 | 0.546243 | 0.596215 |
| Formatron | 0.762590 | 0.614493 | 0.680578 |
| LMFormatEnforcer | 0.648464 | 0.562130 | 0.602219 |
The LMFormatEnforcer F1 is a lot higher than for the current NER task (my initial expectation was that the required-fields version would lower performance).
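As a quick sanity check on the reported numbers, micro F1 should equal the harmonic mean of micro precision and micro recall. The sketch below recomputes it from the values posted above. The helper name `f1_from_precision_recall` is illustrative, and this is not the repository's metric code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (micro_precision, micro_recall, micro_f1) copied from the results above.
reported = {
    "Outlines": (0.656250, 0.546243, 0.596215),
    "Formatron": (0.762590, 0.614493, 0.680578),
    "LMFormatEnforcer": (0.648464, 0.562130, 0.602219),
}

for framework, (p, r, f1) in reported.items():
    recomputed = f1_from_precision_recall(p, r)
    print(f"{framework:<18} reported={f1:.6f} recomputed={recomputed:.6f}")
```

The recomputed values agree with the reported `micro_f1` column to within rounding, so the three columns are internally consistent for this single run.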