gptme
fix: remove system outputs from tool examples in system prompt
I noticed that even good models like Sonnet hallucinated System outputs when run in non-streaming mode (this might be fixed by #306).
I'm not sure this is actually better; ideally it should be evaluated with evals.
TODO:
- [ ] Run tests in non-streaming mode to make sure it's not broken.
- [ ] Run evals in non-streaming mode.
[!IMPORTANT] Enhance example cleaning by adding a `strip_system` parameter to remove system messages in `get_examples` and `clean_example`.
- Behavior:
  - `get_examples` in `base.py` now accepts a `strip_system` parameter to remove system messages from examples.
  - `clean_example` in `__init__.py` updated to handle the `strip_system` parameter, filtering out system messages and their content.
- Functions:
  - `get_tool_prompt` in `base.py` calls `get_examples` with `strip_system=True` to exclude system messages from prompts.
- Misc:
  - Minor refactoring in `get_examples` and `clean_example` to support new functionality.

This description was generated automatically for commit 0b1d259cc145b92e92c24c57f7e2fdb17df7abfb. It will update as commits are pushed.
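
To make the `strip_system` behavior described above concrete, here is a minimal sketch of one way such filtering could work. It is not the code from this PR: the function names `strip_system_messages` and `clean_example_sketch`, the `User:`/`Assistant:`/`System:` line prefixes, and the rule that a message runs until the next role prefix are all assumptions made for illustration.

```python
import re

# Assumed transcript format: each message starts with a role prefix,
# and continuation lines belong to the most recent message.
ROLE_PREFIX = re.compile(r"^(User|Assistant|System):")


def strip_system_messages(example: str) -> str:
    """Drop every System message, including its continuation lines."""
    kept: list[str] = []
    in_system = False
    for line in example.splitlines():
        match = ROLE_PREFIX.match(line)
        if match:
            # A new message begins; remember whether it is a System one.
            in_system = match.group(1) == "System"
        if not in_system:
            kept.append(line)
    return "\n".join(kept)


def clean_example_sketch(example: str, strip_system: bool = False) -> str:
    """Hypothetical stand-in for a cleaner that takes a strip_system flag."""
    example = example.strip()
    return strip_system_messages(example) if strip_system else example


example = (
    "User: list files\n"
    "Assistant: Sure, running ls.\n"
    "System: Ran command: ls\n"
    "file1.txt\n"
    "User: thanks"
)
print(clean_example_sketch(example, strip_system=True))
```

With `strip_system=True`, the `System:` line and its `file1.txt` continuation are dropped, so the prompt no longer shows the model a system output it might imitate. With the default `strip_system=False`, the example is returned unchanged apart from whitespace trimming.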
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 72.94%. Comparing base (65efc3e) to head (0b1d259).
:white_check_mark: All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## master #320 +/- ##
==========================================
+ Coverage 72.49% 72.94% +0.45%
==========================================
Files 67 67
Lines 4912 4931 +19
==========================================
+ Hits 3561 3597 +36
+ Misses 1351 1334 -17
| Flag | Coverage Δ | |
|---|---|---|
| anthropic/claude-3-haiku-20240307 | 71.02% <100.00%> (+0.47%) | :arrow_up: |
| openai/gpt-4o-mini | 69.51% <100.00%> (+0.11%) | :arrow_up: |
Flags with carried forward coverage won't be shown.