gptme fix: remove system outputs from tool examples in system prompt

Open • ErikBjare opened this issue 11 months ago • 1 comment

I noticed that even good models like Sonnet hallucinated System outputs when run in non-streaming mode (this might be fixed by #306).

I'm not sure this is actually better; ideally it should be evaluated with evals.

TODO:

- [ ] Run tests in non-streaming mode to make sure it's not broken.
- [ ] Run evals in non-streaming mode.

[!IMPORTANT] Enhance example cleaning by adding strip_system parameter to remove system messages in get_examples and clean_example.

Behavior:

- `get_examples` in `base.py` now accepts a `strip_system` parameter to remove system messages from examples.
- `clean_example` in `__init__.py` now handles the `strip_system` parameter, filtering out system messages and their content.

Functions:

- `get_tool_prompt` in `base.py` calls `get_examples` with `strip_system=True` to exclude system messages from prompts.

Misc:

- Minor refactoring in `get_examples` and `clean_example` to support the new functionality.
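The filtering summarized above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `clean_example_sketch`, the `> Role:` line prefix, and the sample transcript are all assumptions made for illustration. It only shows the general idea of dropping each System message together with the content lines that follow it.

```python
import re

# Assumed (hypothetical) layout: each message starts with "> Role:" and any
# following lines without such a prefix belong to that message.
ROLE_RE = re.compile(r"^> (User|Assistant|System):")


def clean_example_sketch(example: str, strip_system: bool = False) -> str:
    """Return the example, optionally without System messages and their content."""
    if not strip_system:
        return example
    kept: list[str] = []
    in_system = False
    for line in example.splitlines():
        match = ROLE_RE.match(line)
        if match:
            # A role line starts a new message; skip it only if it is a System message.
            in_system = match.group(1) == "System"
        if not in_system:
            kept.append(line)
    return "\n".join(kept)


# Made-up transcript used only to exercise the sketch.
EXAMPLE = "\n".join(
    [
        "> User: list the files",
        "> Assistant: Running ls.",
        "> System: Ran command: ls",
        "README.md",
        "> Assistant: There is one file.",
    ]
)

cleaned = clean_example_sketch(EXAMPLE, strip_system=True)
print(cleaned)
```

With `strip_system=True`, the System line and the `README.md` output line are dropped while the User and Assistant turns are kept. With the default, the example is returned unchanged.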

This description was created for 0b1d259cc145b92e92c24c57f7e2fdb17df7abfb. It will automatically update as commits are pushed.

Dec 10 '24 22:12 ErikBjare

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 72.94%. Comparing base (65efc3e) to head (0b1d259).

:white_check_mark: All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #320      +/-   ##
==========================================
+ Coverage   72.49%   72.94%   +0.45%     
==========================================
  Files          67       67              
  Lines        4912     4931      +19     
==========================================
+ Hits         3561     3597      +36     
+ Misses       1351     1334      -17

| Flag | Coverage Δ | |
|---|---|---|
| anthropic/claude-3-haiku-20240307 | `71.02% <100.00%> (+0.47%)` | :arrow_up: |
| openai/gpt-4o-mini | `69.51% <100.00%> (+0.11%)` | :arrow_up: |

Flags with carried forward coverage won't be shown.

Dec 10 '24 22:12 codecov-commenter
