dspy
dspy copied to clipboard
Extra generations even if all field names are in the completion
Testing with Llama-3 70B, was seeing duplicate generations.
I traced it to this code in predict.py:
for completion in completions:
if all((completion.get(key, "") != "") for key in field_names):
finished_completions.append(completion)
continue
finished_completions.append(
extend_generation(completion, field_names, stage, max_depth, original_example),
)
Not exactly sure why the "if" check fails incorrectly.
I fixed it by doing this:
for completion in completions:
extend_gen = False
for key in field_names:
if key not in completion:
extend_gen = True
break
if extend_gen:
finished_completions.append(
extend_generation(completion, field_names, stage, max_depth, original_example),
)
else:
finished_completions.append(completion)
So the change is sometimes you want a field in the completion without any content?
Ah! thx for the insight. Now I understand why the current code does not work. Let's say you have an input field without content. In this case, the code will incorrectly call extend_generation() even though the output field has been generated.
The "field_names" variable contains ALL the fields, not just the output fields. The question is how to filter template.fields in line 116 to get just the output fields to build the field_names list
Here is an example of a completion which will extend incorrectly ("hint" is an input field which is empty but the output field "answer" is populated)
So... the correct fix looks like below:
for completion in completions:
if all((completion.get(key, "") != "") for key in field_names if key not in example):
finished_completions.append(completion)
continue
finished_completions.append(
extend_generation(completion, field_names, stage, max_depth, original_example),
)
@mikeedjones - does above look good to you? If yes, I can submit a PR
do you want the model to fill hint?
@mikeedjones No. Its an input field that can be empty
My proposed fix above is not enough. When there is a legitimate extend generations scenario but there is an empty input field, the code still gets into a loop and ends with "Max depth exceeded - failed to complete in one pass - increase max_tokens".
The function "get_all_fields_following_missing_field" should really be implemented as "get_all_output_fields_following_missing_output_field".
How to distinguish input vs output fields given an instance of the Example class?? Seems to me you need to pass the Signature instance as an argument all the way down
The work-around for now is to never leave an input field empty.