FactualCorrectness() failed to fix bad intermediate output
Sometimes FactualCorrectness() produces invalid intermediate output, for example: "claims": ["the number is 6", "the number is 9", ]
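To make the problem concrete, here is a minimal sketch that checks the example with Python's standard json module. It assumes the trailing comma after the last claim is what makes the output invalid, and it wraps the fragment in braces to form a complete object. The helper name is_valid_json is hypothetical and not part of ragas.

```python
import json

# The example from above, wrapped in braces (assumption: the trailing
# comma before the closing bracket is the defect).
raw_output = '{"claims": ["the number is 6", "the number is 9", ]}'
clean_output = '{"claims": ["the number is 6", "the number is 9"]}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as strict JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print("raw output valid:  ", is_valid_json(raw_output))
    print("clean output valid:", is_valid_json(clean_output))
```

Strict JSON does not allow a comma after the last array element, so a parser that follows the specification rejects the first string.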
It then tries to fix the error.
However, even when it regenerates output that meets the requirements, it still ends in an error.
I debugged with breakpoints and observed that the code that fixes incorrect intermediate output did not seem to work as expected (\ragas\prompt\pydantic_prompt.py, lines 404 to 421).
Starting at line 405, the program attempts to fix the nonconforming output. After it finishes producing fixed_output_string, it should continue to the blue box. Instead, execution jumped back to line 405, and once the number of retries reached the preset limit, it proceeded to line 420.
Eventually it failed, possibly because the output was over-fixed into the wrong format (I'm not sure that is the cause). In any case, once OutputParserException is triggered, the output is never fixed, and the final score is NaN.
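The control flow described above can be sketched in isolation. This is not the ragas code: it is a simplified illustration that assumes the fix step re-enters the same parse routine with one fewer retry, which is one possible reason a debugger would appear to jump back to the start of the block. All names (ParseFailure, parse_with_retries, score_or_nan, strip_trailing_commas, over_fix) are hypothetical, and the "score" is just a claim count used as a placeholder.

```python
import json
import math
import re
from typing import Any, Callable


class ParseFailure(Exception):
    """Hypothetical stand-in for an output-parser exception."""


def parse_with_retries(text: str, fix: Callable[[str], str], retries_left: int = 3) -> Any:
    """Parse JSON; on failure, apply `fix` and re-enter with one fewer retry."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        if retries_left == 0:
            # Retries exhausted: give up, as in the failure described above.
            raise ParseFailure("output could not be repaired")
        # Re-entering the function is what looks like "jumping back".
        return parse_with_retries(fix(text), fix, retries_left - 1)


def score_or_nan(text: str, fix: Callable[[str], str]) -> float:
    """Placeholder scorer: count the claims, or return NaN if parsing fails."""
    try:
        claims = parse_with_retries(text, fix)["claims"]
    except ParseFailure:
        return math.nan
    return float(len(claims))


def strip_trailing_commas(text: str) -> str:
    """A fixer that removes a comma directly before a closing bracket or brace."""
    return re.sub(r",\s*([\]}])", r"\1", text)


def over_fix(text: str) -> str:
    """A fixer that makes things worse, mimicking an 'over-fixed' output."""
    return "claims: " + text


bad = '{"claims": ["the number is 6", "the number is 9", ]}'

if __name__ == "__main__":
    print(score_or_nan(bad, strip_trailing_commas))
    print(score_or_nan(bad, over_fix))
```

With a fixer that actually repairs the text, the first retry succeeds. With a fixer that never produces valid JSON, every retry fails, the exception propagates, and the caller falls back to NaN, which matches the symptom reported here.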
My ragas version is 0.2.14.
Hi @dosu, is there any way to make the output of FactualCorrectness() valid without entering the retry state? I often have cases where the JSON generated on the first attempt is invalid.
To make it more likely that FactualCorrectness() produces valid output on the first attempt, so that it does not enter the retry state, you can modify the prompt used by the FactualCorrectness metric to include an instruction that asks for valid JSON output. Here is how you can do it:
prompt = scorer.get_prompts()["single_turn_prompt"]
prompt.instruction += "\nOnly output valid JSON."
scorer.set_prompts(**{"single_turn_prompt": prompt})
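This modification adds a directive to the prompt's instruction asking for valid JSON, which can help prevent entering the retry state due to invalid JSON [1]. It does not guarantee valid output. Prompt names vary by metric, so if "single_turn_prompt" is not present, check scorer.get_prompts().keys() for the names this metric exposes.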