FactScore Inference Fails with KeyError: 'original_splitted_sentences'
Hello, thanks for your amazing work!
I have a question about a KeyError: 'original_splitted_sentences' that I encountered when generating results for FactScore.
Error
When I run run_long_form_static.py for FactScore using the command shown under "Run inference using pre-retrieved passages" in README.md, I get:
KeyError: 'original_splitted_sentences'
The error originates from the following line:
"cat": item["cat"], "intermediate": intermediate["original_splitted_sentences"][0]})
This error seems to be the same as issue #76. However, since that issue was retracted, I am reposting it here.
Culprit?
The error occurs when do_retrieve == False, and the culprit seems to be:
if do_retrieve is False:
    ...
    prediction_tree = {}
    return preds[0], prediction_tree
This branch always returns prediction_tree = {}, so the later lookup of 'original_splitted_sentences' on the returned value raises the KeyError.
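The failure can be reproduced in isolation with a minimal sketch. This is not the repository's code: generate_without_retrieval, build_record, build_record_safe, and the "output" field are hypothetical names used only for illustration, and the fallback in build_record_safe is one possible guard, not a confirmed fix.

```python
def generate_without_retrieval(prompt):
    """Hypothetical stand-in for the no-retrieval branch quoted above."""
    preds = [f"answer to: {prompt}"]
    prediction_tree = {}  # always empty on this path
    return preds[0], prediction_tree


def build_record(item, pred, intermediate):
    """Same lookup as the failing line; raises KeyError on an empty dict."""
    return {
        "output": pred,  # hypothetical field, not taken from the repository
        "cat": item["cat"],
        "intermediate": intermediate["original_splitted_sentences"][0],
    }


def build_record_safe(item, pred, intermediate):
    """Assumed guard: fall back to the prediction when the key is absent."""
    sentences = intermediate.get("original_splitted_sentences") or [pred]
    return {"output": pred, "cat": item["cat"], "intermediate": sentences[0]}


if __name__ == "__main__":
    item = {"cat": "example"}
    pred, tree = generate_without_retrieval("Tell me a bio of X")
    try:
        build_record(item, pred, tree)
    except KeyError as err:
        print("KeyError:", err)
    print(build_record_safe(item, pred, tree))
```

Whether falling back to the raw prediction is appropriate for FactScore output is an open question; the sketch only shows that an empty dictionary on the no-retrieval path is enough to trigger the error.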
Another issue: always no retrieval
While investigating this error, I also found that no retrieval occurs unless --mode always_retrieve is passed (i.e., do_retrieve is always False, even with adaptive_retrieval or default). As a result, when I run run_long_form_static.py with the same flags specified in README.md, execution always takes the if do_retrieve is False path, which causes the error above.
Adding the --mode always_retrieve flag resolves the error, but I'm not sure whether it was accidentally omitted from the command in the instructions.
I'm also not sure that do_retrieve always being False is the expected behaviour here; it seems not to be.
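For reference, one way the three modes might be expected to differ is sketched below. This is a hypothetical illustration, not the repository's logic: the function decide_retrieval, its parameters, the "[Retrieval]" marker check, and the 0.2 threshold are all assumptions made for the example.

```python
def decide_retrieval(mode, generated_text="", p_retrieve=0.0,
                     p_no_retrieve=1.0, threshold=0.2):
    """Return True if retrieval should happen (hypothetical decision rule)."""
    if mode == "always_retrieve":
        # Retrieval is forced regardless of what the model predicts.
        return True
    if mode == "adaptive_retrieval":
        # Assumed rule: compare the normalized retrieval probability
        # against a threshold.
        total = p_retrieve + p_no_retrieve
        return total > 0 and (p_retrieve / total) > threshold
    if mode == "default":
        # Assumed rule: retrieve only if the model emitted a retrieval marker.
        return "[Retrieval]" in generated_text
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(decide_retrieval("always_retrieve"))
    print(decide_retrieval("adaptive_retrieval", p_retrieve=0.5, p_no_retrieve=0.5))
    print(decide_retrieval("default", generated_text="Some answer."))
```

Under rules like these, adaptive_retrieval and default would return True for at least some inputs, which is why do_retrieve being False on every example looks suspicious.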
Questions
Q1. Is the --mode always_retrieve flag missing from the command instructions for FactScore, or is the command correct and the cause of the error lies elsewhere?
Q2. With mode == "adaptive_retrieval" and mode == "default", execution appears to always take the do_retrieve == False path. Is this the expected behavior?
Thanks!