The meaning of "_w_gs.jsonl" in evaluation data

Open qiweijian opened this issue 1 year ago • 2 comments

Thanks for your incredible work!

I notice that there are four files for short form QA in the eval data folder.

  • popqa_longtail.jsonl
  • popqa_longtail_w_gs.jsonl
  • triviaqa_test.jsonl
  • triviaqa_test_w_gs.jsonl

I am wondering what 'w_gs' means and more specifically,

  1. how the ctxs field is built.
  2. why the ctxs in popqa_longtail.jsonl and popqa_longtail_w_gs.jsonl differ in number and values (some contexts in popqa_longtail_w_gs.jsonl don't have a document id or score).
  3. why triviaqa_test_w_gs_df has only 7313 samples while triviaqa_test.jsonl has 11313, and how it was filtered.
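
For anyone who wants to reproduce these observations, here is a minimal sketch that counts records and contexts in a JSONL file and flags contexts without a document id or score. The key names `ctxs`, `id`, and `score` are assumptions about the schema, the helper names are hypothetical, and the files are assumed to sit in the current directory.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def summarize_ctxs(records, required_keys=("id", "score")):
    """Count records, contexts, and contexts missing any of required_keys.

    Assumption: each record stores its contexts as a list of dicts under
    "ctxs", and document id / score live under "id" / "score".
    """
    per_record = Counter()  # maps "number of contexts" -> "number of records"
    total = 0
    incomplete = 0
    for rec in records:
        ctxs = rec.get("ctxs", [])
        per_record[len(ctxs)] += 1
        total += len(ctxs)
        incomplete += sum(
            1 for c in ctxs if any(k not in c for k in required_keys)
        )
    return {
        "num_records": len(records),
        "num_ctxs": total,
        "ctxs_missing_keys": incomplete,
        "ctxs_per_record": dict(per_record),
    }


if __name__ == "__main__":
    # Assumed local paths; adjust to wherever the eval data was downloaded.
    for name in ("popqa_longtail.jsonl", "popqa_longtail_w_gs.jsonl"):
        path = Path(name)
        if path.exists():
            print(name, summarize_ctxs(load_jsonl(path)))
```

Running the same summary on triviaqa_test.jsonl and triviaqa_test_w_gs.jsonl would show the sample counts side by side.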

qiweijian, Mar 18 '24 05:03

Hi! _gs indicates that the retrieved results are further enhanced by Google Programmable Search, in addition to the original Contriever top 10 documents. We added these new results in our updated manuscript, Section B.2. That's the reason the number of contexts differs. The differing number of instances does look odd, and I may have uploaded an incorrect file... Let me double check!

AkariAsai, Mar 19 '24 21:03

Hi there! I'm so sorry, but where can I find the eval data folder, or how can I generate it?

naknak-choi, Jul 09 '24 11:07