[Question]: LongBench BM25 reproduce

Open JUNE515 opened this issue 1 year ago • 3 comments

Describe the issue

I'm interested in your LongLLMLingua results on LongBench. I tried to reproduce the LongBench BM25 baseline under the 2,000-token constraint using ChatGPT. Unlike the results in your paper, the performance I get is much higher: the trec task score is 72.5, and most of the other tasks are also high. I would like to know how you produced the BM25 result. Below are the parameters I used to reproduce BM25; I'd appreciate it if you could tell me which ones differ from yours. I use the same split and parameters for the other tasks (only q_format and first_inst change, following the original LongBench config).

Thank you

first_inst = "Please determine the type of the question below. Here are some examples of questions."
q_format = "{input}"
question = q_format.format(input=input)
instruction = first_inst
contexts_list = df['ctxs'][i].split("\n")
contexts_list = [
    "\n".join(contexts_list[ii : ii + 4])
    for ii in range(0, len(contexts_list), 4)
]
compressed_prompt = llm_lingua.compress_prompt(
    contexts_list,
    instruction=instruction,
    question=question,
    target_token=1800,
    condition_compare=True,
    condition_in_question="after",
    rank_method="bm25",
    use_sentence_level_filter=False,
    use_token_level_filter=False,
    context_budget="+100",
    dynamic_context_compression_ratio=0.4,  # enable dynamic_context_compression_ratio
)
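For anyone comparing setups, the context-splitting step in the snippet above can be checked on its own, since the number and size of the chunks passed to the BM25 ranker depend on it. This is a minimal sketch, not LLMLingua code: `chunk_lines` and the `sample` string are hypothetical names introduced only for illustration, and the group size of 4 is taken from the snippet.

```python
def chunk_lines(text: str, lines_per_chunk: int = 4) -> list[str]:
    """Split text on newlines and rejoin it into groups of `lines_per_chunk` lines."""
    lines = text.split("\n")
    return [
        "\n".join(lines[start : start + lines_per_chunk])
        for start in range(0, len(lines), lines_per_chunk)
    ]


# Hypothetical stand-in for one row of df['ctxs']: ten short lines.
sample = "\n".join(f"line {n}" for n in range(1, 11))
chunks = chunk_lines(sample)

# Ten lines in groups of four give chunks of 4, 4, and 2 lines.
for index, chunk in enumerate(chunks):
    print(index, chunk.count("\n") + 1)
```

Because the last chunk can be shorter than the others, the size of the units being ranked is not uniform, which may be worth keeping in mind when comparing against a setup that splits contexts differently.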

May 30 '24 15:05 JUNE515