LLMLingua
[Question]: reproducing LongLLMLingua on the LongBench dataset.
Describe the issue
Thank you for your work.
I tried to reproduce LongLLMLingua on the LongBench dataset.
The notebook at https://github.com/microsoft/LLMLingua/blob/main/examples/Code.ipynb appears to reproduce one of the LongBench datasets, repobench-p.
I have two questions.
- In the paper, you said you did not use the reordering strategy, but I think this notebook does use it. When reproducing the results, should I apply the reordering strategy for each dataset?
- Can I apply the same parameters as in this notebook to all LongBench datasets? If each dataset uses different parameters, could you share the parameters for each dataset?
Thank you!
Hi @junepark1, apologies for the late response.
- Yes, you need to disable the reranker for now. We will update the results with the reranker enabled in the future.
- You can use the same parameters for all tasks. For other LongBench-related logic, you can refer to https://github.com/microsoft/LLMLingua/blob/main/experiments/llmlingua2/evaluation/eval_longbench.py
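For anyone following along, here is a minimal sketch of what "same parameters for all tasks, reordering disabled" could look like. It assumes the `compress_prompt` keyword arguments used in the LLMLingua examples, with `reorder_context="original"` leaving the context order unchanged and `"sort"` enabling reordering. The numeric values and the helper names `SHARED_KWARGS`, `build_compress_kwargs`, and `compress` are illustrative assumptions, not the authors' confirmed settings, so please check them against Code.ipynb and eval_longbench.py.

```python
from typing import Any, Dict, List

# Settings shared by every LongBench task. The values are illustrative
# assumptions and should be verified against examples/Code.ipynb.
SHARED_KWARGS: Dict[str, Any] = {
    "target_token": 2000,
    "rank_method": "longllmlingua",
    "condition_in_question": "after",
    "condition_compare": True,
    "context_budget": "+100",
    "dynamic_context_compression_ratio": 0.4,
}


def build_compress_kwargs(
    instruction: str, question: str, reorder: bool = False
) -> Dict[str, Any]:
    """Return keyword arguments for compress_prompt (hypothetical helper)."""
    kwargs = dict(SHARED_KWARGS)  # copy so the shared dict is never mutated
    kwargs["instruction"] = instruction
    kwargs["question"] = question
    # "original" keeps the input order; "sort" would reorder the contexts.
    kwargs["reorder_context"] = "sort" if reorder else "original"
    return kwargs


def compress(compressor: Any, contexts: List[str], instruction: str, question: str) -> Any:
    """Call compress_prompt on any object exposing that method, reordering off."""
    return compressor.compress_prompt(
        contexts, **build_compress_kwargs(instruction, question)
    )
```

With the library installed, `compressor` would be an `llmlingua.PromptCompressor` instance, and the same `compress` call would be reused for each LongBench task.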