[Question]: ZeroScrolls reproduction

Open cornzz opened this issue 1 year ago • 0 comments

The publicly available validation split contains only a small number of samples (around 20 per task), and ground truths are not available for the much larger test split. Were the results in the LLMLingua paper obtained on the validation split, or did you submit your predictions to https://www.zero.scrolls-benchmark.com/submission?
The ZeroScrolls dataset provides the inputs already formatted inside the prompt templates. Did you compress the inputs from the dataset as is, or did you extract the contexts from the inputs and compress only the contexts, keeping the prompt templates intact?
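To make the two options concrete, here is a minimal sketch of what I mean. It assumes each sample exposes `input`, `document_start_index` and `document_end_index` fields marking where the context sits inside the formatted prompt (my reading of the dataset; please correct me if the fields differ). `compress` is a hypothetical stand-in for whatever compressor is used, not the actual LLMLingua API.

```python
from typing import Any, Callable, Dict


def compress_as_is(sample: Dict[str, Any], compress: Callable[[str], str]) -> str:
    """Option 1: compress the whole formatted prompt, template included."""
    return compress(sample["input"])


def compress_context_only(sample: Dict[str, Any], compress: Callable[[str], str]) -> str:
    """Option 2: compress only the context and keep the template intact."""
    text = sample["input"]
    # Assumed fields: character offsets of the context within `input`.
    start = sample["document_start_index"]
    end = sample["document_end_index"]
    prefix, context, suffix = text[:start], text[start:end], text[end:]
    return prefix + compress(context) + suffix


if __name__ == "__main__":
    # Toy sample and toy compressor (keeps only the first word), for illustration.
    prompt = "Summarize:\n\nAAAA BBBB CCCC\n\nSummary:"
    start = prompt.index("AAAA")
    toy_sample = {
        "input": prompt,
        "document_start_index": start,
        "document_end_index": start + len("AAAA BBBB CCCC"),
    }

    def keep_first_word(s: str) -> str:
        return s.split()[0]

    print(repr(compress_as_is(toy_sample, keep_first_word)))
    print(repr(compress_context_only(toy_sample, keep_first_word)))
```

With option 1 the instruction and answer cue in the template can be dropped by the compressor, while option 2 leaves them untouched, so the two could plausibly give different scores.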

Oct 01 '24 20:10 cornzz