LLMLingua [Question]: LLMLingua-2 Sample-Wise Dynamic Compression Ratio

Describe the issue

Hi,

I have two questions:

1. Appendix L of the LLMLingua-2 paper describes letting the compressor adjust the compression rate for different samples, but I cannot find any documentation about this in the git repo, and judging from compress_prompt_llmlingua2() it does not seem to be possible. I also don't quite understand from the explanation in Appendix L how this dynamic compression is supposed to work. Where can I find more details?
2. What is the use_context_level_filter parameter for?

Aug 02 '24 13:08 cornzz

Hi @cornzz, thanks for your interest in LLMLingua.

1. You can find detailed documentation at this link.
2. In Appendix L, the dynamic compression ratio (DCR) is determined by using the compressor's predictions as an indicator for allocating the compression ratio to each sample. However, this feature hasn't been added to the library yet. [ToDo] @pzs19
3. The "use_context_level_filter" parameter controls whether to apply coarse-level prompt compression.
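
To make the allocation idea concrete, here is a minimal sketch of one way a sample-wise ratio could be derived from per-token predictions. It is not the library's implementation (the feature is not in the library yet) and not necessarily the exact procedure from Appendix L. The function name `allocate_keep_rates`, the proportional-share rule, and the input format (one list of "preserve" probabilities per sample) are all assumptions made for illustration.

```python
from typing import List


def allocate_keep_rates(token_probs: List[List[float]], target_rate: float) -> List[float]:
    """Hypothetical helper: split one global token budget across samples.

    token_probs[i][j] is the (assumed) compressor probability that token j of
    sample i should be preserved. Samples whose tokens look more important
    receive a larger share of the overall budget.
    """
    lengths = [len(p) for p in token_probs]
    budget = target_rate * sum(lengths)   # total number of tokens to keep
    mass = [sum(p) for p in token_probs]  # expected "keep" tokens per sample
    total_mass = sum(mass)

    rates = []
    for n, m in zip(lengths, mass):
        if n == 0:
            rates.append(0.0)             # nothing to keep in an empty sample
        elif total_mass == 0:
            rates.append(target_rate)     # no signal: fall back to a uniform rate
        else:
            kept = budget * m / total_mass    # this sample's share of the budget
            rates.append(min(1.0, kept / n))  # cannot keep more than all tokens
    return rates


# Usage: an information-dense sample and a mostly redundant one, 50% kept overall.
probs = [[0.9, 0.9, 0.9, 0.9], [0.1, 0.1, 0.1, 0.1]]
rates = allocate_keep_rates(probs, target_rate=0.5)
```

In this sketch the dense sample ends up with a higher keep rate than the redundant one, while the length-weighted average stays at the requested overall rate (unless a sample is clipped at 1.0, in which case some budget goes unused). Each resulting rate could then be passed as the per-sample rate to the compressor.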

Aug 22 '24 05:08 iofu728

@iofu728 thanks a lot for your response!

Regarding 3: is this what is referred to in the last paragraph of section 4.2 in the paper?

our approach can be readily integrated into the coarse-to-fine framework proposed in LLMLingua (Jiang et al., 2023a), allowing for a higher compression ratio of ∼15x for tasks involving multiple demonstrations or documents.
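
For readers unfamiliar with the coarse-to-fine idea mentioned in that passage, the toy sketch below shows the general shape: first drop whole contexts (coarse level), then drop tokens inside the surviving contexts (fine level). It is an illustration only, not LLMLingua's code. The function `coarse_to_fine`, the use of mean token score to rank contexts, and the two rate parameters are assumptions chosen for the example.

```python
from typing import List


def coarse_to_fine(
    contexts: List[List[str]],
    scores: List[List[float]],
    context_rate: float,
    token_rate: float,
) -> List[List[str]]:
    """Toy two-stage filter: drop whole contexts first, then tokens within survivors.

    contexts[i] is the token list of context i; scores[i][j] is an assumed
    importance score for token j of context i.
    """
    # Coarse stage: rank contexts by mean token score and keep the top fraction.
    n_keep = max(1, round(context_rate * len(contexts)))
    means = [sum(s) / len(s) if s else 0.0 for s in scores]
    ranked = sorted(range(len(contexts)), key=lambda i: means[i], reverse=True)
    kept_ids = sorted(ranked[:n_keep])  # restore the original context order

    # Fine stage: inside each kept context, keep the highest-scoring tokens.
    compressed = []
    for i in kept_ids:
        tokens, sc = contexts[i], scores[i]
        k = max(1, round(token_rate * len(tokens)))
        best = sorted(range(len(tokens)), key=lambda j: sc[j], reverse=True)[:k]
        compressed.append([tokens[j] for j in sorted(best)])  # keep token order
    return compressed


# Usage: 4 contexts of 2 tokens each; keep half the contexts, then half the tokens.
docs = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
doc_scores = [[0.9, 0.1], [0.2, 0.3], [0.8, 0.7], [0.1, 0.1]]
result = coarse_to_fine(docs, doc_scores, context_rate=0.5, token_rate=0.5)
```

Because the two stages multiply, keeping half of the contexts and then half of the tokens leaves a quarter of the input. This compounding is why a coarse filter makes much higher overall compression ratios reachable when the prompt consists of many demonstrations or documents.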

Aug 22 '24 13:08 cornzz