LLMLingua
use other quant formats
Would it be hard to use exl2 for the same purpose? Or an OpenAI-compatible API?
Hi @zba,
Thank you for your interest in and support of LLMLingua. I don't think there are any blocking issues with using the exl2 format. You can try replacing the code at LLMLingua Prompt Compressor with ExLlamaV2.
For the OpenAI format, you can use the latest API to obtain log probabilities and set max_tokens to 0. This will help you get the log probabilities for the prompt portion.
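To make the OpenAI route a bit more concrete, here is a rough sketch rather than anything taken from LLMLingua itself. It assumes the server exposes the legacy Completions endpoint and accepts `echo=True` together with `max_tokens=0` (not every OpenAI-compatible server does, and some require `max_tokens` of at least 1). The helper names `fetch_prompt_logprobs`, `prompt_token_scores`, and `perplexity` are made up for illustration, and `client` is assumed to be an `openai.OpenAI`-style object you construct yourself.

```python
import math
from typing import List, Optional, Tuple


def fetch_prompt_logprobs(client, model: str, prompt: str):
    """Ask the server to echo the prompt with per-token log probabilities.

    Assumption: `client` follows the openai>=1.0 Python client and the
    backend supports echo on the legacy Completions endpoint.
    """
    resp = client.completions.create(
        model=model,
        prompt=prompt,
        max_tokens=0,   # generate nothing, score the prompt only
        echo=True,      # return the prompt tokens in the response
        logprobs=1,     # include log probabilities
    )
    lp = resp.choices[0].logprobs
    return lp.tokens, lp.token_logprobs


def prompt_token_scores(
    tokens: List[str], token_logprobs: List[Optional[float]]
) -> List[Tuple[str, float]]:
    """Pair each token with its negative log-likelihood.

    The first token usually has no log probability (None) because it
    has no preceding context, so it is skipped.
    """
    return [(t, -lp) for t, lp in zip(tokens, token_logprobs) if lp is not None]


def perplexity(token_logprobs: List[Optional[float]]) -> float:
    """Perplexity over the tokens that have a log probability."""
    vals = [lp for lp in token_logprobs if lp is not None]
    if not vals:
        raise ValueError("no token log probabilities available")
    return math.exp(-sum(vals) / len(vals))
```

The per-token scores or the perplexity could then stand in for the values LLMLingua normally computes from a local model, though wiring that into the compressor is left to you.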