
Llama instead of GPT

jwahnn opened this issue 1 year ago • 3 comments

Just a few questions about using LLMLingua.

  1. How do I adjust the code so that I am using Llama instead of GPT?
  2. The reason I am using Llama instead of GPT is because I don't want my data to be sent to any other company's server. Using Llama, is my prompt or data being sent to some server?

jwahnn avatar Jan 23 '24 03:01 jwahnn

Hi @jwahnn,

Thank you for your support of LLMLingua. You can use the current code as-is to compress prompts and then feed the compressed output into LLaMA. Experiments in LongLLMLingua have shown that even open-source models such as LongChat-13B can understand compressed prompts effectively.
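To make this concrete, the pipeline described above (compress locally, then hand the compressed prompt to a locally hosted LLaMA) might look like the sketch below. It assumes `llmlingua` and `transformers` are installed; the downstream model name is illustrative, and the `compress_prompt` arguments follow the project README.

```python
# Sketch: compress a long prompt with LLMLingua, then run the result
# through a locally hosted LLaMA model. Nothing leaves the machine:
# model weights are fetched once from the Hugging Face Hub (or loaded
# from disk) and all inference happens locally.
def compress_then_answer(context: list[str], question: str) -> str:
    from llmlingua import PromptCompressor  # pip install llmlingua
    from transformers import pipeline       # pip install transformers

    # The small local model performs the compression.
    compressor = PromptCompressor()  # default model; runs locally
    compressed = compressor.compress_prompt(
        context=context,
        question=question,
        target_token=200,  # rough token budget for the compressed prompt
    )["compressed_prompt"]

    # A local LLaMA (name here is illustrative) consumes the compressed prompt.
    generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")
    return generator(compressed, max_new_tokens=128)[0]["generated_text"]
```

Because both the compressor and the generator run in-process, this setup addresses the privacy concern: no prompt text is sent to an external API.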

Your concern makes sense. The compression process runs entirely locally and does not send your prompt or data to any server.

iofu728 avatar Jan 23 '24 08:01 iofu728

Hi @iofu728, thanks for the input. Just a few more follow-up questions, though. I am slightly confused about the description on the main GitHub page (https://github.com/microsoft/LLMLingua#2-using-longllmlingua-for-prompt-compression). Do I just run the file that contains those lines of code? Also, your demo says "Using the LLaMA2-7B as a small language model would result in a significant performance improvement, especially at high compression ratios." Does the current version on GitHub use LLaMA-2, then?

jwahnn avatar Jan 23 '24 08:01 jwahnn

Hi @jwahnn,

The link at https://github.com/microsoft/LLMLingua#2-using-longllmlingua-for-prompt-compression is just a quick start guide on how to use our library. For more detailed information and examples, please refer to our documentation and the examples section.

Regarding your second question: yes, the current version uses LLaMA-2-7B by default, and you can switch to a different model by passing the model_name parameter. For example, to use Phi-2 instead:

from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(model_name="microsoft/phi-2")
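A hedged sketch of a full call with that compressor follows; it assumes `llmlingua` is installed, the `compress_prompt` arguments follow the project README, and the context and question strings are made up for illustration.

```python
# Sketch: swap in a different small model (here Phi-2) for compression
# and return the compressed prompt, ready for any downstream LLM.
def compress_with_phi2(long_context: str, question: str) -> str:
    from llmlingua import PromptCompressor  # pip install llmlingua

    llm_lingua = PromptCompressor(model_name="microsoft/phi-2")
    result = llm_lingua.compress_prompt(
        context=[long_context],  # one or more context strings
        question=question,
        target_token=300,  # desired size of the compressed prompt
    )
    return result["compressed_prompt"]  # pass this to your local LLaMA
```

The only change needed to try a different compressor is the model_name argument; the rest of the call stays the same.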

iofu728 avatar Jan 24 '24 09:01 iofu728