
Llama instead of GPT

jwahnn opened this issue 1 year ago • 3 comments

Just a few questions about using LLMLingua.

  1. How do I adjust the code so that I am using Llama instead of GPT?
  2. The reason I am using Llama instead of GPT is because I don't want my data to be sent to any other company's server. Using Llama, is my prompt or data being sent to some server?

jwahnn avatar Jan 23 '24 03:01 jwahnn

Hi @jwahnn,

Thank you for your support of LLMLingua. You can use the current code as-is to compress prompts and then feed the compressed output into LLaMA. Experiments in LongLLMLingua have shown that even open-source models such as LongChat-13B can understand compressed prompts effectively.
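To make this concrete, the pipeline described above (compress locally, then hand the compressed prompt to a locally hosted LLaMA) might look like the sketch below. It assumes `llmlingua` and `transformers` are installed; the downstream model name is illustrative, and the `compress_prompt` arguments follow the project README.

```python
# Sketch: compress a long prompt with LLMLingua, then run the result
# through a locally hosted LLaMA model. Nothing leaves the machine:
# model weights are fetched once from the Hugging Face Hub (or loaded
# from disk) and all inference happens locally.
def compress_then_answer(context: list[str], question: str) -> str:
    from llmlingua import PromptCompressor  # pip install llmlingua
    from transformers import pipeline       # pip install transformers

    # The small local model performs the compression.
    compressor = PromptCompressor()  # default model; runs locally
    compressed = compressor.compress_prompt(
        context=context,
        question=question,
        target_token=200,  # rough token budget for the compressed prompt
    )["compressed_prompt"]

    # A local LLaMA (name here is illustrative) consumes the compressed prompt.
    generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")
    return generator(compressed, max_new_tokens=128)[0]["generated_text"]
```

Because both the compressor and the generator run in-process, this setup addresses the privacy concern: no prompt text is sent to an external API.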

Your concern makes sense. The compression process runs entirely locally and does not send your prompt or data to any server.

iofu728 avatar Jan 23 '24 08:01 iofu728

Hi @iofu728, thanks for the input. Just a few more follow-up questions, though. I am slightly confused about the description on the main GitHub page (https://github.com/microsoft/LLMLingua#2-using-longllmlingua-for-prompt-compression). Do I just run the file that contains those lines of code? Also, your demo says "Using the LLaMA2-7B as a small language model would result in a significant performance improvement, especially at high compression ratios." Does the current version on GitHub use LLaMA-2, then?

jwahnn avatar Jan 23 '24 08:01 jwahnn

Hi @jwahnn,

The link at https://github.com/microsoft/LLMLingua#2-using-longllmlingua-for-prompt-compression is just a quick start guide on how to use our library. For more detailed information and examples, please refer to our documentation and the examples section.

Regarding your second question: yes, the current version uses LLaMA-2-7B by default, and you can switch to a different model by passing the model_name parameter. For example, to use Phi-2 instead:

from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(model_name="microsoft/phi-2")
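A hedged sketch of a full call with that compressor follows; it assumes `llmlingua` is installed, the `compress_prompt` arguments follow the project README, and the context and question strings are made up for illustration.

```python
# Sketch: swap in a different small model (here Phi-2) for compression
# and return the compressed prompt, ready for any downstream LLM.
def compress_with_phi2(long_context: str, question: str) -> str:
    from llmlingua import PromptCompressor  # pip install llmlingua

    llm_lingua = PromptCompressor(model_name="microsoft/phi-2")
    result = llm_lingua.compress_prompt(
        context=[long_context],  # one or more context strings
        question=question,
        target_token=300,  # desired size of the compressed prompt
    )
    return result["compressed_prompt"]  # pass this to your local LLaMA
```

The only change needed to try a different compressor is the model_name argument; the rest of the call stays the same.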

iofu728 avatar Jan 24 '24 09:01 iofu728