
154 comments of Huiqiang Jiang

Hi @LYH-YF, the GSM8K experiment is based on the **GPT-3.5-Turbo-0301 completion** model. Due to recent changes in OpenAI's API, the GPT-3.5-Turbo-0301 completion model is no longer available, but it can...

Hi @kofuya, thanks for your support of our project. Could you give me more context, such as the original prompt?

Hi @deltawi, thank you for your interest in and support of LLMLingua. Currently, since API models do not provide log probabilities for the prompt end, it's challenging to directly support...

> Same need here. I love the concepts of `LLMLingua` and they are super useful for users, however, I do not have the ability to self-host inference for any model...

Thank you @samvanity for the clarification; that's correct. Hi @Avkashhirpara, you can switch the kernel environment using different `device_map` settings by following @samvanity's suggestion. Hi @JiHa-Kim, I think this error might...

Hi @pathquester, thank you for your support of LLMLingua. In the current implementation, the latency of quantization models is not significantly different from that of full-precision models; it might even...

Hi @pathquester, Thanks to the efforts of the community, `phi-2` is now available for use in LLMLingua. Before using it, please update your transformers to the GitHub version by running...

Yeah, you can also try using a GPTQ version such as `TheBloke/phi-2-dpo-GPTQ`.

Hi @pathquester, based on our experience, even GPT2-small can achieve satisfactory results with moderate compression rates.
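To give a rough intuition for why a small model can suffice: LLMLingua-style compression only needs *relative* token informativeness, not generation quality. The sketch below is a hypothetical, dependency-free mock, not the actual LLMLingua implementation — the `toy_surprisal` function stands in for per-token log-probabilities from a small LM such as GPT2-small. It keeps the highest-surprisal tokens until a target compression rate is met.

```python
import math

def toy_surprisal(tokens):
    """Stand-in for a small LM's per-token surprisal (-log p).

    Faked here for illustration: infrequent (and longer) tokens get
    higher scores. A real setup would score tokens with GPT2-small
    log-probabilities instead.
    """
    freq = {}
    for t in tokens:
        freq[t] = freq.get(t, 0) + 1
    n = len(tokens)
    return [-math.log(freq[t] / n) + 0.1 * len(t) for t in tokens]

def compress(tokens, rate=0.5):
    """Keep the ceil(rate * n) most 'informative' tokens, in order."""
    scores = toy_surprisal(tokens)
    keep = max(1, math.ceil(rate * len(tokens)))
    # indices of the top-`keep` scores, restored to original order
    top = sorted(sorted(range(len(tokens)), key=lambda i: -scores[i])[:keep])
    return [tokens[i] for i in top]

prompt = ("the answer to the question is that the model keeps "
          "informative tokens").split()
print(compress(prompt, rate=0.5))
```

At a moderate rate (e.g. 0.5), frequent low-information tokens like "the" are dropped first, which is the behavior even a small scoring model can rank reliably.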

Hi @XiaoFengbing, thank you for your interest in LLMLingua. I'll briefly answer your question: 1. You can consider the control coefficient parameter 'k' defined in the paper as equivalent to...