
autogen compressible agent integration

Open · yenif opened this issue 2 years ago · 1 comment

Hello!

I put LLMLingua into AutoGen as part of a compressible agent: https://github.com/microsoft/autogen/pull/1005

It's basically functional, but too slow on my MacBook with Llama 2 to really test.

I figured I'd try phi-2, but it didn't return past_key_values; I have no clue whether that's a dead end or fixable :-)

Would appreciate any input on effectively using LLMLingua as the compressor for GPT agents.

Thanks!

yenif avatar Dec 18 '23 07:12 yenif

Hello @yenif,

Thank you for your help and support. We agree that agent scenarios like AutoGen are well-suited for approaches like LLMLingua to reduce token redundancy. However, there might be new issues that need to be addressed.

The reason phi-x models cannot be invoked directly is that their custom modeling code does not follow HuggingFace's standard `AutoModelForCausalLM` interface (see https://huggingface.co/microsoft/phi-2/blob/main/modeling_phi.py#L960), which leaves LLMLingua unable to use the kv-cache. One solution could be to rebuild the phi-x modeling code within that framework so that `past_key_values` is returned.
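As a quick sanity check, whether a model exposes the kv-cache LLMLingua needs can be probed by looking for `past_key_values` in its forward output. A minimal sketch, assuming the `transformers` library is installed and using a tiny illustrative checkpoint (`sshleifer/tiny-gpt2`) as a stand-in; substitute the model you actually want to test:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Tiny illustrative checkpoint; swap in e.g. "microsoft/phi-2" to test it.
model_name = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, use_cache=True)

# LLMLingua relies on this cache; None here means the model is incompatible.
has_kv_cache = getattr(out, "past_key_values", None) is not None
print(has_kv_cache)
```

If this prints `False` for a given checkpoint, the model's forward pass is not returning a cache, and the custom modeling code would need to be adapted before LLMLingua can drive it.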

iofu728 avatar Dec 18 '23 10:12 iofu728