Mark Sze
Another hand up for this - I'm trying to utilise LiteLLM+Ollama function calling with AutoGen. I noticed that with one function it appears to populate the top-level name correctly, but...
> @marklysze Just added cache support, let me know what you think of the implementation

Wow, that was fast! I'll check it out :) much appreciated.
Just a question, not related specifically to text compression but to TransformMessages: is it possible to allow passing in a single transform as well as the dictionary of...
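One common way to support both call styles is to normalize the argument in the constructor. The sketch below is just an illustration of that pattern, not AutoGen's actual code; the `MessageTransform` protocol and `TransformMessages` class here are simplified stand-ins for the real classes:

```python
from typing import Protocol, Union


class MessageTransform(Protocol):
    """Stand-in for AutoGen's message-transform interface (assumed shape)."""

    def apply_transform(self, messages: list) -> list: ...


class TransformMessages:
    def __init__(self, *, transforms: Union[MessageTransform, list]):
        # Accept either a single transform or a list of them by
        # normalizing to a list here, so the rest of the code only
        # ever deals with one shape.
        if not isinstance(transforms, list):
            transforms = [transforms]
        self._transforms = transforms

    def apply(self, messages: list) -> list:
        for transform in self._transforms:
            messages = transform.apply_transform(messages)
        return messages
```

With this normalization, `TransformMessages(transforms=my_transform)` and `TransformMessages(transforms=[my_transform])` behave identically.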
When using LLMLingua, is there any way to suppress the warning: `Token indices sequence length is longer than the specified maximum sequence length for this model (521 > 512). Running...
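That warning comes from the Hugging Face tokenizer, which reports through the standard `logging` machinery under the `transformers` logger hierarchy, so one way to suppress it is to raise that logger's level before compression runs. A minimal stdlib-only sketch (the `"transformers"` logger name is an assumption about how current releases log; `transformers.logging.set_verbosity_error()` is an equivalent knob if the library is imported directly):

```python
import logging

# The "Token indices sequence length..." message is emitted at WARNING
# level on the "transformers" logger hierarchy. Raising that logger's
# level to ERROR silences it without touching other libraries' logging.
logging.getLogger("transformers").setLevel(logging.ERROR)
```

Note this hides all WARNING-level messages from `transformers`, not just this one, so it is best applied narrowly (e.g. only around the compression call).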
What would be the best way to avoid text compression on certain messages? E.g. for a debating scenario group chat, I added the TextCompression to the select speaker (auto) functionality...
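Pending a built-in option for this, one workable pattern is to wrap the compression transform in a filter that passes flagged messages through untouched. This is only a sketch of the idea; the `no_compress` key and the inner transform's `apply_transform` interface are assumptions for illustration, not AutoGen API:

```python
class SkipFlaggedCompression:
    """Apply an inner compression transform only to messages that have not
    opted out (e.g. select-speaker prompts you want kept verbatim)."""

    def __init__(self, inner, skip_key="no_compress"):
        self._inner = inner        # the real text-compression transform
        self._skip_key = skip_key  # hypothetical per-message opt-out flag

    def apply_transform(self, messages):
        out = []
        for msg in messages:
            if msg.get(self._skip_key):
                out.append(msg)  # flagged: pass through untouched
            else:
                # Compress one message at a time so flagged messages
                # never reach the inner transform.
                out.extend(self._inner.apply_transform([msg]))
        return out
```

Messages to protect would then carry `{"content": ..., "no_compress": True}` while everything else is compressed as usual.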
> @marklysze The compressed text might not make sense to humans but it makes sense to an llm, at least that's what the research behind [LLMLingua](https://github.com/microsoft/LLMLingua) suggests. You can always...
> @marklysze Also just fyi, you can add custom instructions to llmlingua: > > https://github.com/microsoft/LLMLingua/blob/40ac969a82f162b3eb0b8e1f1416756d442e4eec/llmlingua/prompt_compressor.py#L424-L427 > > Which you could specify as an option in `compression_args` in the constructor of...
Just noticed that if a message content is empty, then I don't think it should check the cache or compress. So could we also check here if the content is...
I just updated to 0.1.23 and tried to pull a model and it started off at full speed but near the end it slowed down to a crawl (as it...