LLMLingua
Support for remote LLM through API
Hi team,
Due to the computing resources needed to run this, it would be nice if you could also add an option where the user can provide a url_endpoint
and an api_key
for a remote REST API, instead of downloading the model from Hugging Face.
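To make the request concrete, here is a minimal sketch of what such an option might send to a remote endpoint. This is purely illustrative: LLMLingua does not currently accept `url_endpoint` or `api_key` parameters, and the payload shape and endpoint path are assumptions, not a real API.

```python
import json

def build_remote_request(prompt, url_endpoint, api_key):
    """Hypothetical sketch: assemble the HTTP request a remote-endpoint
    option might issue instead of loading a local Hugging Face model.
    The field names here are illustrative, not part of LLMLingua."""
    return {
        "url": url_endpoint,  # e.g. a user-hosted inference server
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"prompt": prompt}),
    }
```

A caller would then POST this request with any HTTP client and read the compressed prompt from the response.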
Hi @deltawi, thank you for your interest in and support of LLMLingua.
Currently, since API models do not provide log probabilities on the prompt side, it's challenging to directly support this requirement. However, we will incorporate this need into our future plans.
Refer to issue #44.
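To illustrate why prompt-side log probabilities matter, here is a toy sketch (my own simplification, not LLMLingua's actual algorithm) of perplexity-based compression: given per-token log probabilities from a local causal LM, keep the most surprising tokens and drop the highly predictable ones. Chat-style APIs typically do not return log probabilities for prompt tokens, which is the obstacle mentioned above.

```python
def filter_low_information_tokens(tokens, logprobs, keep_ratio=0.5):
    """Toy sketch: keep the tokens the model finds most surprising
    (lowest log probability), since predictable tokens carry less
    information. Real LLMLingua uses a local causal LM to obtain
    these per-token log probabilities."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Rank token indices by log probability, lowest (most surprising) first.
    ranked = sorted(range(len(tokens)), key=lambda i: logprobs[i])
    keep = set(ranked[:n_keep])
    # Preserve the original token order in the compressed output.
    return [t for i, t in enumerate(tokens) if i in keep]
```

With a local model the log probabilities come from a forward pass over the prompt; with most hosted APIs that signal is simply unavailable.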
Hi @iofu728, is it possible to run a model on a server and point the code to use that model over its API? E.g. run a Llama 2 7B on a server.
Same need here. I love the concepts of LLMLingua, and they are super useful for users. However, I do not have the ability to self-host inference for any model (for many different reasons: cost, know-how, security, capacity, etc.). I use Microsoft Azure AI and Fireworks AI, and they offer models that could apparently be used (small and fast) for LLMLingua. I'd like the ability to use an API for the calls that LLMLingua needs.
Any comments on whether this will make it into the roadmap?
Hi @afbarbaro, we support API mode in Prompt flow; you can refer to this document to use it.