LLMLingua
[Feature Request]: Instructions on how to use the Metal backend (Apple Silicon)
Is your feature request related to a problem? Please describe.
The current defaults only work with CUDA. It would be great to be able to run it on other platforms, like Metal, too. Is this already possible?
Describe the solution you'd like
Automatic selection based on the available backends; as inspiration, see https://github.com/pmeier/light-the-torch
Alternatively, an option to set the backend manually.
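For reference, a minimal sketch of what such automatic selection could look like, using only PyTorch's standard availability checks (the helper name `pick_device` is hypothetical, not part of LLMLingua):

```python
import torch


def pick_device() -> str:
    """Hypothetical helper: pick the best available torch backend.

    Preference order: CUDA (NVIDIA GPUs), then MPS (Apple Silicon),
    then CPU as the universal fallback.
    """
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older torch versions
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The result could then be passed as `device_map=pick_device()`, which would also cover the manual-override case by simply bypassing the helper.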
Additional context
No response
To run it on Apple Silicon hardware, just pass device_map="mps" when instantiating the PromptCompressor.
Works, thanks a lot!
Here's an example for the impatient:
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,  # whether to use LLMLingua-2
    device_map="mps",  # run on the Apple Silicon GPU via Metal
)

compressed_prompt = llm_lingua.compress_prompt(
    "Give me some efficient python code for the classical mandelbrot fractal",
    rate=0.5,
    force_tokens=["\n", "?"],  # tokens to always keep in the output
)
print(compressed_prompt["compressed_prompt"])