LLMLingua
[Feature Request]: Instructions on how to use the Metal backend (Apple Silicon)
Is your feature request related to a problem? Please describe.
The current defaults only work with CUDA. It would be great to be able to run it on other platforms, like Metal, too. Is this already possible?
Describe the solution you'd like
Automatic selection based on the available backends; as inspiration, see https://github.com/pmeier/light-the-torch
Alternatively, an option to set the backend manually.
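For reference, a minimal sketch of what such automatic selection could look like, using only PyTorch's standard availability checks (the helper name `pick_device` is hypothetical, not part of LLMLingua):

```python
import torch


def pick_device() -> str:
    """Hypothetical helper: pick the best available torch backend.

    Preference order: CUDA (NVIDIA GPUs), then MPS (Apple Silicon),
    then CPU as the universal fallback.
    """
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older torch versions
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The result could then be passed as `device_map=pick_device()`, which would also cover the manual-override case by simply bypassing the helper.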
Additional context
No response
To run it on Apple Silicon hardware, just pass device_map="mps" when instantiating the PromptCompressor.
Works, thanks a lot!
Here's an example for the impatient:
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,  # whether to use LLMLingua-2
    device_map="mps",  # run on the Apple Silicon GPU via Metal
)

compressed_prompt = llm_lingua.compress_prompt(
    "Give me some efficient python code for the classical mandelbrot fractal",
    rate=0.5,
    force_tokens=["\n", "?"],  # tokens to always keep in the output
)
print(compressed_prompt["compressed_prompt"])