samvanity

3 comments by samvanity

It's also important to keep MoE models in mind when you expand the compatibility of PowerInfer. The ceiling for consumer-grade GPUs is around 3_0 for an 8x7b so if...

Uninstall your PyTorch first: !pip uninstall torch torchvision torchaudio -y and then reinstall it using the install command from the official PyTorch website.
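After the uninstall and reinstall steps above, it can help to confirm which versions actually ended up in the environment. The snippet below is a minimal sketch using only the Python standard library; the helper names `installed_version` and `report` are hypothetical and not part of PyTorch or pip.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision", "torchaudio")):
    """Map each package name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Run once after uninstalling (expect "not installed") and again after reinstalling.
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

In a notebook, the same check can be run in a cell by calling report() directly. If a package still shows a version right after the uninstall, restarting the runtime before reinstalling is usually worth trying.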

JiHa, try a different LLM for the compressor, like below:

    from llmlingua import PromptCompressor
    llm_lingua = PromptCompressor("TheBloke/Llama-2-7b-Chat-GPTQ", model_config={"revision": "main"})

You should try the notebook examples first to make sure...