
How do I make a model use mps?

Open · entityJY opened this issue 2 years ago · 6 comments

I'm working on a Mac with an M2 chip. I've installed ctransformers with Metal support, and am setting up the model as below. However, when I check which device the model is using, it outputs cpu. [Screenshot (Sep 12, 2023): checking the model's device prints cpu.] Am I not setting up the model to use mps properly?
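The setup in the screenshot is not recoverable, but it was along these lines (a sketch reconstructed from the rest of this thread; the exact model name is an assumption, borrowed from a later comment):

```python
from ctransformers import AutoModelForCausalLM

# Sketch of the reported setup (model name assumed, taken from a later
# comment in this thread; the screenshot's exact code is lost).
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-32K-Instruct-GGUF",
    model_file="llama-2-7b-32k-instruct.Q5_K_S.gguf",
    model_type="llama",
    hf=True,        # wrap in a Hugging Face-compatible interface
    gpu_layers=50,  # request Metal offload for 50 layers
)
print(model.device)  # reportedly prints device(type='cpu')
```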

entityJY avatar Sep 13 '23 05:09 entityJY

I did this on an M1, but I didn't use hf=True. Did you run any tests, and did you install PyTorch using the Metal instructions? I checked, and by default installing ctransformers with mps support does not install PyTorch.

wheynelau avatar Sep 13 '23 07:09 wheynelau

> I did this on an M1, but I didn't use hf=True. Did you run any tests, and did you install PyTorch using the Metal instructions? I checked, and by default installing ctransformers with mps support does not install PyTorch.

To use the tokenizer, hf has to be True. Also, I've installed PyTorch with mps support and have verified it with `print(torch.backends.mps.is_available())`.
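For reference, the check as stated, plus `torch.backends.mps.is_built()` as a complementary check (an addition beyond what the comment shows):

```python
import torch

# Both should print True on an MPS-enabled PyTorch build running on Apple Silicon.
print(torch.backends.mps.is_built())      # PyTorch was compiled with MPS support
print(torch.backends.mps.is_available())  # the MPS device is usable at runtime
```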

entityJY avatar Sep 13 '23 07:09 entityJY

I might have made a mistake; you are right.

Anyway, I checked the loading code for Hugging Face models, and it doesn't seem to move anything to a device. Perhaps we can wait for an answer from one of the developers.
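For comparison, with a stock Hugging Face transformers model the move to mps is an explicit step, which is the step that appears to be missing in the ctransformers HF wrapper (a sketch using plain transformers, not ctransformers code):

```python
import torch
from transformers import AutoModelForCausalLM

# With stock transformers, device placement is an explicit, user-driven step.
model = AutoModelForCausalLM.from_pretrained("gpt2")
if torch.backends.mps.is_available():
    model = model.to("mps")
print(model.device)  # device(type='mps') on Apple Silicon
```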

wheynelau avatar Sep 13 '23 08:09 wheynelau

Update on this?

netneko21 avatar Sep 29 '23 14:09 netneko21

Same issue. Any updates?

pabvald avatar Oct 27 '23 13:10 pabvald

Hi, I have the same issue. I can load the pipeline on 'mps', but I can't load the model on 'mps', only on the CPU. I followed these steps:

  1. I installed the ctransformers library like this:

     ```sh
     CT_METAL=1 pip install ctransformers --no-binary ctransformers
     ```

  2. I tried to load the model on 'mps', but with no results. I tried this:

     ```python
     MODEL_NAME = "TheBloke/Llama-2-7B-32K-Instruct-GGUF"
     llama_model = AutoModelForCausalLM.from_pretrained(
         MODEL_NAME,
         model_file="llama-2-7b-32k-instruct.Q5_K_S.gguf",
         model_type="llama",
         hf=True,
         gpu_layers=50,
     )
     ```

But if I check `llama_model.device`, I get device(type='cpu'). And even if I do `llama_model = llama_model.to('mps')` and check again, I still get device(type='cpu'). Any suggestions for fixing this issue, please? Thank you
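For reference, the non-hf loading path that wheynelau reported working on an M1 would look like the sketch below (same model files as above; with a CT_METAL=1 build, Metal offload is controlled entirely by gpu_layers, and there is no torch device to move):

```python
from ctransformers import AutoModelForCausalLM

# Without hf=True this is a plain ctransformers model: there is no torch
# device attribute to inspect and no .to() call to make; gpu_layers alone
# requests Metal offload on a CT_METAL=1 build.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-32K-Instruct-GGUF",
    model_file="llama-2-7b-32k-instruct.Q5_K_S.gguf",
    model_type="llama",
    gpu_layers=50,
)
print(llm("The capital of France is"))
```

The trade-off, as noted earlier in the thread, is that without hf=True you lose the Hugging Face-style tokenizer interface.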

giovanniOfficioso avatar Nov 22 '23 23:11 giovanniOfficioso