
Unable to run Phi 3 and 3.5 Vision on Mac (M1 & M3 chips): "Placeholder storage has not been allocated on MPS device!"

Harsha0056 opened this issue 11 months ago · 1 comment

I initially used example code from Hugging Face and tested it successfully on Google Colab with GPU compute. However, when I ran the same code on my local system, keeping device_map="auto" for the model and moving the inputs to "mps", I got the following error:

"Placeholder storage has not been allocated on MPS device!"

Interestingly, I tested the same setup with the Qwen Vision model, which also used "mps", and it utilized the GPU without any issues.

Could this error indicate that the Phi 3 or Phi 3.5 Vision models are not supported on macOS GPUs? Any suggestions for fixing this issue?

Below is the code from the Hugging Face model card itself.

################Code#################

from PIL import Image
import requests
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"

# device_map="auto" lets the library choose where to place the model weights
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True,
    torch_dtype="auto", _attn_implementation='eager'
)

processor = AutoProcessor.from_pretrained(
    model_id, trust_remote_code=True, num_crops=4
)

# Download a sample image
images = []
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/0052a70beed5bf71b92610a43a52df6d286cd5f3/diffusers/rabbit.jpg"
images.append(Image.open(requests.get(url, stream=True).raw))

messages = [
    {"role": "user", "content": "<|image_1|>\nSummarize what you see in this image."}
]

prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Move the prepared inputs to the MPS device
inputs = processor(prompt, images, return_tensors="pt").to("mps")

generation_args = {
    "max_new_tokens": 1000,
    "temperature": 0.0,
    "do_sample": False,
}

generate_ids = model.generate(
    **inputs, eos_token_id=processor.tokenizer.eos_token_id, **generation_args
)

# Strip the prompt tokens before decoding the response
generate_ids = generate_ids[:, inputs['input_ids'].shape[1]:]
response = processor.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]

print(response)

Harsha0056 · Jan 10 '25

Here are a few suggestions to help you troubleshoot and potentially resolve this issue:

Check PyTorch and Transformers Versions: Ensure that you're using compatible versions of PyTorch and the Hugging Face Transformers library. Sometimes, updating to the latest versions can resolve compatibility issues.
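For a quick check, something like the following prints the installed versions so you can compare them against the requirements on the model card (a minimal sketch, nothing Phi-specific):

import torch
import transformers

# Versions to compare against the model card's requirements
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)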

Verify MPS Setup: Make sure that MPS is correctly set up and available on your system. You can check this by running a simple MPS test script, as sketched below, to confirm that MPS is functioning properly; see https://discuss.pytorch.org/t/runtimeerror-placeholder-storage-has-not-been-allocated-on-mps-device/193258
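A minimal smoke test could look like this (assuming PyTorch 1.12+ with MPS support built in):

import torch

# Confirm the MPS backend is built into this PyTorch install and usable
print("MPS built:", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

# Allocate a tensor directly on the MPS device and run a simple op
x = torch.ones(3, device="mps")
print((x * 2).sum().item())  # expected: 6.0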

Use Apple MLX Framework: If you're on Apple Silicon, consider using the Apple MLX framework to accelerate models like Phi 3; MLX is designed to optimize performance on Apple Silicon devices. See https://techcommunity.microsoft.com/blog/azuredevcommunityblog/accelerate-phi-3-use-on-macos-a-beginners-guide-to-using-apple-mlx-framework/4174656
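As a rough sketch: the text-only Phi 3 variants can be run with the mlx-lm package, while the vision variants would need a vision-capable MLX port. The model repo name below is a community conversion and is assumed, not verified here:

# pip install mlx-lm
from mlx_lm import load, generate

# Load a community-converted Phi 3 model (assumed repo name)
model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-4bit")

response = generate(
    model, tokenizer,
    prompt="Summarize the benefits of running models locally.",
    max_tokens=200,
)
print(response)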

Try CPU Instead of MPS: As a temporary workaround, you can try running the model on the CPU instead of MPS to see if the issue persists. This can help determine if the problem is specific to MPS.
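A minimal sketch of that fallback, reusing model_id, processor, prompt, and images from the code above (fp32 on CPU is an assumption for safety; expect it to be slow):

import torch
from transformers import AutoModelForCausalLM

# Load the model explicitly on the CPU instead of device_map="auto"
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cpu", trust_remote_code=True,
    torch_dtype=torch.float32,  # fp32 is the safest dtype on CPU
    _attn_implementation='eager'
)

# Keep the inputs on the CPU as well (no .to("mps"))
inputs = processor(prompt, images, return_tensors="pt")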

Consult Hugging Face Community: If the issue persists, consider reaching out to the Hugging Face community or checking their forums for similar issues and potential solutions; see https://discuss.huggingface.co/t/runtimeerror-placeholder-storage-has-not-been-allocated-on-mps-device/42999

leestott · Jan 21 '25