p4de.24xlarge not supported

Open pschafhalter opened this issue 2 years ago • 2 comments

Hi, I'm trying to run an experiment on a p4de.24xlarge instance on AWS, but I'm unable to launch the instance using SkyPilot. I have access to this instance type on AWS in us-west-2, and I am able to launch it via the AWS web interface.

I tried the following commands and ran into the following errors:

$ sky gpunode --cloud aws --instance-type p4de.24xlarge
ValueError: Invalid instance type 'p4de.24xlarge' for cloud AWS.
$ sky launch --instance-type p4de.24xlarge -c benchmark-a100 skypilot_configs/a100.yaml
W 10-05 17:23:38 resources.py:573] image_id in resources is experimental. It only supports AWS/GCP.
ValueError: Invalid instance type 'p4de.24xlarge' for cloud AWS

p4de.24xlarge also doesn't appear in the output of sky show-gpus --all.

More info on the p4de.24xlarge: https://aws.amazon.com/about-aws/whats-new/2022/05/amazon-ec2-p4de-gpu-instances-ml-training-hpc/

pschafhalter avatar Oct 06 '22 00:10 pschafhalter

It looks like the bug is caused by the service catalog not having this instance type recorded. @WoosukKwon
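
To illustrate the suspected failure mode, here is a minimal sketch of a catalog-backed validity check. It is not SkyPilot's actual code: `CATALOG` and `validate_instance_type` are hypothetical names, and the catalog contents are made up for the example. The assumption is only that an instance type missing from the recorded catalog gets rejected before any request reaches AWS, which would explain the error even though the account can launch the instance directly.

```python
# Hypothetical, simplified stand-in for a per-cloud instance catalog.
# The entries below are illustrative, not SkyPilot's real catalog data.
CATALOG = {
    "AWS": {"p3.2xlarge", "p4d.24xlarge"},
}


def validate_instance_type(cloud: str, instance_type: str) -> None:
    """Raise ValueError if the instance type is not recorded for the cloud."""
    known_types = CATALOG.get(cloud, set())
    if instance_type not in known_types:
        # The check is purely local: it never asks the cloud provider
        # whether the instance type actually exists or is launchable.
        raise ValueError(
            f"Invalid instance type {instance_type!r} for cloud {cloud}."
        )


# A recorded type passes silently.
validate_instance_type("AWS", "p4d.24xlarge")

# An unrecorded type is rejected, regardless of what AWS itself offers.
try:
    validate_instance_type("AWS", "p4de.24xlarge")
except ValueError as err:
    print(err)
```

Under this assumption, the fix would be to add a p4de.24xlarge entry to the catalog rather than to change the launch logic.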

michaelzhiluo avatar Oct 06 '22 00:10 michaelzhiluo

I'm interested in using this instance type as well

parasj avatar Oct 06 '22 06:10 parasj