Modify cloudformation template to allow use of EC2 instances as well

Open theoway opened this issue 2 years ago • 3 comments

🚀 Feature

Update the CloudFormation template so that EC2 (on-demand) instances can also be used for the CPU and GPU ComputeEnvironments. Currently, the template is hard-coded to use SPOT instances.

Motivation

When working with Batch, Spot capacity is often unavailable for P instances. This prevents the GPUComputeEnvironment from being deployed and leaves jobs stuck.

Alternatives

Manually update the cloudformation template.

Additional context

See this Gitter thread (I didn't know how to add a chat link here 😊): Screenshot (149)

theoway avatar Sep 01 '22 20:09 theoway

I'd make a PR for this 👍

Just wanna know whether we need separate ComputeEnvironments for EC2 instances that we then add to the job queues: https://github.com/azavea/raster-vision/blob/99be9b788e4f234c48006d404d0a436668050975/cloudformation/template.yml#L365-L373

Or we can take input from the user on whether they want to use EC2 or SPOT and set the compute environment type accordingly: https://github.com/azavea/raster-vision/blob/99be9b788e4f234c48006d404d0a436668050975/cloudformation/template.yml#L300-L309

Let me know how I should resolve this.

theoway avatar Sep 01 '22 20:09 theoway
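For reference, the second option above (one environment whose type is chosen by the user) could be sketched roughly as follows. This is not taken from the actual template: the parameter name `ComputeType`, the condition `UseSpot`, the referenced resources (`BatchInstanceProfile`, `SpotFleetRole`, `SubnetIds`, `BatchSecurityGroup`), and all numeric values are hypothetical placeholders.

```yaml
Parameters:
  # Hypothetical parameter; the real template may name or structure this differently.
  ComputeType:
    Type: String
    Default: SPOT
    AllowedValues: [EC2, SPOT]
    Description: Provisioning model for the Batch compute environments

Conditions:
  UseSpot: !Equals [!Ref ComputeType, SPOT]

Resources:
  GpuComputeEnvironment:
    Type: AWS::Batch::ComputeEnvironment
    Properties:
      Type: MANAGED
      ComputeResources:
        Type: !Ref ComputeType
        # Spot-only settings are omitted when on-demand (EC2) is selected.
        BidPercentage: !If [UseSpot, 60, !Ref "AWS::NoValue"]
        SpotIamFleetRole: !If [UseSpot, !Ref SpotFleetRole, !Ref "AWS::NoValue"]
        MinvCpus: 0
        MaxvCpus: 16                              # placeholder
        InstanceTypes: [p3.2xlarge]               # placeholder
        InstanceRole: !Ref BatchInstanceProfile   # assumed to exist elsewhere
        Subnets: !Ref SubnetIds                   # assumed list-typed parameter
        SecurityGroupIds: [!Ref BatchSecurityGroup]
```

With this shape the user gets either Spot or on-demand for the whole stack, not a fallback from one to the other, which is the main trade-off against the separate-environment option.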

Thanks for looking into this. I think defining a new compute environment and adding it to the existing job queue makes sense.

CC @lewfish

AdeelH avatar Sep 06 '22 07:09 AdeelH

> Thanks for looking into this. I think defining a new compute environment and adding it to the existing job queue makes sense.

I agree. It looks like the ComputeEnvironmentOrder field can be used to place the on-demand environment at a lower priority.

lewfish avatar Sep 07 '22 14:09 lewfish
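A minimal sketch of the approach agreed on in this thread: a second, on-demand compute environment attached to the existing job queue after the Spot one. The resource names (`GpuSpotComputeEnvironment`, `GpuOnDemandComputeEnvironment`, `GpuJobQueue`), the referenced resources, and the instance type and vCPU limits are assumptions for illustration and may not match the actual template.

```yaml
Resources:
  # Hypothetical on-demand counterpart of the existing Spot GPU environment.
  GpuOnDemandComputeEnvironment:
    Type: AWS::Batch::ComputeEnvironment
    Properties:
      Type: MANAGED
      ComputeResources:
        Type: EC2                                 # on-demand; the existing environment uses SPOT
        MinvCpus: 0
        MaxvCpus: 16                              # placeholder
        InstanceTypes:
          - p3.2xlarge                            # placeholder
        InstanceRole: !Ref BatchInstanceProfile   # assumed to exist elsewhere
        Subnets: !Ref SubnetIds                   # assumed list-typed parameter
        SecurityGroupIds:
          - !Ref BatchSecurityGroup               # assumed to exist elsewhere

  GpuJobQueue:
    Type: AWS::Batch::JobQueue
    Properties:
      Priority: 1
      State: ENABLED
      ComputeEnvironmentOrder:
        # Batch tries environments in ascending Order, so Spot is preferred.
        - Order: 1
          ComputeEnvironment: !Ref GpuSpotComputeEnvironment
        - Order: 2
          ComputeEnvironment: !Ref GpuOnDemandComputeEnvironment
```

The same pattern would apply to the CPU job queue. Exactly when Batch moves on to the second environment depends on its scheduler, so the fallback behavior is worth verifying against a real Spot shortage before relying on it.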