hydra icon indicating copy to clipboard operation
hydra copied to clipboard

IAC for AWS Batch Infra

Open coder46 opened this issue 3 years ago • 0 comments

Create terraform scripts to launch aws batch infra for training. Import components

  1. Separate queues and compute environments for CPU vs GPU workloads
  2. Use launch templates to attach additional disk storage to training instances
  3. Alerting for jobs queuing up in Batch states
  4. Slack alerting configuration
  5. FIne grained IAM perms

coder46 avatar Mar 01 '21 07:03 coder46