mu
AWS Batch support via mu?
- What do you think about adding AWS Batch support into mu?
- What do you think about Batch compute env being wrapped into mu env?
Example of Batch compute environment, Batch queue and Batch job definitions:
environments:
  - name: dev
    ...
    # AWS Batch compute environment
    batchComputeEnv:
      - name: myBatchEnv
        type: managed | unmanaged
        serviceRole: # ECS IAM role
        instanceRole:
        keyName: # EC2 key pair name
        # compute resources
        provisionModel: onDemand | spot
        allowedInstances: optimal | <c4.large> ... # "optimal" chooses the best fit of the M4, C4, and R4 instance types available in the region.
        minimumVCPUs:
        desiredVCPUs:
        maximumVCPUs:
        imageId: # AMI id
        # networking
        # allow using the environment's vpcTarget settings (by omitting vpcId and instanceSubnetIds here)
        vpcId:
        instanceSubnetIds:
        securityGroups:
    # Batch queues
    # Job queues with a higher integer value for priority are given preference for compute resources.
    # Jobs are submitted to the connected compute environments based on the order they are listed and the available capacity of those environments.
    batchQueues:
      - name: priority1
        priority: 1
        computeEnvs:
          - myBatchEnv
      - name: priority5
        priority: 5
        computeEnvs:
          - myBatchEnv
      - name: priority10
        priority: 10
        computeEnvs:
          - myBatchEnv
    # AWS Batch job definitions
    batchJobs:
      - name: myBatchJob1
        jobRole: # ECS IAM role
        containerImage: amazonlinux
        command: # (optional)
        vCPUs: 2
        memory: 100
        attempts: 1
        execTimeout: 100 # Time (in seconds) to allow each job attempt to run. If a job runs longer than this, it is stopped and moved to FAILED.
        uLimits:
          - name: CORE
            soft: 10
            hard: 80
        parameters:
          param1: value1
        environment:
          envvar1: val1
        # security
        privileged: true
        user: nobody
        # volumes
        volumes:
          volname: sourcepath
        readOnlyFilesystem: false
        mountPoints:
          - containerPath: '/srv/www'
            sourcePath: '/opt/build'
            readOnly: false
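The config comments above note that queues with a higher integer priority value are preferred for compute resources. A minimal Go sketch of that ordering rule, using hypothetical type and function names (not mu's actual implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// BatchQueue mirrors the batchQueues entries in the example config.
// The struct and field names here are illustrative, not mu's actual schema.
type BatchQueue struct {
	Name        string   `yaml:"name"`
	Priority    int      `yaml:"priority"`
	ComputeEnvs []string `yaml:"computeEnvs"`
}

// byPreference returns a copy of the queues ordered so that
// higher-priority queues come first, matching AWS Batch's
// preference when handing out compute resources.
func byPreference(queues []BatchQueue) []BatchQueue {
	sorted := append([]BatchQueue(nil), queues...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	return sorted
}

func main() {
	// The three example queues all share the same compute environment.
	queues := []BatchQueue{
		{Name: "priority1", Priority: 1, ComputeEnvs: []string{"myBatchEnv"}},
		{Name: "priority5", Priority: 5, ComputeEnvs: []string{"myBatchEnv"}},
		{Name: "priority10", Priority: 10, ComputeEnvs: []string{"myBatchEnv"}},
	}
	for _, q := range byPreference(queues) {
		fmt.Println(q.Name, q.Priority)
	}
	// prints priority10, then priority5, then priority1
}
```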
Created the following POC:
- Batch compute environments and job queues handled via a mu extension
- Batch pipeline and job definition handled via a new 'batch' entity in mu.yml
- job definition deployment is done via mu batch deploy
Here is the code for items 2 and 3: https://github.com/stelligent/mu/compare/develop...AndreyMarchuk:feature/batch-job?expand=1
Now the question is: would it be better handled as Service.Provider = batch?
I think having Service.Provider = batch would be ideal, as it avoids a lot of duplication. How plausible would it be to implement?
POC for Service.Provider = batch
- currently usable as is (Batch env provisioned via an extension)
- Batch deployments (job definition registration) do not depend on an existing environment
- could be a starting point for deployments without a load balancer
- definitely fewer changes and less duplication compared to an independent batch entity implementation
- todo: support all Batch job definition params via mu.yml
- todo: ability to provision a Batch environment using Environment.Provider = batch
- todo: tests
https://github.com/stelligent/mu/compare/develop...AndreyMarchuk:batch-as-service-provider?expand=1
Looking good! Curious, what is this for?
ProviderOverride string `yaml:"provider,omitempty"`
https://github.com/stelligent/mu/compare/develop...AndreyMarchuk:batch-as-service-provider?expand=1#diff-5db7db86b937470e53496a6ce29a1d3dR160
Currently mu tries to fetch the environment stack to get the Provider from the environment. ProviderOverride allows specifying the Provider at the Service level so that the environment stack does not have to exist; Batch job definition registration does not depend on an environment.
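Based on that description, the resolution order could look like the sketch below. Only the ProviderOverride field comes from the linked diff; resolveProvider and the lookup callback are hypothetical names for illustration:

```go
package main

import "fmt"

// Service carries the optional provider override from mu.yml,
// as in the linked diff.
type Service struct {
	ProviderOverride string `yaml:"provider,omitempty"`
}

// resolveProvider prefers the service-level override; only when it is
// empty does it fall back to the provider recorded on the environment
// stack (looked up lazily, since a Batch deployment may have no
// environment stack at all).
func resolveProvider(svc Service, lookupEnvProvider func() (string, error)) (string, error) {
	if svc.ProviderOverride != "" {
		return svc.ProviderOverride, nil
	}
	return lookupEnvProvider()
}

func main() {
	svc := Service{ProviderOverride: "batch"}
	// The env stack lookup is never invoked when the override is set,
	// so a missing stack is not an error here.
	p, _ := resolveProvider(svc, func() (string, error) {
		return "", fmt.Errorf("env stack does not exist")
	})
	fmt.Println(p) // batch
}
```

This also illustrates the point below: with the override set, the service is treated as batch regardless of what provider the target environment reports.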
service:
  name: my-batch-job
  # deployed as an AWS Batch job
  provider: batch
It also forces the service to be treated as a Batch job even if the user mistakenly deploys it onto a non-Batch environment (e.g. ecs, ec2).