
Utilize InstanceFleet features in EMR Cluster Mgmt

Open nateiam opened this issue 7 years ago • 0 comments

As a Herd/DM Cluster Mgmt User I want to use new EMR InstanceFleet features so I can manage spot instances more effectively.

Teams want to take advantage of the following Instance Fleet features: Weighted Capacity, Defined Duration, and Provisioning Timeout, as described in the AWS [documentation|http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-instance-fleet.html].
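For reference, here is a rough sketch of what one fleet entry could look like, written as a Python dict that mirrors the field names of the AWS EMR `InstanceFleetConfig` structure. The instance types, capacities, and durations are made-up illustration values, not a proposal for Herd's actual schema. `WeightedCapacity` covers the weighted capacity feature, `BlockDurationMinutes` the defined duration, and `TimeoutDurationMinutes` with `TimeoutAction` the provisioning timeout.

```python
# Illustrative only: values below are assumptions, not part of this story.
core_fleet = {
    "Name": "core-fleet",
    "InstanceFleetType": "CORE",
    # Target is expressed in capacity units, not in instance counts.
    "TargetSpotCapacity": 8,
    "InstanceTypeConfigs": [
        # Each instance of this type counts as 4 units toward the target.
        {
            "InstanceType": "m4.xlarge",
            "WeightedCapacity": 4,
            "BidPriceAsPercentageOfOnDemandPrice": 100,
        },
        # Each instance of this type counts as 8 units toward the target.
        {
            "InstanceType": "m4.2xlarge",
            "WeightedCapacity": 8,
            "BidPriceAsPercentageOfOnDemandPrice": 100,
        },
    ],
    "LaunchSpecifications": {
        "SpotSpecification": {
            # Provisioning timeout: how long to wait for Spot capacity.
            "TimeoutDurationMinutes": 20,
            # What to do when the timeout expires.
            "TimeoutAction": "SWITCH_TO_ON_DEMAND",
            # Defined duration (Spot block) in minutes.
            "BlockDurationMinutes": 60,
        }
    },
}
```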

This story adds an InstanceFleet section to Herd's existing [EMR Cluster Definition|https://wiki.finra.org/display/DataManagement/EMR+Cluster+Definition]

Acceptance Criteria

  • New optional element exists in Cluster Definition for InstanceFleet data
    • InstanceFleet data includes all information in the instance-fleet json as described in the AWS documentation linked above
  • All InstanceFleet information gets passed to AWS when:
    • User includes InstanceFleet information in a Cluster Definition
    • User includes InstanceFleet information as an override when launching a cluster
  • Existing EMR Cluster Activiti wrapper is modified to handle InstanceFleet
  • Return 40x error if user includes both InstanceFleets and InstanceDefinitions information
    • If an override produces a definition that includes both InstanceFleet and pre-existing InstanceDefinitions elements, return 40x error stating that only one of the two may be included
  • Return 40x if EMR version does not support Instance Fleet
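The two 40x criteria above could be checked along these lines. This is a minimal sketch, not Herd's implementation: the key names `instanceFleets` and `instanceDefinitions`, the `ClusterDefinitionError` class, and the shallow override merge are all hypothetical. The version rule (4.8.0 and later, excluding 5.0.x) is my reading of the AWS documentation and should be confirmed before it is relied on.

```python
class ClusterDefinitionError(ValueError):
    """Hypothetical error that the service layer would map to a 400 response."""


def parse_release(label):
    """Turn 'emr-5.4.0' or '5.4.0' into a tuple such as (5, 4, 0)."""
    if label.startswith("emr-"):
        label = label[len("emr-"):]
    return tuple(int(part) for part in label.split("."))


def supports_instance_fleets(release_label):
    # Assumption taken from the AWS docs: 4.8.0 and later, excluding 5.0.x.
    version = parse_release(release_label)
    return version >= (4, 8, 0) and version[:2] != (5, 0)


def apply_override(definition, override):
    # Shallow merge: top-level keys in the override win.
    merged = dict(definition)
    merged.update(override)
    return merged


def validate(definition, release_label):
    has_fleets = bool(definition.get("instanceFleets"))
    has_definitions = bool(definition.get("instanceDefinitions"))
    if has_fleets and has_definitions:
        raise ClusterDefinitionError(
            "Include only one of instanceFleets or instanceDefinitions."
        )
    if has_fleets and not supports_instance_fleets(release_label):
        raise ClusterDefinitionError(
            "EMR release %s does not support instance fleets." % release_label
        )
```

Running `validate` on the result of `apply_override` (rather than on the override alone) is what catches the case where a stored definition has InstanceDefinitions and the override adds InstanceFleet.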

Test Notes

  • Proposed testing approach
    • Validate all inputs get passed according to AWS spec when launching EMR
    • New smoke test for happy path - but it has to be something that is predictable/reliable
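One predictable way to cover the "inputs get passed" check without launching a real cluster is to stub the EMR client and inspect the request. The sketch below assumes a hypothetical `launch_cluster` helper and uses the boto3-style `run_job_flow` keyword arguments (`Name`, `ReleaseLabel`, `Instances` with an `InstanceFleets` list). Herd's real code path is Java and will differ, and a real request needs more fields (roles, for example) than are shown here.

```python
from unittest.mock import Mock


def launch_cluster(emr_client, name, release_label, instance_fleets):
    """Hypothetical launcher that forwards fleet configs untouched."""
    return emr_client.run_job_flow(
        Name=name,
        ReleaseLabel=release_label,
        Instances={"InstanceFleets": instance_fleets},
    )


def check_fleets_are_forwarded_unchanged():
    fleets = [
        {
            "InstanceFleetType": "MASTER",
            "TargetOnDemandCapacity": 1,
            "InstanceTypeConfigs": [{"InstanceType": "m4.xlarge"}],
        }
    ]
    # Stubbed client: no AWS call is made, so the result is deterministic.
    client = Mock()
    client.run_job_flow.return_value = {"JobFlowId": "j-TEST"}

    result = launch_cluster(client, "smoke", "emr-5.4.0", fleets)

    # Inspect the keyword arguments the stub received.
    _, kwargs = client.run_job_flow.call_args
    assert kwargs["Instances"]["InstanceFleets"] == fleets
    assert kwargs["ReleaseLabel"] == "emr-5.4.0"
    return result


check_fleets_are_forwarded_unchanged()
```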

nateiam, May 13 '17 20:05