feat(GuEc2App): Remove `enabledDetailedInstanceMonitoring` optional property
What does this change?
Removes the enabledDetailedInstanceMonitoring property from the GuEc2App pattern. Monitoring is now always enabled, providing EC2 metrics at a higher granularity of one minute (default is five minutes). This should allow for earlier horizontal scaling and provide more detail during incident triage.
[!NOTE]
- This change will cost roughly $3 per instance per month. We might want to consider disabling it for non-production environments.
- These are not the same as metrics from the CloudWatch Agent, which services might also be using.
This behaviour can be changed via an escape hatch:
import type { CfnLaunchTemplate } from "aws-cdk-lib/aws-ec2";
declare const myApp: GuEc2App;
const launchTemplate = myApp.autoScalingGroup.instanceLaunchTemplate.node.defaultChild as CfnLaunchTemplate;
// Disable enhanced monitoring
launchTemplate.addPropertyOverride("LaunchTemplateData.Monitoring.Enabled", false);
// OR remove the property entirely
launchTemplate.addPropertyDeletionOverride("LaunchTemplateData.Monitoring.Enabled");
See also:
- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/manage-detailed-monitoring.html
- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html
How to test
See updated tests.
How can we measure success?
Our services can horizontally scale earlier.
Have we considered potential risks?
This has a direct cost implication. At the time of writing, we have ~630 running instances across all accounts. At ~$3 per instance per month, this'll be a total of ~$23,000[^1]. If we only enable this on production instances, this reduces to ~$15k.
Checklist
- [ ] I have listed any breaking changes, along with a migration path [^2]
- [ ] I have updated the documentation as required for the described changes [^3]
[^1]: 630 * 3 * 12 [^2]: Consider whether this is something that will mean changes to projects that have already been migrated, or to the CDK CLI tool. If changes are required, consider adding a checklist here and/or linking to related PRs. [^3]: If you are adding a new construct or pattern, has new documentation been added? If you are amending defaults or changing behaviour, are the existing docs still valid?
🦋 Changeset detected
Latest commit: d1ee03ac0da3551638a3037b23b391f9cf877291
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 1 package
| Name | Type |
|---|---|
| @guardian/cdk | Minor |
Not sure what this means? Click here to learn what changesets are.
Click here if you're a maintainer who wants to add another changeset to this PR
I am supportive of the additional cost for PROD instance, and so would like to be added as default on PROD if that possible.
@akash1810 - I took this out of review. Best of luck with the deployment.
I am supportive of the additional cost for
PRODinstance, and so would like to be added as default onPRODif that possible.
GuCDK is intentionally stage agnostic (ADR here), therefore we cannot implement the "default on PROD" behaviour unfortunately. In an attempt to provide an accessible API and DX, I've created a new mandatory property. Usage could look something like this:
const stage = props.stage;
const instanceMetricGranularity = stage === "PROD" ? "1Minute" ? "5Minute";
const myApp = new GuEc2App(this, "my-app", {
instanceMetricGranularity,
});
NOTE: An alternative API would to make this property mandatory if the ASG has a scaling policy attached to it. To do this, we'd need to perform a check similar to this - https://github.com/guardian/cdk/blob/de6d9a9f59ba333271fa4174a1c1afc9be748e0f/src/experimental/patterns/ec2-app.ts#L138-L152.