terraform-aws-github-runner
terraform-aws-github-runner copied to clipboard
fix: Add delay to prevent ssm rate limits if instance count is greater than max throughput
Fix for bug #1841 which prevents hitting AWS SSM rate limits by adding a delay between subsequent putParameter calls.
Note: 25ms delay is based on the AWS service quotas as they only allow 40 requests per second to the parameter store by default.
[Tests]:
- Pre commit checks ✅
- Ran build locally ✅
- Ran linter ✅
- Ran jest tests ✅
Related PR (never got merged): https://github.com/philips-labs/terraform-aws-github-runner/pull/1843
@npalm Looks like the jest step is failing because the coverage is dropping below 92%, however all the tests itself are passing.
Jest: "global" coverage threshold for branches (92%) not met: 91.37%
Guess this is caused since condional delay black is not touched in the test. This requires to add a test with adding 40 runners. Right now not more time to dig in further. Will check later.
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.
@npalm Any chance you can look at this? I tried adding some tests with 40 runners but it didn't help.
Yes will do. But will be after next week.
@GuptaNavdeep1983 would you have tim to have a look on this PR as well?
Any update on this? Looking to upgrade the module version but cannot until we have this PR merged in.
Sorry no updates, bit restricted by time. Sorry.
@devsh4 can you have a look at the recent changes?
Ther refered AWS document only mention GetParameters to be rate limitted. The Lambda only calls putParatameter. I can't see a specific limit for the putParamaters. Hoever I have seen RateLimits as well when we try to scale aggressive. AWS also used a bucket to rate limite. As soon you bucket is empty ou got rate limitted. The moudle makes several AWS calls. PutParameter wil certainly tak a part in this.
But not sure if delaying the SSM put parameter calal will solve the problem. Once solving the proble here, maybe call the GetParamter from the instance start can be the next limit.
AWS service quotas
@npalm yes I believe putParameter call does play a role in hitting rate limits. We are running the forked version of this repo with the initial change I made here and it has been running seamlessly since the past 6 months for spinning up more than ~250 runners at once without hitting rate limits.
AWS service quotas
@npalm yes I believe
putParametercall does play a role in hitting rate limits. We are running the forked version of this repo with the initial change I made here and it has been running seamlessly since the past 6 months for spinning up more than ~250 runners at once without hitting rate limits.
Can you refer to the fork you running here (in case it is OS)?
@npalm Sure forked version is here, it's linked above as well. FYI we are using Linux runners.
@npalm Any update on this?
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.