Remove unreferenced arguments from S3 endpoint ruleset
None of these arguments are referenced by the ruleset, but the set of arguments is used for caching endpoint resolution results via https://github.com/boto/botocore/blob/7ddc232fff4297fc19f89a70ce95014beb2d77bf/botocore/endpoint_provider.py#L707-L708.
This means that the entire endpoint ruleset is evaluated (only to end up with the same result) for every S3-related request with a different Key, IOW for every file stored in S3. Since evaluating the rule tree is pretty slow, this causes significant slowdowns in mass operations, not to mention memory bloat for the (useless) cache.
The three removed parameters are not in the equivalent JS SDK code either.
Refs #3528.
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 93.13%. Comparing base (
6300f6b) to head (0f6e3cb). Report is 460 commits behind head on develop.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## develop #3529 +/- ##
===========================================
+ Coverage 92.94% 93.13% +0.18%
===========================================
Files 66 68 +2
Lines 14956 15324 +368
===========================================
+ Hits 13901 14272 +371
+ Misses 1055 1052 -3
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Hi @akx,
Thank you for the PR. Unfortunately those models are owned upstream by the service team and can be updated at any time. Even if we were to merge this today, if the service team makes any unrelated changes to their endpoints tomorrow, this file would just be overwritten and the keys would come back.
While these aren't used in ruleset evaluation for botocore, my current understanding is that they are used by other AWS products and cannot be removed from the upstream ruleset. I am still working on gathering more specifics.
Thank you for calling this out - it doesn't make sense for us to slow down evaluation for botocore users based on parameters we don't use. We should either come up with an alternate solution here that causes unused keys to be ignored for caching purposes or we should find a way to remove these upstream.
@SamRemis Hi, thanks for looking into this!
Yeah, I figured this isn't the correct way to fix this given these lines had been added by an automated bot commit anyway, but I wanted to get the ball rolling.
We should either come up with an alternate solution here that causes unused keys to be ignored for caching purposes [...]
As mentioned above, the JavaScript SDK has a better set of cache keys (clients/client-s3/src/endpoint/endpointResolver.ts added in https://github.com/aws/aws-sdk-js-v3/pull/6449/commits/4be028b96da4c70d0012159b17a7281cf6d4c3af, "chore: codegen sync").
It looks like the smithy-typescript-codegen codegen code checks for this exact eventuality and elides parameters that aren't used by the ruleset for purposes of cache keying. (As it happens, I wrote basically similar code earlier to prepare for this PR and based on it, it looks like that these 3 are apparently the only parameters unused across all of Botocore.)
One somewhat simple option could be to add some tribal knowledge to the resolver in botocore for this particular use: iff the ruleset is for S3, and the original set of parameters in the ruleset is this known one, then remove the three unused args before passing to caching/evaluation? Also probably a test that would fail if the S3 ruleset's parameters change, so maintainers would know to take action to adjust the code.
Another more labor-intensive option I can think of that would probably be for the best in the longer run is to also codegen ruleset evaluation into Python, instead of having some Python code evaluate the JSON at runtime.
I see the endpoint rule set got updated recently, but AISI, no changes to the unused arguments.