moto
Techdebt: Get regions from SSM instead of Boto3
Fixes #6078
For the vast majority of services, Moto used the botocore.get_available_regions method to determine in which regions a particular service is available.
This method was always a bit flaky - it would take a long time for new services to become available, and every now and then a new release would break the list for an existing service.
AWS' recommendation is to use the SSM parameter store instead, to determine per-region availability - see https://github.com/aws/aws-sdk/issues/206#issuecomment-1471354853
Note that we're duplicating the regions for every service. The alternative would be that every service searches the entire SSM parameter store manually, which has a significant performance impact.
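As a rough illustration of the SSM lookup mentioned above (a sketch only, not Moto's actual implementation): AWS publishes per-service availability as public parameters under /aws/service/global-infrastructure/services/<service>/regions. The helper name regions_for_service and the style of passing in a client are assumptions made for this example.

```python
from typing import Any, List

# Public SSM path under which AWS publishes per-service region availability.
SERVICES_PATH = "/aws/service/global-infrastructure/services"


def regions_for_service(ssm_client: Any, service: str) -> List[str]:
    """Return the sorted region codes that SSM lists for `service`.

    `ssm_client` is expected to behave like boto3.client("ssm").
    The function name is hypothetical, chosen for this sketch.
    """
    paginator = ssm_client.get_paginator("get_parameters_by_path")
    regions = set()
    for page in paginator.paginate(Path=f"{SERVICES_PATH}/{service}/regions"):
        for param in page["Parameters"]:
            # Each parameter's Value is a region code such as "us-east-1".
            regions.add(param["Value"])
    return sorted(regions)


def example_usage() -> List[str]:
    # Needs boto3, network access, and AWS credentials; not called here.
    import boto3

    client = boto3.client("ssm", region_name="us-east-1")
    return regions_for_service(client, "athena")
```

Each call is a paginated network request, which is one reason to generate the data ahead of time and ship it with the package rather than query SSM at runtime.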
Marking this as draft, as this is a fairly large change, and I think I want to make this a minor version release (4.2.0)
Maybe I'm not understanding the issue correctly, but here are some thoughts on this:
- This is only a problem for people using moto in server mode, right? I mean, if they're using Python and botocore, they're not going to be able to create a client for a service in a particular region unless it exists in the get_available_regions method response in the first place.
- If they're using a different SDK (e.g. Java) and they create a client for a service/region that is not listed in get_available_regions for whatever version of botocore is pulled in by moto, then presumably moto will raise an error. But we already have a solution to this in the form of the MOTO_ALLOW_NONEXISTENT_REGION setting, right?
- botocore already provides a method for overriding the default endpoints.json file that ships with the package. You can drop your own JSON file (with whatever regions you want to add/customize) into ~/.aws/models. Seems reasonable for moto to require someone spinning up a moto server to drop in their own endpoints.json file if they're using new/exotic/experimental regions that aren't yet shipping with botocore (or, alternatively, just spin up the server with MOTO_ALLOW_NONEXISTENT_REGION=true).
- I don't see how your proposed solution actually solves what I perceive the core issue to be--namely, that the endpoints.json file that ships with botocore isn't always up-to-date. Unless I'm missing something entirely, this solution seems to be proposing a separate, difficult-to-keep-updated file for every service!
Hey @bpandola, thanks for taking the time to look at this!
Regarding your first point: this is a problem for everyone, because botocore does not validate the region. For example, AMP (Prometheus) is not available in me-central-1, so running this code:
import boto3

# The client is created without any check that AMP exists in this region.
m_east = boto3.client("amp", "me-central-1")
m_east.list_workspaces()
will actually try to connect to a nonexistent URL:
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
[...]
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://aps.me-central-1.amazonaws.com/workspaces"
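Because the client itself does not reject the region, a caller who wants an early failure would have to check it explicitly. A minimal sketch, assuming the list of known regions is supplied by the caller (for example from get_available_regions or from SSM data); the function and exception names are hypothetical:

```python
from typing import Iterable


class UnsupportedRegionError(ValueError):
    """Hypothetical error for a service/region pair that is not known."""


def ensure_region_supported(
    service: str, region: str, known_regions: Iterable[str]
) -> None:
    """Raise early instead of failing later with EndpointConnectionError."""
    if region not in set(known_regions):
        raise UnsupportedRegionError(f"{service} is not available in {region}")


# Example (requires boto3; shown as comments so the sketch stays offline):
#   import boto3
#   known = boto3.session.Session().get_available_regions("amp")
#   ensure_region_supported("amp", "me-central-1", known)
```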
The core issue is the fact that get_available_regions may be deprecated in the future: see https://github.com/aws/aws-sdk/issues/206#issuecomment-1471354853
Most notably:
[T]he Python SDK team is currently investigating paths forward for [get_available_regions()], and deprecation is one possible outcome.
If completeness and recency of information is important for your use, switching to the SSM API is, and has been, the recommended approach.
The fact that endpoints.json is sometimes out of date, incomplete, or removes data between versions is a nuisance - but if/when get_available_regions is no longer available, we will be forced to switch to a different solution anyway. And we may as well find a different solution now, rather than waiting for things to break!
Regarding the "separate, difficult-to-keep-updated file" part: I agree that it's not particularly pretty.
However, the file is automatically generated by running scripts/ssm_get_default_params.py, which runs as a cron job every week - so keeping it in sync is just a matter of merging an automated PR every week.
botocore already provides a method for overriding the default endpoints.json file that ships with the package. You can drop your own JSON file (with whatever regions you want to add/customize) into ~/.aws/models
I didn't know this! However, if get_available_regions is deprecated, I assume that endpoints.json will also be deprecated/removed? Considering there is no region-based validation at runtime, maybe that data is only used to power get_available_regions.
@bblommers I came across this PR by chance and want to share the method we use in Prowler (based on the /aws/service/global-infrastructure/services SSM parameter) to do just this:
- First, with a GHA we run this script https://github.com/prowler-cloud/prowler/blob/master/util/update_aws_services_regions.py on a daily basis.
- Then, the above script generates the following file with all the available regions per AWS service and partition https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/aws_regions_by_service.json
- Finally, we iterate over that list to get the regions for the services we want to audit https://github.com/prowler-cloud/prowler/blob/590a5669d6e7856e087d7f2c5d53514d95b34e03/prowler/providers/aws/aws_provider.py#L178
If you have more questions around that I can help you.
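The single-file approach described above could look roughly like this. The JSON layout (service, then partition, then regions) and all function names are assumptions for illustration, not necessarily the exact format of the Prowler file:

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

RegionsByService = Dict[str, Dict[str, List[str]]]


def partition_of(region: str) -> str:
    """Map a region code to its partition by prefix.

    Only the three public partitions are handled in this sketch.
    """
    if region.startswith("cn-"):
        return "aws-cn"
    if region.startswith("us-gov-"):
        return "aws-us-gov"
    return "aws"


def build_index(pairs: Iterable[Tuple[str, str]]) -> RegionsByService:
    """Group (service, region) pairs into one service -> partition -> regions map."""
    grouped = defaultdict(lambda: defaultdict(set))
    for service, region in pairs:
        grouped[service][partition_of(region)].add(region)
    return {
        service: {part: sorted(regions) for part, regions in sorted(parts.items())}
        for service, parts in sorted(grouped.items())
    }


def regions_for(index: RegionsByService, service: str, partition: str = "aws") -> List[str]:
    """Look up the regions for a service; unknown services yield an empty list."""
    return index.get(service, {}).get(partition, [])


# Made-up sample pairs, only to show the shape of the resulting file.
index = build_index(
    [("athena", "us-east-1"), ("athena", "cn-north-1"), ("amp", "eu-west-1")]
)
serialized = json.dumps(index, indent=2)  # what would be written to the one file
```

Keeping everything in one file means a single lookup structure is loaded once, instead of one data file per service.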
Thanks for sharing @jfagoagas! That makes a lot of sense, to keep everything in one file. I'll keep that in mind as a possible alternative.