AWS Resource Groups Tagging API sync has O(n²) inefficiency - should batch resource types per region
Description
The current AWS Resource Groups Tagging API sync implementation has a multiplicative, effectively O(n²), performance issue. For each region, it makes a separate API call for each resource type, resulting in N×M API calls (N regions × M resource types). This is inefficient because the AWS Resource Groups Tagging API supports querying multiple resource types in a single call: the ResourceTypeFilters parameter accepts an array.
Expected behavior: Cartography should make only N API calls (one per region) by batching all resource types into a single get_resources call per region, reducing API calls by ~20x (given ~20 resource types in TAG_RESOURCE_TYPE_MAPPINGS).
To Reproduce
- Run cartography with AWS Resource Groups Tagging API sync enabled across multiple regions
- Monitor the API calls being made to the resourcegroupstaggingapi service
- Observe that for each region, separate API calls are made for each resource type in TAG_RESOURCE_TYPE_MAPPINGS
Current behavior in code:
# In sync() function - lines 284-301
for region in regions:
    for resource_type in tag_resource_type_mappings.keys():  # ~20 resource types
        tag_data = get_tags(boto3_session, resource_type, region)  # Separate API call each time
Logs
Example log output showing multiple API calls per region:
INFO Syncing AWS tags for account 123456789012 and region us-east-1
INFO Loading 5 tags for resource type autoscaling:autoScalingGroup
INFO Loading 12 tags for resource type dynamodb:table
INFO Loading 8 tags for resource type ec2:instance
INFO Loading 3 tags for resource type ec2:security-group
... 16 more separate API calls for this region
INFO Syncing AWS tags for account 123456789012 and region us-west-2
... another 20 API calls for the next region
Screenshots
N/A
Environment
- Cartography release version or commit hash: 0.104.0
Additional context
The AWS Resource Groups Tagging API get_resources operation accepts ResourceTypeFilters as an array of up to 100 resource types per call. The current implementation in the get_tags() function (lines 155-182) passes only a single resource type.
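As a rough illustration of the batched request described above, the sketch below passes a list of resource types to the get_resources paginator. The name `get_tags_batched` is hypothetical and not part of Cartography. It assumes `boto3_session` behaves like a `boto3.session.Session` and that each page exposes its results under `ResourceTagMappingList`.

```python
from typing import Any, Dict, List


def get_tags_batched(
    boto3_session: Any,
    resource_types: List[str],
    region: str,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: fetch tags for several resource types in one paginated request."""
    if len(resource_types) > 100:
        # GetResources accepts at most 100 entries in ResourceTypeFilters.
        raise ValueError("ResourceTypeFilters supports at most 100 resource types")

    client = boto3_session.client("resourcegroupstaggingapi", region_name=region)
    paginator = client.get_paginator("get_resources")

    resources: List[Dict[str, Any]] = []
    # One filter list covers every requested type; pagination still applies.
    for page in paginator.paginate(ResourceTypeFilters=resource_types):
        resources.extend(page["ResourceTagMappingList"])
    return resources
```

A caller could pass `list(tag_resource_type_mappings.keys())` as `resource_types`. Note that a large account may still need several paginated pages per region, so "one call per region" means one request sequence rather than exactly one HTTP request.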
Proposed solution
- Modify get_tags() to accept a list of resource types instead of a single resource type
- Update the sync loop to make one batched call per region instead of one call per resource type per region
- Process the returned results to separate by resource type for the existing transform_tags() and load_tags() functions
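The last step, separating batched results by resource type, might look something like the sketch below. Both function names are hypothetical. The ARN parsing is a heuristic that assumes the resource portion of the ARN starts with the type name (as in `instance/i-0abc`), which does not hold for every service, so unmatched ARNs land under an "unknown" key rather than being guessed.

```python
import re
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Set


def resource_type_from_arn(arn: str, known_types: Set[str]) -> str:
    """Heuristic: map an ARN to one of the known 'service:type' keys, else 'unknown'."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) < 6:
        return "unknown"
    service, resource = parts[2], parts[5]
    # The resource part often starts with the type, e.g. 'instance/i-0abc'.
    first_token = re.split(r"[:/]", resource, maxsplit=1)[0]
    candidate = f"{service}:{first_token}"
    if candidate in known_types:
        return candidate
    if service in known_types:
        return service
    # Fall back only when exactly one known type belongs to this service
    # (for example, an S3 bucket ARN carries no type prefix).
    same_service = [t for t in known_types if t.startswith(service + ":")]
    return same_service[0] if len(same_service) == 1 else "unknown"


def group_by_resource_type(
    resources: Iterable[Dict[str, Any]],
    known_types: Iterable[str],
) -> Dict[str, List[Dict[str, Any]]]:
    """Bucket get_resources results by resource type using each item's ResourceARN."""
    known = set(known_types)
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for resource in resources:
        key = resource_type_from_arn(resource["ResourceARN"], known)
        grouped[key].append(resource)
    return dict(grouped)
```

Each bucket could then be handed to the existing per-type transform and load steps. A real implementation would likely want an explicit ARN-to-type mapping instead of this heuristic.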
Benefits
- Reduces API calls from O(regions × resource_types) to O(regions), resulting in faster sync times
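To make the reduction concrete, here is a small back-of-the-envelope calculation. The function `api_request_count` and the region count of 4 are purely illustrative. The figures count request sequences only and ignore extra pagination pages, which depend on how many tagged resources an account has.

```python
import math


def api_request_count(
    num_regions: int,
    num_resource_types: int,
    batched: bool,
    max_types_per_call: int = 100,  # ResourceTypeFilters limit per GetResources call
) -> int:
    """Illustrative count of get_resources request sequences, ignoring pagination."""
    if not batched:
        # Current behavior: one request per (region, resource type) pair.
        return num_regions * num_resource_types
    # Batched behavior: one request per region per chunk of up to 100 types.
    return num_regions * math.ceil(num_resource_types / max_types_per_call)


print(api_request_count(4, 20, batched=False))  # 80
print(api_request_count(4, 20, batched=True))   # 4
```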
AWS API Documentation Reference
GetResources API - ResourceTypeFilters parameter supports arrays with up to 100 items.