AWS Resource Groups Tagging API sync has O(n²) inefficiency - should batch resource types per region

Open achantavy opened this issue 6 months ago • 0 comments

Description

The current AWS Resource Groups Tagging API sync implementation makes far more API calls than it needs to. For each region, it makes a separate API call for each resource type, resulting in N×M API calls (N regions × M resource types). Strictly speaking this is O(N×M) rather than O(n²), but the cost is multiplicative either way. This is avoidable because the AWS Resource Groups Tagging API supports querying multiple resource types in a single call: the ResourceTypeFilters parameter takes an array.

Expected behavior: Cartography should make only N API calls (one per region) by batching all resource types into a single get_resources call per region, reducing API calls by ~20x (given ~20 resource types in TAG_RESOURCE_TYPE_MAPPINGS).
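
To make the arithmetic concrete, here is a back-of-the-envelope sketch. `min_api_calls` is a hypothetical helper, the region count of 4 is purely illustrative, and pagination of large result sets (which adds calls under both schemes) is ignored, so these are lower bounds rather than measured numbers.

```python
import math


def min_api_calls(num_regions, num_resource_types, batched, max_types_per_call=100):
    """Lower bound on get_resources calls, ignoring pagination (hypothetical helper)."""
    if batched:
        # One call per region for every chunk of up to max_types_per_call types.
        return num_regions * math.ceil(num_resource_types / max_types_per_call)
    # Current behavior: one call per (region, resource type) pair.
    return num_regions * num_resource_types


print(min_api_calls(4, 20, batched=False))  # 80
print(min_api_calls(4, 20, batched=True))   # 4
```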

To Reproduce

  1. Run cartography with AWS Resource Groups Tagging API sync enabled across multiple regions
  2. Monitor the API calls being made to the resourcegroupstaggingapi service
  3. Observe that for each region, separate API calls are made for each resource type in TAG_RESOURCE_TYPE_MAPPINGS

Current behavior in code:

# In sync() function - lines 284-301
for region in regions:
    for resource_type in tag_resource_type_mappings.keys():  # ~20 resource types
        tag_data = get_tags(boto3_session, resource_type, region)  # Separate API call each time

Logs

Example log output showing multiple API calls per region:

INFO Syncing AWS tags for account 123456789012 and region us-east-1
INFO Loading 5 tags for resource type autoscaling:autoScalingGroup
INFO Loading 12 tags for resource type dynamodb:table
INFO Loading 8 tags for resource type ec2:instance
INFO Loading 3 tags for resource type ec2:security-group

... 16 more separate API calls for this region
INFO Syncing AWS tags for account 123456789012 and region us-west-2
... another 20 API calls for the next region

Screenshots

N/A

Environment

  • Cartography release version or commit hash: 0.104.0

Additional context

The AWS Resource Groups Tagging API get_resources operation accepts ResourceTypeFilters as an array of up to 100 resource types per call. The current implementation in the get_tags() function (lines 155-182) passes only a single resource type.

Proposed solution

  1. Modify get_tags() to accept a list of resource types instead of a single resource type
  2. Update the sync loop to make one batched call per region instead of one call per resource type per region
  3. Process the returned results to separate by resource type for the existing transform_tags() and load_tags() functions
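
The steps above could be sketched roughly as follows. This is a minimal sketch, not the actual Cartography code. `get_tags_batched` and `resource_type_of` are hypothetical names, and `client` is assumed to be a boto3 `resourcegroupstaggingapi` client already bound to one region. Grouping results by parsing the ARN is a simplifying assumption that may not hold for every resource type.

```python
from collections import defaultdict

MAX_TYPES_PER_CALL = 100  # ResourceTypeFilters item limit cited in this issue


def resource_type_of(arn, resource_types):
    """Hypothetical helper: map an ARN back to one of the requested types.

    Assumes ARNs look like arn:partition:service:region:account:resource and
    that the resource part starts with the type name (e.g. 'instance/i-123').
    """
    parts = arn.split(":", 5)
    if len(parts) < 6:
        return None
    service, resource = parts[2], parts[5]
    # Try longer (more specific) names first, e.g. 'ec2:instance' before 'ec2'.
    for resource_type in sorted(resource_types, key=len, reverse=True):
        type_service, _, kind = resource_type.partition(":")
        if type_service != service:
            continue
        if not kind or resource == kind or resource.startswith((kind + "/", kind + ":")):
            return resource_type
    return None


def get_tags_batched(client, resource_types):
    """Fetch tags for all resource types in one region, grouped by type."""
    resource_types = list(resource_types)
    paginator = client.get_paginator("get_resources")
    grouped = defaultdict(list)
    # Chunk the type list so each request stays within the filter limit.
    for start in range(0, len(resource_types), MAX_TYPES_PER_CALL):
        chunk = resource_types[start:start + MAX_TYPES_PER_CALL]
        for page in paginator.paginate(ResourceTypeFilters=chunk):
            for mapping in page.get("ResourceTagMappingList", []):
                resource_type = resource_type_of(mapping["ResourceARN"], chunk)
                if resource_type is not None:
                    grouped[resource_type].append(mapping)
    return dict(grouped)
```

A caller might build the client with `boto3_session.client('resourcegroupstaggingapi', region_name=region)`, pass in `tag_resource_type_mappings.keys()`, and hand each group to the existing transform_tags() and load_tags() functions. Resources whose ARN cannot be matched to a requested type are dropped here. A real implementation would probably want to log them instead.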

Benefits

  • Reduces API calls from O(regions × resource_types) to O(regions), resulting in faster sync times

AWS API Documentation Reference

GetResources API - ResourceTypeFilters parameter supports arrays with up to 100 items.

achantavy, Jun 09 '25 18:06