aws-well-architected-labs
Cost 300_Optimization_Data_Collection: need to add a JSON classifier for the crawler + change the RDS CloudWatch call
This lab uses crawlers to parse the collected data into Athena tables.
Crawlers work perfectly for .csv files. For JSON files, they work only when there is a single file or when the file schema does not change.
The problem affects two types of data extraction, because the data format can vary with the account configuration:
- rds_metrics
- budgets
For these two data types, Athena is unable to read the data and fails with a "HIVE_METASTORE_ERROR" error.
I added a JSON classifier with the "$[*]" path to fix this error for rds_metrics.
I still have this issue for the budgets table.
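For reference, here is a minimal sketch of how the classifier could be created and attached to the crawler with boto3. The crawler and classifier names are hypothetical placeholders, not the names used by the lab, and the `glue_client` parameter is only there so the function can be exercised without AWS credentials.

```python
def json_classifier_request(name, json_path="$[*]"):
    """Build the arguments for Glue create_classifier.

    "$[*]" tells the crawler to treat each element of the top-level
    JSON array as one record.
    """
    return {"JsonClassifier": {"Name": name, "JsonPath": json_path}}


def attach_json_classifier(crawler_name, classifier_name, glue_client=None):
    """Create the JSON classifier and attach it to an existing crawler.

    crawler_name and classifier_name are placeholders (assumption):
    replace them with the names deployed in your account.
    """
    if glue_client is None:
        # Imported lazily so the request builder works without boto3.
        import boto3
        glue_client = boto3.client("glue")

    glue_client.create_classifier(**json_classifier_request(classifier_name))
    # Custom classifiers listed here are tried before the built-in ones.
    glue_client.update_crawler(Name=crawler_name, Classifiers=[classifier_name])
```

Example call, with hypothetical names: `attach_json_classifier("rds-metrics-crawler", "json-array-classifier")`. The crawler has to be run again afterwards for the table definition to be updated.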
In addition, there is an issue in 300_Optimization_Data_Collection/Code/rds_util_template.yaml:
the boto3 CloudWatch client being called is the local one, not the assumed-role one, so the data returned by CloudWatch is empty.
The correct call should be something like this:
    for service in functions.keys():
        if functions[service]['regional']:
            for region in regions:
                client = assume_role(account_id, functions[service]['api'], region['RegionName'])
                for f in functions[service]['functions']:
                    #cw_client = boto3.client('cloudwatch', region_name = region['RegionName'])
                    cw_client = assume_role(account_id, 'cloudwatch', region['RegionName'])
                    data = globals()[f['name']](cw_client, client, s3client, region['RegionName'], service, f['output_path'], f['output_file_name'], account_id)
        else:
            client = boto3.client(service)
            for f in functions[service]['functions']:
                #cw_client = boto3.client('cloudwatch', region_name = 'us-east-1')
                cw_client = assume_role(account_id, 'cloudwatch', 'us-east-1')
                data = globals()[f['name']](cw_client, client, s3client, 'us-east-1', service, f['output_path'], f['output_file_name'], account_id)
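
To make the difference between the two clients explicit, here is a rough sketch of what a helper like `assume_role` typically does. This is not the lab's actual implementation: the role name, the session name, and the `sts_client` / `client_factory` parameters are assumptions added for illustration and testing.

```python
def assume_role(account_id, service, region,
                role_name="HypotheticalCollectorRole",  # assumption: not the lab's role name
                sts_client=None, client_factory=None):
    """Return a client for `service` that uses credentials from the target account."""
    if sts_client is None or client_factory is None:
        # Imported lazily so the function can be tested with fakes.
        import boto3
        sts_client = sts_client or boto3.client("sts")
        client_factory = client_factory or boto3.client

    response = sts_client.assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="data-collection",  # assumption: any valid session name works
    )
    creds = response["Credentials"]

    # Passing the temporary credentials is what makes the client query the
    # target account instead of the account the Lambda function runs in.
    return client_factory(
        service,
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```

A plain `boto3.client('cloudwatch', ...)` skips the STS step, so it queries metrics in the collector account, where the RDS instances of the linked account do not exist.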