instance-scheduler-on-aws
Error: Boto3 InvalidShapeError
I was getting this error in v1.3 and was unable to find the cause. The error message appears to relate to a missing key for the region, but the region was either being passed or had been set by default. At the time I ended up ignoring the error with a small modification to the code.
However, after upgrading to v1.4 recently, I am experiencing this error again, because the code is now read from the AWS S3 bucket. Our InstanceScheduler is deployed across 70+ accounts and we have 5 schedules created via the CloudFormation template. So far the error has occurred in two different accounts, and it currently triggers a couple of times a week, generally early in the morning. It occurs irregularly: some days it won't appear, then suddenly it's back again. It has been difficult to reproduce.
Log group: InstanceScheduler-logs
Log stream: InstanceScheduler-20210531
Error : ERROR : Error handling request
{"action": "scheduler:run", "configuration":
{"tag_name": "Schedule", "default_timezone": "Australia/Adelaide", "trace": false, "enable_SSM_maintenance_windows": false,
"use_metrics": false, "schedule_clusters": true, "create_rds_snapshot": false, "schedule_lambda_account": false, "started_tags":
"StartedByInstanceScheduler=true", "stopped_tags": "StoppedByInstanceScheduler=true", "regions": ["ap-southeast-2"], "cross_account_roles":
["arn:aws:iam::<account-id>:role/cron"], "scheduled_services": ["rds"], "schedules": {"<schedulerrule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide",
"stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false,
"schedule_dt": "2021-06-01T07:45:50.041048+09:30", "periods": ["<schedulerrule>-period-0001"]}, "<scheduler-rule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide",
"stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false,
"schedule_dt":"2021-06-01T07:45:50.041362+09:30", "periods": ["<scheduler-rule>-period-0001"]}, "InstanceScheduler-OutOfBusinessHoursSchedule":
{"name": "InstanceScheduler-OutOfBusinessHoursSchedule", "timezone": "Australia/Adelaide", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false,
"use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.041655+09:30", "periods":
["InstanceScheduler-OutOfBusinessHoursSchedule-period-0001"]}, "<scheduler-rule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide", "stop_new_instances": true,
"use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.041927+09:30",
"periods": ["<scheduler-rule>-period-0001"]}, "InstanceScheduler-StopButDoesNotStartInstanceSchedule": {"name": "InstanceScheduler-StopButDoesNotStartInstanceSchedule",
"timezone": "Australia/Adelaide", "stop_new_instances": false, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false,
"retain_running": false, "schedule_dt": "2021-06-01T07:45:50.042149+09:30", "periods": ["InstanceScheduler-StopButDoesNotStartInstanceSchedule-period-0001"]},
"InstanceScheduler-TestSchedule": {"name": "InstanceScheduler-TestSchedule", "timezone": "Australia/Adelaide", "stop_new_instances": false, "use_metrics": false,
"enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.042404+09:30",
"periods": ["InstanceScheduler-TestSchedule-period-0001"]}, "running": {"name": "running", "timezone": "Australia/Adelaide", "override_status":
"running", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false,
"schedule_dt": "2021-06-01T07:45:50.042427+09:30"}, "scale-up-down": {"name": "scale-up-down", "timezone": "UTC", "stop_new_instances": true, "use_metrics": false,
"enforced": false, "hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-05-31T22:15:50.042847+00:00", "periods": ["[email protected]", "[email protected]"]},
"seattle-office-hours": {"name": "seattle-office-hours", "timezone": "US/Pacific", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false,
"use_maintenance_window": false, "schedule_dt": "2021-05-31T15:15:50.045086-07:00", "periods": ["office-hours"]},
"stopped": {"name": "stopped", "timezone": "Australia/Adelaide", "override_status": "stopped", "stop_new_instances": true,
"use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-06-01T07:45:50.045115+09:30"},
"uk-office-hours": {"name": "uk-office-hours", "timezone": "Europe/London", "stop_new_instances": true, "use_metrics": false, "enforced": false,
"hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-05-31T23:15:50.047048+01:00", "periods": ["office-hours"]}}, "periods":
{"<schedulerrule>-period-0001": {"begintime": "05:30", "endtime": "19:00", "weekdays": [0, 1, 2, 3, 4]}, "<scheduler-rule>-period-0001":
{"begintime": "06:45", "endtime": "18:15", "weekdays": [0, 1, 2, 3, 4]}, "InstanceScheduler-OutOfBusinessHoursSchedule-period-0001":
{"begintime": "07:00", "endtime": "18:00", "weekdays": [0, 1, 2, 3, 4]}, "<scheduler-rule>-period-0001": {"begintime": "07:00", "endtime": "19:00",
"weekdays": [0, 1, 2, 3, 4, 5, 6]}, "InstanceScheduler-StopButDoesNotStartInstanceSchedule-period-0001": {"endtime": "18:00"},
"InstanceScheduler-TestSchedule-period-0001": {"begintime": "14:00", "endtime": "14:15", "weekdays": [0, 1, 2, 3, 4]},
"working-days":{"weekdays": [0, 1, 2, 3, 4]}, "weekends": {"weekdays": [5, 6]}, "office-hours": {"begintime": "09:00", "endtime": "17:00", "weekdays": [0, 1, 2, 3, 4]}}},
"dispatch_time": "2021-05-31 22:15:52.364941"} by handler SchedulerRequestHandler:
(Shape is missing required key 'type': OrderedDict([('members', OrderedDict([('SourceRegion', OrderedDict([('shape', 'String'),
('documentation', '<p>The ID of the region that contains the source for the db instance.</p>')]))]))]))
Traceback (most recent call last):
File "/var/runtime/botocore/model.py", line 608, in get_shape_by_name
shape_cls = self.SHAPE_CLASSES.get(shape_model['type'], Shape)
KeyError: 'type'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/task/main.py", line 71, in lambda_handler
result = handler.handle_request()
File "/var/task/requesthandlers/scheduler_request_handler.py", line 133, in handle_request
lambda_account=self.lambda_account, context=self._context, logger=self._logger)
File "/var/task/schedulers/instance_scheduler.py", line 220, in run
response[account.name] = self._process_account(account)
File "/var/task/schedulers/instance_scheduler.py", line 267, in _process_account
for instance in self._scheduled_instances_in_region(account, region):
File "/var/task/schedulers/instance_scheduler.py", line 184, in _scheduled_instances_in_region
schedulers.PARAM_CONTEXT: self._context
File "/var/task/schedulers/rds_service.py", line 277, in get_schedulable_instances
instances = self.get_schedulable_rds_instances(kwargs)
File "/var/task/schedulers/rds_service.py", line 253, in get_schedulable_rds_instances
kwargs=kwargs)
File "/var/task/schedulers/rds_service.py", line 199, in get_schedulable_resources
rds_resp = fn(**describe_arguments)
File "/var/task/boto_retry/__init__.py", line 78, in wrapped_api_method
return retry_strategy.call(client_or_resource, name, args)
File "/var/task/boto_retry/aws_service_retry.py", line 136, in call
raise ex
File "/var/task/boto_retry/aws_service_retry.py", line 118, in call
resp = method(**call_arguments)
File "/var/runtime/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 663, in _make_api_call
operation_model, request_dict, request_context)
File "/var/runtime/botocore/client.py", line 682, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/var/runtime/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/var/runtime/botocore/endpoint.py", line 135, in _send_request
request, operation_model, context)
File "/var/runtime/botocore/endpoint.py", line 167, in _get_response
request, operation_model)
File "/var/runtime/botocore/endpoint.py", line 227, in _do_get_response
operation_model, parser,
File "/var/runtime/botocore/endpoint.py", line 240, in _add_modeled_error_fields
error_shape = service_model.shape_for_error_code(error_code)
File "/var/runtime/botocore/model.py", line 273, in shape_for_error_code
return self._error_code_cache.get(error_code, None)
File "/var/runtime/botocore/utils.py", line 878, in __get__
computed_value = self._fget(obj)
File "/var/runtime/botocore/model.py", line 278, in _error_code_cache
for error_shape in self.error_shapes:
File "/var/runtime/botocore/utils.py", line 878, in __get__
computed_value = self._fget(obj)
File "/var/runtime/botocore/model.py", line 294, in error_shapes
error_shape = self.shape_for(shape_name)
File "/var/runtime/botocore/model.py", line 270, in shape_for
shape_name, member_traits)
File "/var/runtime/botocore/model.py", line 611, in get_shape_by_name
% shape_model)
botocore.model.InvalidShapeError: Shape is missing required key 'type':
OrderedDict([('members', OrderedDict([('SourceRegion', OrderedDict([('shape', 'String'),
('documentation', '<p>The ID of the region that contains the source for the db instance.</p>')]))]))])
I could ignore this error again; however, that doesn't fix the underlying issue of why it is occurring.
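The traceback shows the lookup `shape_model['type']` failing on a shape definition that has `members` but no `type`. A minimal sketch of how a shapes map could be scanned for such entries is given below. The function name `find_shapes_missing_type` and the sample data are hypothetical and are not part of botocore or the Instance Scheduler; the sample only imitates the structure printed in the error message.

```python
from collections import OrderedDict


def find_shapes_missing_type(shapes):
    """Return the names of shape definitions that lack the 'type' key."""
    return [name for name, definition in shapes.items() if "type" not in definition]


# Hypothetical, trimmed-down shapes map for illustration only.
shapes = OrderedDict([
    ("String", {"type": "string"}),
    ("ExampleFault", {"type": "structure", "members": {}, "exception": True}),
    # Same structure as the shape in the error: 'members' but no 'type'.
    ("BrokenShape", {"members": {"SourceRegion": {"shape": "String"}}}),
])

missing = find_shapes_missing_type(shapes)
print(missing)
```

Running a check like this against the `shapes` section of whichever RDS model the Lambda actually loads might help confirm whether the model data itself is incomplete, rather than the request arguments.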
Hi @dbrenecki
Can you confirm whether the solution is being used to schedule RDS Aurora clusters? If yes, can you confirm whether the RDS instances within a cluster are not tagged with the tag Schedule=[SCHEDULE_NAME]?
This error is similar to #240.
Yes, it is enabled for RDS Aurora clusters; however, I don't believe we have any active Aurora clusters under the InstanceScheduler right now. I'd also note that the account ID referenced in the error has no RDS or EC2 instances at all.
On the previous occasion I got the error, the referenced account had scheduled instances with all the tags, except for one instance that was missing StoppedByInstanceScheduler=true. I've since added that tag, but I don't believe it is the cause if the error also occurs in an account with no scheduled resources.
The issue you linked seems to be the same one I'm facing; however, I haven't noticed any tagged resources that are not stopping/starting. Also, in my CloudFormation template the InstanceScheduler Lambda has:
S3Key: aws-instance-scheduler/v1.4.0/instance-scheduler.zip
S3Bucket: solutions-ap-southeast-2
Hi @dbrenecki, if no RDS instance in the clusters has been tagged and this error is being logged, it could be a different issue from #240. I will try to recreate this issue.
Since upgrading to v1.4 two weeks ago, the error has only occurred twice, both times in the last few days and in different accounts. This morning it didn't occur. I wasn't able to find a pattern last time or this time, but the error generally occurs on the 15-minute mark, which is the interval we use to detect whether a resource is wrongly tagged. It also occurs around 3am local time, which is not near any of our turn on/off times.
So if I were to guess, the error may be related to the cron part of InstanceScheduler that handles wrongly tagged resources, e.g. Key: "Schedule", Value: "Test1", where "Test1" is not an existing schedule defined in our DynamoDB table. We would generally get an alert for this every 15 minutes until it's fixed.
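To make the "wrongly tagged" case concrete, here is a rough sketch of the kind of check described above: compare each resource's Schedule tag against the set of known schedule names. The function `find_unknown_schedule_tags`, the resource IDs, and the data layout are all hypothetical; this is not how the Instance Scheduler implements it.

```python
def find_unknown_schedule_tags(resources, known_schedules, tag_name="Schedule"):
    """Return (resource_id, tag_value) pairs whose schedule tag names no known schedule."""
    unknown = []
    for resource in resources:
        value = resource.get("tags", {}).get(tag_name)
        # Untagged resources are ignored; only tagged ones can be wrongly tagged.
        if value is not None and value not in known_schedules:
            unknown.append((resource["id"], value))
    return unknown


# Schedule names taken from the configuration in the log above.
known_schedules = {"running", "stopped", "uk-office-hours", "seattle-office-hours"}

# Hypothetical resources for illustration.
resources = [
    {"id": "db-example-1", "tags": {"Schedule": "Test1"}},
    {"id": "db-example-2", "tags": {"Schedule": "uk-office-hours"}},
    {"id": "db-example-3", "tags": {}},
]

print(find_unknown_schedule_tags(resources, known_schedules))
```

Only `db-example-1` is reported here, since "Test1" is not among the known schedule names.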
Hi @dbrenecki, have you tried enabling detailed logging for the solution, i.e. setting "Enable CloudWatch Logs" to "Yes" for the stack? This will provide more detailed logs and help narrow down the cause.
- If there is a schedule missing in the DynamoDB table, the solution will log the following message:
WARNING : Skipping instance RDS:[SCHEDULE-NAME] in region ap-southeast-2 for account xxxxxxxxxxxx, schedule name "[SCHEDULE-NAME]" is unknown
- The solution will only throw the error when there is a tag on an instance within the cluster.
- In the file main.py there is a method load_models(), which loads a different version of the boto APIs to be used by the Lambda instance. If this line is commented out or removed, the Lambda instance uses the boto APIs available by default, and the error is no longer displayed in the logs. We will include this change in our next release.
I've enabled detailed logging now. Just waiting to get the error again; it hasn't occurred since last week.
Hello, is there any update on this issue? I have the same error and none of the proposed solutions worked.