instance-scheduler-on-aws Error: Boto3 InvalidShapeError

I was getting this error in v1.3 and could not find the cause. The error message suggests a missing key related to the region, but the region was either being passed explicitly or had been set by default. At the time I ended up ignoring the error with a small modification to the code.

However, after upgrading to v1.4 recently, I am experiencing this error again, now that the code is read from the AWS S3 bucket. Our InstanceScheduler is deployed across 70+ accounts and we have 5 schedules created via the CloudFormation template. So far the error has occurred in two different accounts, and it currently triggers a couple of times a week, generally early in the morning. It occurs irregularly: some days it won't appear, then suddenly it's back again. It has been difficult to reproduce.

Log group: InstanceScheduler-logs
Log stream: InstanceScheduler-20210531
Error : ERROR   : Error handling request 
{"action": "scheduler:run", "configuration": 
{"tag_name": "Schedule", "default_timezone": "Australia/Adelaide", "trace": false, "enable_SSM_maintenance_windows": false, 
"use_metrics": false, "schedule_clusters": true, "create_rds_snapshot": false, "schedule_lambda_account": false, "started_tags": 
"StartedByInstanceScheduler=true", "stopped_tags": "StoppedByInstanceScheduler=true", "regions": ["ap-southeast-2"], "cross_account_roles": 
["arn:aws:iam::<account-id>:role/cron"], "scheduled_services": ["rds"], "schedules": {"<schedulerrule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide", 
"stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false,
"schedule_dt": "2021-06-01T07:45:50.041048+09:30", "periods": ["<schedulerrule>-period-0001"]}, "<scheduler-rule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide",
"stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false,
"schedule_dt":"2021-06-01T07:45:50.041362+09:30", "periods": ["<scheduler-rule>-period-0001"]}, "InstanceScheduler-OutOfBusinessHoursSchedule":
{"name": "InstanceScheduler-OutOfBusinessHoursSchedule", "timezone": "Australia/Adelaide", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false,
"use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.041655+09:30", "periods":
["InstanceScheduler-OutOfBusinessHoursSchedule-period-0001"]}, "<scheduler-rule>": {"name": "<scheduler-rule>", "timezone": "Australia/Adelaide", "stop_new_instances": true,
"use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.041927+09:30",
"periods": ["<scheduler-rule>-period-0001"]}, "InstanceScheduler-StopButDoesNotStartInstanceSchedule": {"name": "InstanceScheduler-StopButDoesNotStartInstanceSchedule",
"timezone": "Australia/Adelaide", "stop_new_instances": false, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false,
"retain_running": false, "schedule_dt": "2021-06-01T07:45:50.042149+09:30", "periods": ["InstanceScheduler-StopButDoesNotStartInstanceSchedule-period-0001"]},
"InstanceScheduler-TestSchedule": {"name": "InstanceScheduler-TestSchedule", "timezone": "Australia/Adelaide", "stop_new_instances": false, "use_metrics": false,
"enforced": false, "hibernate": false, "use_maintenance_window": false, "retain_running": false, "schedule_dt": "2021-06-01T07:45:50.042404+09:30",
"periods": ["InstanceScheduler-TestSchedule-period-0001"]}, "running": {"name": "running", "timezone": "Australia/Adelaide", "override_status":
"running", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false,
"schedule_dt": "2021-06-01T07:45:50.042427+09:30"}, "scale-up-down": {"name": "scale-up-down", "timezone": "UTC", "stop_new_instances": true, "use_metrics": false,
"enforced": false, "hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-05-31T22:15:50.042847+00:00", "periods": ["[email protected]", "[email protected]"]},
"seattle-office-hours": {"name": "seattle-office-hours", "timezone": "US/Pacific", "stop_new_instances": true, "use_metrics": false, "enforced": false, "hibernate": false,
"use_maintenance_window": false, "schedule_dt": "2021-05-31T15:15:50.045086-07:00", "periods": ["office-hours"]},

"stopped": {"name": "stopped", "timezone": "Australia/Adelaide", "override_status": "stopped", "stop_new_instances": true,
"use_metrics": false, "enforced": false, "hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-06-01T07:45:50.045115+09:30"},
"uk-office-hours": {"name": "uk-office-hours", "timezone": "Europe/London", "stop_new_instances": true, "use_metrics": false, "enforced": false,
"hibernate": false, "use_maintenance_window": false, "schedule_dt": "2021-05-31T23:15:50.047048+01:00", "periods": ["office-hours"]}}, "periods":
{"<schedulerrule>-period-0001": {"begintime": "05:30", "endtime": "19:00", "weekdays": [0, 1, 2, 3, 4]}, "<scheduler-rule>-period-0001":
{"begintime": "06:45", "endtime": "18:15", "weekdays": [0, 1, 2, 3, 4]}, "InstanceScheduler-OutOfBusinessHoursSchedule-period-0001":
{"begintime": "07:00", "endtime": "18:00", "weekdays": [0, 1, 2, 3, 4]}, "<scheduler-rule>-period-0001": {"begintime": "07:00", "endtime": "19:00",
"weekdays": [0, 1, 2, 3, 4, 5, 6]}, "InstanceScheduler-StopButDoesNotStartInstanceSchedule-period-0001": {"endtime": "18:00"},
"InstanceScheduler-TestSchedule-period-0001": {"begintime": "14:00", "endtime": "14:15", "weekdays": [0, 1, 2, 3, 4]},
"working-days":{"weekdays": [0, 1, 2, 3, 4]}, "weekends": {"weekdays": [5, 6]}, "office-hours": {"begintime": "09:00", "endtime": "17:00", "weekdays": [0, 1, 2, 3, 4]}}},
"dispatch_time": "2021-05-31 22:15:52.364941"} by handler SchedulerRequestHandler:
(Shape is missing required key 'type': OrderedDict([('members', OrderedDict([('SourceRegion', OrderedDict([('shape', 'String'),
('documentation', '<p>The ID of the region that contains the source for the db instance.</p>')]))]))]))
Traceback (most recent call last):
 File "/var/runtime/botocore/model.py", line 608, in get_shape_by_name
   shape_cls = self.SHAPE_CLASSES.get(shape_model['type'], Shape)
KeyError: 'type'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
 File "/var/task/main.py", line 71, in lambda_handler
   result = handler.handle_request()
 File "/var/task/requesthandlers/scheduler_request_handler.py", line 133, in handle_request
   lambda_account=self.lambda_account, context=self._context, logger=self._logger)
 File "/var/task/schedulers/instance_scheduler.py", line 220, in run
   response[account.name] = self._process_account(account)
 File "/var/task/schedulers/instance_scheduler.py", line 267, in _process_account
   for instance in self._scheduled_instances_in_region(account, region):
 File "/var/task/schedulers/instance_scheduler.py", line 184, in _scheduled_instances_in_region
   schedulers.PARAM_CONTEXT: self._context
 File "/var/task/schedulers/rds_service.py", line 277, in get_schedulable_instances
   instances = self.get_schedulable_rds_instances(kwargs)
 File "/var/task/schedulers/rds_service.py", line 253, in get_schedulable_rds_instances
   kwargs=kwargs)
 File "/var/task/schedulers/rds_service.py", line 199, in get_schedulable_resources
   rds_resp = fn(**describe_arguments)
 File "/var/task/boto_retry/__init__.py", line 78, in wrapped_api_method
   return retry_strategy.call(client_or_resource, name, args)
 File "/var/task/boto_retry/aws_service_retry.py", line 136, in call
   raise ex
 File "/var/task/boto_retry/aws_service_retry.py", line 118, in call
   resp = method(**call_arguments)
 File "/var/runtime/botocore/client.py", line 357, in _api_call
   return self._make_api_call(operation_name, kwargs)
 File "/var/runtime/botocore/client.py", line 663, in _make_api_call
   operation_model, request_dict, request_context)
 File "/var/runtime/botocore/client.py", line 682, in _make_request
   return self._endpoint.make_request(operation_model, request_dict)
 File "/var/runtime/botocore/endpoint.py", line 102, in make_request
   return self._send_request(request_dict, operation_model)
 File "/var/runtime/botocore/endpoint.py", line 135, in _send_request
   request, operation_model, context)
 File "/var/runtime/botocore/endpoint.py", line 167, in _get_response
   request, operation_model)
 File "/var/runtime/botocore/endpoint.py", line 227, in _do_get_response
   operation_model, parser,
 File "/var/runtime/botocore/endpoint.py", line 240, in _add_modeled_error_fields
   error_shape = service_model.shape_for_error_code(error_code)
 File "/var/runtime/botocore/model.py", line 273, in shape_for_error_code
   return self._error_code_cache.get(error_code, None)
 File "/var/runtime/botocore/utils.py", line 878, in __get__
   computed_value = self._fget(obj)
 File "/var/runtime/botocore/model.py", line 278, in _error_code_cache
   for error_shape in self.error_shapes:
 File "/var/runtime/botocore/utils.py", line 878, in __get__
   computed_value = self._fget(obj)
 File "/var/runtime/botocore/model.py", line 294, in error_shapes
   error_shape = self.shape_for(shape_name)
 File "/var/runtime/botocore/model.py", line 270, in shape_for
   shape_name, member_traits)
 File "/var/runtime/botocore/model.py", line 611, in get_shape_by_name
   % shape_model)
botocore.model.InvalidShapeError: Shape is missing required key 'type':
OrderedDict([('members', OrderedDict([('SourceRegion', OrderedDict([('shape', 'String'),
('documentation', '<p>The ID of the region that contains the source for the db instance.</p>')]))]))])

I could ignore this error again; however, that doesn't fix the underlying issue of why it is occurring.
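
For anyone reading the traceback, here is a minimal sketch of the failure mode as I understand it. It is a simplified stand-in, not botocore's actual code: `resolve_shape_type` and `fixed_shape` are hypothetical names, and the local `InvalidShapeError` class only imitates `botocore.model.InvalidShapeError`. It shows that the shape definition in the log has `members` but no `type` key, so the lookup fails regardless of what region is passed.

```python
from collections import OrderedDict


class InvalidShapeError(Exception):
    """Local stand-in for botocore.model.InvalidShapeError (assumption)."""


def resolve_shape_type(shape_model):
    """Return the shape's 'type', imitating the lookup that fails in the traceback."""
    try:
        return shape_model["type"]
    except KeyError:
        # The traceback shows the KeyError being converted into InvalidShapeError.
        raise InvalidShapeError(
            "Shape is missing required key 'type': %s" % shape_model
        )


# The shape from the log: it has 'members' but no 'type' key.
broken_shape = OrderedDict(
    [("members", OrderedDict([("SourceRegion", OrderedDict([("shape", "String")]))]))]
)

# Hypothetical well-formed version of the same shape, for comparison only.
fixed_shape = OrderedDict([("type", "structure")] + list(broken_shape.items()))

print(resolve_shape_type(fixed_shape))
try:
    resolve_shape_type(broken_shape)
except InvalidShapeError as exc:
    print("failed:", exc)
```

So the problem looks like it is in the API model data being loaded, not in the arguments of the call.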

Jun 01 '21 02:06 dbrenecki

Hi @dbrenecki

Can you confirm whether the solution is being used to schedule RDS Aurora clusters? If yes, can you confirm that the RDS instances within a cluster are not tagged with the tag Schedule=[SCHEDULE_NAME]?

This error is similar to #240.

Jun 01 '21 02:06 gockle

Yes, it is enabled for RDS Aurora clusters; however, I don't believe we have any active Aurora clusters under the InstanceScheduler right now. I'd also note that the account ID referenced in the error has no RDS or EC2 instances at all.

On the previous occasion I got the error, the referenced account had scheduled instances with all the tags, except for one instance that was missing StoppedByInstanceScheduler=true. I've gone ahead and added that tag, but I don't believe it is the cause if the error also occurs in an account with no scheduled resources.

The issue you linked seems to be the same one I'm facing; however, I haven't noticed any tagged resources that are failing to stop or start. Also, in my CloudFormation template the InstanceScheduler Lambda has

S3Key: aws-instance-scheduler/v1.4.0/instance-scheduler.zip
S3Bucket: solutions-ap-southeast-2

Jun 01 '21 03:06 dbrenecki

Hi @dbrenecki If no RDS instances in the clusters have been tagged and this error is still being logged, it could be a different issue from #240. I will try to recreate it.

Jun 01 '21 13:06 gockle

Since upgrading to v1.4 two weeks ago, the error has only occurred twice, both times in the last few days and in different accounts. This morning it didn't occur. I wasn't able to find a pattern last time or this time, but the error generally occurs on the 15-minute mark, which is the interval we use to detect whether a resource is wrongly tagged. It also occurs around 3am local time, which is not near any of our turn on/off times.

So if I were to guess, the error may be related to the cron process part of InstanceScheduler that handles wrongly tagged resources, e.g. Key: "Schedule", Value: "Test1", where "Test1" is not an existing schedule defined in our DynamoDB table. We would generally get an alert for this every 15 minutes until it's fixed.
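
To make the "wrongly tagged" case concrete, here is a rough sketch of the kind of check I mean. It is not the solution's code: `find_unknown_schedule_tags`, the resource IDs, and the dictionary layout are all made up for illustration, and only the tag name "Schedule" and the schedule names come from our configuration.

```python
def find_unknown_schedule_tags(resources, known_schedules, tag_name="Schedule"):
    """Return (resource id, tag value) pairs whose schedule tag is not a known schedule."""
    unknown = []
    for resource in resources:
        value = resource.get("tags", {}).get(tag_name)
        # Untagged resources are ignored; only tags naming a missing schedule count.
        if value is not None and value not in known_schedules:
            unknown.append((resource["id"], value))
    return unknown


# Schedule names taken from the configuration in the log above.
known = {"InstanceScheduler-TestSchedule", "running", "stopped", "uk-office-hours"}

# Hypothetical resources; "Test1" is not a defined schedule.
resources = [
    {"id": "db-1", "tags": {"Schedule": "running"}},
    {"id": "db-2", "tags": {"Schedule": "Test1"}},
    {"id": "db-3", "tags": {}},
]

print(find_unknown_schedule_tags(resources, known))  # [('db-2', 'Test1')]
```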

Jun 01 '21 23:06 dbrenecki

Hi @dbrenecki Have you tried enabling detailed logging for the solution, i.e. setting "Enable CloudWatch Logs" to "Yes" for the stack? This will provide more detailed logs, which will help in diagnosing the issue.

If a schedule is missing from the DynamoDB table, the solution will log the following message: WARNING : Skipping instance RDS:[SCHEDULE-NAME] in region ap-southeast-2 for account xxxxxxxxxxxx, schedule name "[SCHEDULE-NAME]" is unknown
The solution will only throw the error when there is a tag on an instance within the cluster.
In the file main.py there is a method load_models(), which loads a different version of the boto APIs for the Lambda function to use. If this line is commented out or removed, the Lambda function uses the boto APIs available by default, and the error is no longer logged. We will include this change in our next release.
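
If you want to check which library version and location the function is actually importing, something like the following could be logged at startup. This is only a diagnostic sketch using the standard library; `describe_module` is a hypothetical helper and not part of the solution. In the traceback above, botocore is loaded from /var/runtime, which is the copy bundled with the Lambda runtime.

```python
import importlib
import importlib.util


def describe_module(name):
    """Report whether a top-level module is importable, and its version and file path."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"name": name, "found": False}
    module = importlib.import_module(name)
    return {
        "name": name,
        "found": True,
        # Not every module defines __version__, so fall back to "unknown".
        "version": getattr(module, "__version__", "unknown"),
        "path": spec.origin,
    }


# Inside the Lambda function this would show the botocore copy in use.
for module_name in ("botocore", "boto3"):
    print(describe_module(module_name))
```

Comparing this output before and after removing the load_models() call should show whether the installed library itself changes or only the API models it loads.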

Jun 03 '21 16:06 gockle

I've enabled detailed logging now. Just waiting to get the error again; it hasn't occurred since last week.

Jun 06 '21 23:06 dbrenecki

Hello, is there any update on this issue? I have the same error and none of the proposed solutions worked.

Sep 20 '22 09:09 CoudPelle
