No Notifications for Insufficient AWS Permissions in Taskcluster Worker Pool Creation
Describe the bug
When I create a new worker pool in Taskcluster, the UI shows no error or notification indicating whether the configuration was set up properly. For example, if the AWS provider's IAM role lacks the permissions needed to create EC2 instances, the UI gives no sign that anything failed. As a result, I had to go to the AWS console to look for created instances; there were none, so I checked CloudTrail and found the problem there. It would be much easier to see this kind of configuration error directly in Taskcluster instead of digging through AWS and guessing what happened.
To Reproduce
Steps to reproduce the behavior:
- Use an AWS user with insufficient permissions and a provider that uses that user.
- Go to 'Worker pools'.
- Click on 'Create worker pool'.
- Fill in all required fields and provide the launch configuration (an equivalent API call is sketched after these steps).
- Choose a provider with insufficient permissions.
- Click 'Save'.
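
For reference, the same pool can be created through the worker-manager API. This is a minimal sketch using the Python taskcluster client; the root URL, credentials, provider ID, pool ID, and the AWS launch-config values are all placeholders, and the exact config shape for the AWS provider is abbreviated here:

```python
# Sketch: creating the worker pool via the worker-manager API instead of the UI.
# All IDs, URLs, and config values are placeholders.
import taskcluster

worker_manager = taskcluster.WorkerManager({
    "rootUrl": "https://tc.example.com",  # assumed deployment root URL
    "credentials": {"clientId": "...", "accessToken": "..."},
})

payload = {
    "providerId": "aws-provider-with-limited-iam",  # provider backed by the restricted IAM user
    "description": "Pool used to reproduce the missing-error-notification issue",
    "owner": "owner@example.com",
    "emailOnError": True,
    "config": {
        "minCapacity": 0,
        "maxCapacity": 1,
        "launchConfigs": [
            {
                "region": "us-east-1",
                "capacityPerInstance": 1,
                "launchConfig": {
                    "ImageId": "ami-00000000000000000",  # placeholder AMI
                    "InstanceType": "t3.micro",
                },
                "workerConfig": {},
            }
        ],
    },
}

# The call itself succeeds even when the IAM role cannot actually launch instances;
# the failure only surfaces later, once worker-manager tries to provision.
worker_manager.createWorkerPool("my-provisioner/insufficient-perms-test", payload)
```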
Expected behavior
There should be an error notification if the configuration fails, or a success notification if it is set up correctly.
Taskcluster version v64.2.8
Platform (please complete the following information):
- OS: macOS
- Browser: Brave
When a worker pool is created, there are no validations AFAIK (besides the schema and provider checks).
I'm not sure it is possible to check, at creation time, whether every resource mentioned in the launch configurations is accessible before instances are actually requested by worker-manager. Such validation probably wouldn't help much anyway, since resources and permissions can change right after creation, so a configuration that was valid when the pool was defined might become invalid later.
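
For what it's worth, a partial up-front check of the launch permissions is possible with EC2's DryRun option, though it only covers the launch call itself and, as noted above, could become stale. This is not something worker-manager does today; a minimal sketch with boto3, where the region, AMI, and instance type are placeholders taken from a launch config:

```python
# Sketch of a dry-run permission check against EC2, independent of worker-manager.
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")

try:
    ec2.run_instances(
        ImageId="ami-00000000000000000",  # placeholder AMI from the launch config
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        DryRun=True,  # ask EC2 to validate permissions without launching anything
    )
except ClientError as e:
    code = e.response["Error"]["Code"]
    if code == "DryRunOperation":
        print("credentials are allowed to launch this instance")
    elif code == "UnauthorizedOperation":
        print("insufficient permissions:", e.response["Error"]["Message"])
    else:
        raise
```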
In the current implementation, the error only appears after worker-manager has attempted to launch the instances; if that fails, it should send a notification to the worker pool owner's email and record messages under Errors (i.e. the /worker-manager/errors page).
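
Those recorded errors can also be fetched through the API. A minimal sketch with the Python taskcluster client; the root URL and pool ID are placeholders, credentials may be required depending on the deployment, and the response field names are assumed from the worker-pool error list schema:

```python
# Sketch: fetching recorded provisioning errors for a pool from worker-manager.
import taskcluster

worker_manager = taskcluster.WorkerManager({"rootUrl": "https://tc.example.com"})

errors = worker_manager.listWorkerPoolErrors("my-provisioner/insufficient-perms-test")
for error in errors.get("workerPoolErrors", []):
    # 'reported', 'kind', and 'title' are assumed field names from the error schema
    print(error["reported"], error["kind"], error["title"])
```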
Can you please confirm that no errors were visible on that page either? Maybe there are places in the code where errors are silenced.
If you create a task for the worker pool immediately after creating it, does the error that worker-manager hits propagate to the Taskcluster UI? It should show up like this on the worker pool page: