pulumi-aws-native

Cannot delete role when aws built-in policy is attached

Open breathe opened this issue 2 years ago • 5 comments

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

error: operation DELETE failed with "GeneralServiceException": Cannot delete entity, must detach all policies first. (Service: Iam, Status Code: 409, Request ID: 919cb57f-8dad-4f24-bf0f-d56b88ce8e33, Extended Request ID: null)

Steps to reproduce

  1. Define a stack that includes a declaration of a role named 'test-role'
  2. Attach a built-in role policy to that role (example role policy: [AmazonEC2ContainerRegistryReadOnly])
  3. pulumi up to create 'test-role'
  4. Remove all logic associated with test-role from the codebase for the stack (so that pulumi will want to delete the role)
  5. pulumi up

Expected: The role is deleted.

Actual: The operation fails with: operation DELETE failed with "GeneralServiceException": Cannot delete entity, must detach all policies first.

breathe avatar Mar 29 '22 17:03 breathe

pulumi about Results:

CLI
Version      3.27.0
Go Version   go1.18
Go Compiler  gc

Plugins
NAME        VERSION
aws         4.36.0
aws-native  0.6.0
datadog     4.3.0
docker      3.1.0
gcp         5.20.0
python      unknown
random      4.3.1
tls         4.0.0
vault       4.6.0

Host
OS       darwin
Version  12.3

tusharshahrs avatar Mar 29 '22 20:03 tusharshahrs

This issue is easy to reproduce with the code below:

import json

import pulumi_aws_native as aws_native

# Role with three AWS managed policies attached; deleting it later fails.
ecs_instance_role = aws_native.iam.Role(
    'instance-role',
    assume_role_policy_document=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Sid": "",
            "Principal": {
                "Service": "ec2.amazonaws.com",
            },
        }],
    }),
    path='/',
    managed_policy_arns=[
        'arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore',
        'arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy',
        'arn:aws:iam::aws:policy/service-role/AWSCodeDeployRole',
    ],
)

The DELETE operation succeeds only after manually removing the three managed policy ARNs from the role.
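For reference, the manual cleanup described above could be scripted roughly as follows. This is only a sketch, not what Pulumi does: the function name `detach_all_and_delete_role` is made up for illustration, `iam` is assumed to behave like a boto3 IAM client (for example `boto3.client("iam")`), and pagination of the list calls is ignored for brevity.

```python
def detach_all_and_delete_role(iam, role_name):
    """Detach managed policies, delete inline policies, then delete the role.

    `iam` is assumed to behave like a boto3 IAM client. Pagination
    (IsTruncated/Marker) is not handled in this sketch.
    """
    detached = []

    # Managed policies (AWS built-in or customer managed) must be detached.
    attached = iam.list_attached_role_policies(RoleName=role_name)
    for policy in attached["AttachedPolicies"]:
        iam.detach_role_policy(RoleName=role_name, PolicyArn=policy["PolicyArn"])
        detached.append(policy["PolicyArn"])

    # Inline policies are embedded in the role and must be deleted.
    inline = iam.list_role_policies(RoleName=role_name)
    for policy_name in inline["PolicyNames"]:
        iam.delete_role_policy(RoleName=role_name, PolicyName=policy_name)

    # Only now will IAM accept the role deletion.
    iam.delete_role(RoleName=role_name)
    return detached
```

Note that this removes every policy on the role, including any that were attached outside of Pulumi, and a role that is still part of an instance profile would need to be removed from it first.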

aureq avatar Oct 20 '22 21:10 aureq

It would be good to understand AWS's rationale for this behaviour. Is this a bug on their side, or do they recommend explicitly deleting the policies first for another reason?

The workaround is to do an update of the role first to remove the policies before doing a delete.

I've raised a support ticket with AWS internally to get clarification on whether this is intended behaviour or an issue (ticket: 11194451371)

danielrbradley avatar Nov 04 '22 09:11 danielrbradley

Response from AWS:

To answer your query directly, the behaviour you are seeing when deleting the role from the CLI or API or using CloudControl API is expected and it is not any bug. Whenever you want to delete the role using the CLI or API you must need to delete the inline policies attached to the roles first and then you need to delete the role otherwise you will see the same error.

However, when you are trying to delete a role attached with inline policies in console then it is not required to delete the policies explicitly, DeleteRole API itself deletes all the attached policies as well. But in the CLI or API you must need to delete it manually refer to [1] in reference section.

This is because when you try to delete the role from the console and while deleting the role you could able to see all the policies attached to the role however that is not possible when you are doing from the CLI or API. Hence to identify or to see the policies which are attached to the role before deleting it we must need to manually check and delete those policies before you delete the role.

There is no additional risk or something to restrict the deletion of the role without deleting the attached policies for the above reason when using the CLI or API.

To conclude, the behaviour you are seeing is expected and it is not any bug in the service. When you delete the role from the console all the attached policies will also gets deleted but when you are doing it from the CLI or API you must need to delete the policies manually before deleting the role.

[1] https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage_delete.html#roles-managingrole-deleting-cli

Therefore, this is by design so won't be fixed on the CloudControl side. It appears that we would need to add special-cased behaviour for Role in the form of a pre-delete action.

In the same spirit of safety as the console or the CLI, we might only want to remove inline policies which are known about by Pulumi so that, if policies have been added manually, we won't silently lose them. If this causes a failure, then it could simply be resolved by performing a refresh. I don't believe the inline policy delete can be conditional, so we'd need to perform a read and compare to the known state. This has the risk of race conditions but would suffice as a best-effort safety check.
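As a rough illustration of that read-and-compare step (not an existing provider feature), the check could look something like the sketch below. The names `PolicyDriftError` and `policies_safe_to_remove` are hypothetical, and the two arguments stand in for the policies recorded in Pulumi state and the policies returned by a fresh read of the role.

```python
class PolicyDriftError(Exception):
    """Raised when the role has policies that Pulumi does not know about."""


def policies_safe_to_remove(known_policies, actual_policies):
    """Return the policies to remove before deleting the role.

    known_policies:  policies recorded in Pulumi state (hypothetical input).
    actual_policies: policies found by reading the role just before delete.
    Fails instead of silently removing anything added outside Pulumi.
    """
    known = set(known_policies)
    actual = set(actual_policies)

    unknown = actual - known
    if unknown:
        raise PolicyDriftError(
            "role has policies not tracked by Pulumi: "
            + ", ".join(sorted(unknown))
            + "; run a refresh and retry"
        )

    # Everything still on the role is known, so it is safe to remove.
    return sorted(actual)
```

Because the read and the subsequent removal are separate calls, a policy added in between would still be missed, which is the race condition mentioned above.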

Side thought: we could apply this approach to all resource deletes: always doing a read before a delete and failing if there's a change. Though this might not always be desired behaviour.

We should also consider if this is an issue which might also impact other resource types and could therefore require a more generic/extensible solution.

danielrbradley avatar Nov 07 '22 14:11 danielrbradley

> Side thought: we could apply this approach to all resource deletes: always doing a read before a delete and failing if there's a change. Though this might not always be desired behaviour.

I like that idea!

Along the same lines, should there be a sort of "force" or "recurse" option available that will just detach or purge anything that's attached to an existing resource?
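To make the two ideas concrete, here is a minimal sketch of a read-before-delete guard with an opt-in force flag. Everything in it is hypothetical: `delete_if_unchanged`, `ResourceChangedError`, and the `force` parameter are illustrative names, not part of Pulumi or the aws-native provider, and `read` and `delete` are placeholder callables for whatever the provider would actually call.

```python
class ResourceChangedError(Exception):
    """Raised when the live resource no longer matches the known state."""


def delete_if_unchanged(read, delete, expected_state, force=False):
    """Delete a resource only if it still matches the last known state.

    read:           callable returning the current live state (hypothetical).
    delete:         callable performing the delete (hypothetical).
    expected_state: state recorded at the last update or refresh.
    force:          skip the comparison and delete regardless.
    """
    if not force:
        current = read()
        if current != expected_state:
            raise ResourceChangedError(
                "resource changed since the last refresh; "
                "refresh and retry, or pass force=True"
            )
    delete()
```

Keeping the check as the default and making force explicit would preserve the safety behaviour discussed earlier while still allowing a deliberate purge.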

serverhorror avatar Jan 16 '23 13:01 serverhorror