dask-cloudprovider

Failure to remove log results in ResourceAlreadyExistsException

Open NicWayand opened this issue 4 years ago • 3 comments

When I try to spin up a new Fargate cluster following @rsignell-usgs's example, I have to manually delete the log group in CloudWatch first; otherwise I get this error:

ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateLogGroup operation: The specified log group already exists

I expected the log group to be cleaned up automatically.

Do I need to close out the cluster for this to happen?

NicWayand avatar Feb 28 '20 20:02 NicWayand

I think @jacobtomlinson told me there is a bug there. As a workaround you can add skip_cleanup=True, like:

cluster = FargateCluster(n_workers=1, image='rsignell/pangeo-worker:2020-01-23c',
                         skip_cleanup=True)

rsignell-usgs avatar Feb 28 '20 21:02 rsignell-usgs

Thanks @rsignell-usgs, but I still get the ResourceAlreadyExistsException even with skip_cleanup=True (I tried a few times).

NicWayand avatar Feb 28 '20 22:02 NicWayand

It sounds like there are two issues here:

  • If the log group already exists, an exception is raised. We should catch this and continue.
  • The log group is not being correctly cleaned up on shutdown.
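For the first point, catching the exception and continuing could look roughly like the sketch below. It is not the actual dask-cloudprovider code: `ensure_log_group` is a hypothetical helper name, and it assumes a boto3 CloudWatch Logs client (`boto3.client("logs")`) is passed in.

```python
def ensure_log_group(logs_client, name):
    """Create the CloudWatch log group `name` unless it already exists.

    `logs_client` is assumed to be a boto3 CloudWatch Logs client.
    Returns True if the group was created, False if it was already there.
    (Hypothetical helper, not part of dask-cloudprovider.)
    """
    try:
        logs_client.create_log_group(logGroupName=name)
    except logs_client.exceptions.ResourceAlreadyExistsException:
        # The group is left over from an earlier run: reuse it and carry on.
        return False
    return True


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   ensure_log_group(boto3.client("logs"), "my_log_group")
```

With something like this, a leftover log group from a previous cluster would be reused instead of aborting startup.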

A workaround in the meantime would be to set the cloudwatch_logs_group kwarg to the name of your existing log group.

cluster = FargateCluster(n_workers=1, cloudwatch_logs_group="my_log_group")

jacobtomlinson avatar Mar 16 '20 15:03 jacobtomlinson