data-safe-haven
Error when redeploying PostgreSQL Flexible Server
:white_check_mark: Checklist
- [x] I have searched open and closed issues for duplicates.
- [x] This is a problem observed when deploying a Data Safe Haven.
- [x] I can reproduce this with the latest version.
- [x] I have read through the documentation.
- [x] This isn't an open-ended question (open a discussion if it is).
:computer: System information
- Operating System: macOS
- Data Safe Haven version: develop @ 8e76a2f
:no_entry_sign: Describe the problem
If a PostgreSQL Flexible Server is destroyed (for example, during SRE teardown), it becomes impossible to create another PostgreSQL Flexible Server with the same name. As a result, an SRE cannot be torn down and then redeployed. See this Terraform thread for further discussion.
A possible fix would be to generate a reproducibly-random alphanumeric string and append it to the database server name.
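As a rough illustration of that idea, here is a minimal sketch that derives a lowercase alphanumeric suffix from a seed string and appends it to the server name. The helper names (`name_suffix`, `server_name`), the seed values, and the 63-character cap are assumptions made for this example, not code from the repository. Note that the seed would need to include something that changes on each fresh deployment; otherwise the redeployed server would get the same name again.

```python
import hashlib
import string

# Lowercase letters and digits only (assumed to be valid in server names).
ALPHABET = string.ascii_lowercase + string.digits


def name_suffix(seed: str, length: int = 8) -> str:
    """Derive a deterministic alphanumeric suffix from a seed string."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        # Peel off one base-36 digit of the hash per character.
        value, index = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)


def server_name(base_name: str, seed: str, max_length: int = 63) -> str:
    """Append the suffix to base_name, trimming the base if the result is too long.

    max_length=63 is an assumed limit for this sketch.
    """
    suffix = name_suffix(seed)
    trimmed = base_name[: max_length - len(suffix) - 1]
    return f"{trimmed}-{suffix}"


if __name__ == "__main__":
    # "deployment-1" is a hypothetical per-deployment seed value.
    print(server_name("shm-green-sre-apple-db-server-guacamole", "deployment-1"))
```

The same seed always yields the same name, so repeated updates of one deployment stay stable, while a different seed for a fresh deployment yields a different name.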
:deciduous_tree: Log messages
Relevant log messages
2024-04-16 13:19:00 [ ERROR] azure-native:dbforpostgresql:Server (sre_remote_desktop_db_guacamole_server): cli.py:99
2024-04-16 13:19:00 [ ERROR] error: resource partially created but read failed autorest/azure: Service returned an error. Status=404 Code="ResourceNotFound" Message="The requested resource of type 'Microsoft.DBforPostgreSQL/flexibleServers' with name 'shm-green-sre-apple-db-server-guacamole' was not found.": Code="ServerGroupDropping" Message="Operations on a server group in dropping state are not allowed." cli.py:99
2024-04-16 13:19:00 [ ERROR]
:recycle: To reproduce
Deploy an SRE, tear it down, then redeploy it.
Could use a Pulumi random string which persists in the state.
@craddm found this problem 1-2 weeks ago.
Is this still a problem? I haven't seen it recently.
It hasn't seemed to be a big problem lately. The last couple of times I've seen it happen, it has fixed itself very quickly. Probably okay to close this for now.
I've moved it to the next milestone so we can decide again then!