
RDS Reboot after changing DB Parameter Group

Open dweebo opened this issue 1 year ago • 4 comments

Describe the bug

We are using the rds-controller to create instances from snapshots and are setting a custom DB parameter group. Fairly regularly, the DB parameter group gets stuck in pending-reboot. In a test I ran yesterday, 3 out of 10 RDS instances ran into this issue.

Example status:

status:
...
  dbInstanceStatus: available
  conditions:
    - lastTransitionTime: '2023-11-29T15:10:53Z'
      message: Late initialization successful
      reason: Late initialization successful
      status: 'True'
      type: ACK.LateInitialized
    - lastTransitionTime: '2023-11-29T15:10:53Z'
      message: Resource synced successfully
      reason: ''
      status: 'True'
      type: ACK.ResourceSynced
...
  dbParameterGroups:
    - dbParameterGroupName: foo
      parameterApplyStatus: pending-reboot

The operator is no longer reconciling, and the RDS instance doesn't reboot automatically.

I contacted AWS support and part of their response was:

However, I can not see any API for reboot which should be a separate API as Applying changes immediately wont trigger a subsequent reboot as not all modification required a reboot to sync the changes.

Steps to reproduce

  • Create a DBInstance from a snapshot (dbSnapshotIdentifier) with a custom dbParameterGroupName.
  • Wait for the RDS instance to be available and for the DB parameter group to be set/applied.
  • Check DBInstance.status.dbParameterGroups.parameterApplyStatus, as well as the status on the AWS side.
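For the last check, a minimal Python sketch could look like the following. It assumes the DBInstance resource has already been loaded into a dict (for example from JSON output of kubectl) and that the status has the shape shown in the example above. The function name pending_reboot_groups is hypothetical and not part of ACK.

```python
from typing import List


def pending_reboot_groups(db_instance: dict) -> List[str]:
    """Return the names of parameter groups still waiting for a reboot.

    `db_instance` is assumed to be a DBInstance resource as a dict, with
    the status layout shown in the example status in this issue.
    """
    groups = db_instance.get("status", {}).get("dbParameterGroups") or []
    return [
        group.get("dbParameterGroupName", "")
        for group in groups
        if group.get("parameterApplyStatus") == "pending-reboot"
    ]


# Sample input mirroring the status reported in this issue.
sample = {
    "status": {
        "dbInstanceStatus": "available",
        "dbParameterGroups": [
            {"dbParameterGroupName": "foo", "parameterApplyStatus": "pending-reboot"}
        ],
    }
}
print(pending_reboot_groups(sample))  # prints ['foo']
```

An empty list means no parameter group on the resource reports pending-reboot.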

Expected outcome

Since the intent of ACK is to be declarative, I would expect that when I specify a custom dbParameterGroup, the operator makes sure it is applied and restarts the RDS instance if required.

Environment

  • Kubernetes version 1.25
  • Using EKS (yes/no), if so version? No
  • AWS service targeted (S3, RDS, etc.) RDS

dweebo, Nov 30 '23 18:11

"The operator is no longer reconciling, and the RDS instance doesn't reboot automatically."

The rds-controller is not supposed to reboot DBInstances, since this is more of a "data-plane" operation. We can definitely implement something if it makes sense here. Would you be available to join one of our community meetings to chat about your specific use case? https://github.com/aws-controllers-k8s/community#community-meeting

a-hilaly, Jan 19 '24 20:01

Just to add some context, as I stumbled on this recently. For static parameters, RDS allows the change to be reflected in the parameter group without being fully applied, and lets the user execute the reboot operation when desired. Making the reboot a responsibility of ACK doesn't seem like a good idea, as it goes against the model RDS follows. More info: https://repost.aws/knowledge-center/rds-parameter-group-update-issues
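For anyone who wants to trigger the reboot themselves outside of ACK, here is a rough Python sketch, not an official recommendation. It assumes a boto3 RDS client is passed in (e.g. boto3.client("rds")) and relies on its describe_db_instances and reboot_db_instance calls. The function name reboot_if_pending and the identifier used in the usage comment are hypothetical.

```python
def reboot_if_pending(rds_client, instance_id: str) -> bool:
    """Reboot the instance only if a parameter group is pending-reboot.

    `rds_client` is assumed to behave like boto3.client("rds").
    Returns True if a reboot was requested, False otherwise.
    """
    response = rds_client.describe_db_instances(DBInstanceIdentifier=instance_id)
    instance = response["DBInstances"][0]

    # Only act on instances that are not in the middle of another operation.
    if instance.get("DBInstanceStatus") != "available":
        return False

    pending = any(
        group.get("ParameterApplyStatus") == "pending-reboot"
        for group in instance.get("DBParameterGroups", [])
    )
    if not pending:
        return False

    rds_client.reboot_db_instance(DBInstanceIdentifier=instance_id)
    return True


# Usage (requires boto3 and AWS credentials; "my-db" is a placeholder):
#   import boto3
#   reboot_if_pending(boto3.client("rds"), "my-db")
```

Taking the client as a parameter keeps the decision about when to reboot with the user, in line with the RDS model described above, and makes the logic easy to exercise with a stub client.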

pcolazurdo, Jan 24 '24 10:01

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

ack-bot, Jul 22 '24 14:07