foreman_maintain
Fixes #36056 - Don't enable maintenance mode during offline backup
Remove maintenance mode enabling when doing offline or snapshot backups
Issues: #36056
I assume that maintenance mode was put into the procedure as a safety mechanism in case a service stop or start only completes partially, so that the application is not left in a partially up, inconsistent state. This has been in place since the original backup procedure was created.
What is the problem you are trying to solve?
The problem I'm trying to solve is making the offline backup routine less error-prone. Random errors while enabling or disabling sync-plans happen (for multiple reasons). When this happens while someone is taking a backup, it may:
- Simply abort the backup procedure (if it happens when entering maintenance mode), leaving the user without an expected backup
- Leave the sync-plan disabled after the backup is done (if it happens when enabling it)
I understand that it has been in place since forever, and for a while now I've been dealing with customers complaining about issues that result from sync-plans failing to be enabled and/or disabled during this procedure. The usual workaround is to whitelist sync-plans-enable and sync-plans-disable.
Just last week I worked with two different Satellite customers who were hit by it. You can have a look at this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2142673
I'm pretty sure there are more cases. The ones linked there are just the most recent ones and/or the ones I recalled.
I'd agree with having maintenance-mode enabled/disabled, but then could we have a maintenance-mode that doesn't touch sync-plans for the backup routine?
Do we know why sync plan disable and enable fail? Is it a known singular issue or a class of issues?
It's not a singular issue. In one example I saw, the request to disable the sync-plans took too long to respond, and foreman-maintain aborted with a timeout. No backup was taken, but the request to disable sync plans eventually finished (about 1 minute after foreman-maintain gave up). The customer ended up with no backup for that day and the sync-plan disabled.
Another case I remember was hitting this issue: https://access.redhat.com/solutions/5694161
I'm all for fixing the underlying issues, but my point is that the reward we get from disabling and enabling sync-plans for an offline backup is not worth it.
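The timeout case above can be reproduced in miniature. The following is a rough simulation only, not foreman-maintain code: the `sync_plan` hash, the thread standing in for the slow server-side request, and the timings are all made up for illustration. It shows how a caller that gives up on a slow "disable" request can end up with both outcomes at once: no backup, and a plan that is disabled anyway.

```ruby
require 'timeout'

# Hypothetical stand-in for server-side state; not a real Katello object.
sync_plan = { enabled: true }

# Simulated slow "disable sync plan" request that keeps running
# even after the caller has stopped waiting for it.
slow_disable = Thread.new do
  sleep 0.3
  sync_plan[:enabled] = false
end

backup_taken = false
begin
  # The caller only waits 0.1s, shorter than the request needs.
  Timeout.timeout(0.1) { slow_disable.join }
  backup_taken = true # only reached if the disable finished in time
rescue Timeout::Error
  # The caller aborts here: no backup, and no attempt to re-enable.
end

# The request still completes in the background.
slow_disable.join

puts "backup taken: #{backup_taken}"
puts "sync plan enabled: #{sync_plan[:enabled]}"
```

In this sketch the abort path never re-enables the plan, because from the caller's point of view the disable never succeeded.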
Are sync-plan enable/disable built into maintenance mode primarily for upgrades, rather than to solve a general-purpose problem?
Took me a while to get back to this, sorry about that.
I don't know the reasoning for having the sync-plans built into maintenance mode, but I agree it does make sense to have sync-plans disabled during upgrades. I don't want this to change.
What I want is to not enable maintenance-mode during offline backup, since all services are stopped during the process. Therefore, we don't need to disable sync-plans or reject connections using firewall rules. Services will be stopped, and requests from clients will already be rejected for lack of a service listening on the port. Sync-plans won't run because the service that could trigger them is also stopped.
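The point about clients being rejected once nothing listens on the port can be checked with a small, generic socket experiment. This is a minimal sketch using only Ruby's standard library; the local `TCPServer` is a stand-in for any stopped service and has nothing to do with the actual Foreman services or ports.

```ruby
require 'socket'

# Returns true if something accepts TCP connections on the given local port.
def reachable?(port)
  TCPSocket.new('127.0.0.1', port).close
  true
rescue Errno::ECONNREFUSED
  false
end

# Stand-in "service": listen on an ephemeral port chosen by the OS.
service = TCPServer.new('127.0.0.1', 0)
port = service.addr[1]

while_running = reachable?(port)

# "Stop the service": once the listening socket is closed,
# the OS refuses new connections without any firewall rule.
service.close
after_stop = reachable?(port)

puts "reachable while running: #{while_running}"
puts "reachable after stop: #{after_stop}"
```

With the listener closed, the connection attempt fails with `Errno::ECONNREFUSED`, which is the behaviour the comment relies on when arguing that firewall rules add nothing during an offline backup.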
@evgeni As you are deep into refactoring backup/restore, what are your thoughts on whether we should do this or close it?
I was looking at this the other day and I think it makes sense
One rather interesting thing: I think today we disable sync plans before the offline backup, but never enable them after the restore :see_no_evil: (partially because we do not back up the foreman-maintain data, which contains the list of disabled plans)
[root@centos8-stream-katello-nightly ~]# hammer sync-plan create --name test --organization "Default Organization" --enabled true --sync-date "2024-06-15" --interval "daily"
Sync plan created.
[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1 | test | 2024/06/15 00:00:00 | daily | yes | | 4
---|------|---------------------|----------|---------|-----------------|-------------------
[root@centos8-stream-katello-nightly ~]# foreman-maintain backup offline /var/tmp/off
**** BACKUP Complete, contents can be found in: /var/tmp/off/katello-backup-2024-06-15-09-28-48 ****
[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1 | test | 2024/06/15 00:00:00 | daily | yes | | 4
---|------|---------------------|----------|---------|-----------------|-------------------
[root@centos8-stream-katello-nightly ~]# foreman-maintain restore /var/tmp/off/katello-backup-2024-06-15-09-28-48/
[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1 | test | 2024/06/15 00:00:00 | daily | no | | 4
---|------|---------------------|----------|---------|-----------------|-------------------