foreman_maintain

Fixes #36056 - Don't enable maintenance mode during offline backup

Open jpasqualetto opened this issue 2 years ago • 7 comments

Remove maintenance mode enabling when doing offline or snapshot backups

jpasqualetto avatar Feb 08 '23 17:02 jpasqualetto

Issues: #36056

theforeman-bot avatar Feb 08 '23 17:02 theforeman-bot

I assume that maintenance mode was put into the procedure as a safety mechanism: if a service stop or start only partially completes, it prevents the application from being left in a partially up, inconsistent state. It has been in place since the original backup procedure was created.

What is the problem you are trying to solve?

ehelms avatar Feb 16 '23 02:02 ehelms

> I assume that maintenance mode was put into the procedure as a safety mechanism: if a service stop or start only partially completes, it prevents the application from being left in a partially up, inconsistent state. It has been in place since the original backup procedure was created. What is the problem you are trying to solve?

The problem I'm trying to solve is making the offline backup routine less error-prone. Random errors while enabling or disabling sync-plans happen (for multiple reasons). When one happens while someone is taking a backup, it may:

  1. Simply abort the backup procedure (if it happens when entering maintenance mode), leaving the user without an expected backup
  2. Leave the sync-plan disabled after the backup is done (if it happens when enabling it)

I understand that it has been in place forever, but for a while now I've been dealing with customers complaining about issues that are consequences of sync-plans failing to be enabled and/or disabled during this procedure. The usual workaround is to whitelist sync-plans-enable and sync-plans-disable.
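For readers who have not used the workaround, here is a minimal sketch of what it looks like. It assumes that foreman-maintain's `--whitelist` option takes a comma-separated list of step labels to skip; the `build_backup_cmd` helper and the `/var/tmp/off` destination are only illustrative. It prints the command for review rather than running it.

```shell
#!/bin/sh
# Sketch only: build an offline backup command that skips the two
# sync-plan steps. Assumes --whitelist accepts comma-separated step labels.
# build_backup_cmd is a hypothetical helper, not part of foreman-maintain.
build_backup_cmd() {
    backup_dir="$1"
    printf 'foreman-maintain backup offline --whitelist="%s" %s\n' \
        "sync-plans-enable,sync-plans-disable" "$backup_dir"
}

# Print the command for review instead of executing it.
build_backup_cmd /var/tmp/off
```

Skipping both steps means the backup neither fails on a sync-plan error nor leaves plans disabled afterwards, at the cost of diverging from the default procedure.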

Just last week I worked with 2 different Satellite customers who were hit by it. You can have a look at this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2142673

There are more cases, I'm pretty sure. The ones linked there are just the most recent ones and/or the ones I could recall.

I'd agree with keeping maintenance mode enabled/disabled, but then could we have a maintenance mode that does not touch sync-plans for the backup routine?

jpasqualetto avatar Feb 16 '23 13:02 jpasqualetto

Do we know why sync plan disable and enable fail? Is it a known singular issue or a class of issues?

ehelms avatar Feb 16 '23 16:02 ehelms

It's not a singular issue. In one example I saw, the request to disable took too long to respond, so foreman-maintain aborted with a timeout. No backup was taken, but the request to disable sync plans eventually finished (about 1 minute after foreman-maintain gave up). The customer ended up with no backup for that day and the sync-plan disabled.

Another case I remember was hitting this issue: https://access.redhat.com/solutions/5694161

I'm all for fixing the underlying issues, but my point is that the reward we get from disabling and enabling sync-plans for an offline backup is not worth it.

jpasqualetto avatar Feb 16 '23 16:02 jpasqualetto

Are sync-plan enable / disable built into maintenance mode primarily for upgrades, rather than to solve a general-purpose problem?

ehelms avatar Feb 16 '23 16:02 ehelms

Took me a while to get back to this, sorry about that.

I don't know the reasoning for having sync-plans built into maintenance mode, but I agree it does make sense to have sync-plans disabled during upgrades. I don't want this to change.

What I want is to not enable maintenance mode during an offline backup, since all services are stopped during the process. Therefore, we don't need to disable sync-plans or reject connections using firewall rules. Services will be stopped and requests from clients will already be rejected for lack of a service listening on the port. Sync-plans won't run because the service that could trigger them is also stopped.
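The "no service listening on the port" argument can be checked by hand. The following is a rough sketch under stated assumptions: it parses `ss -ltn` output read from stdin, `port_is_listening` is a hypothetical helper, and port 443 is assumed to be the HTTPS port clients talk to.

```shell
#!/bin/sh
# Sketch only: succeed (exit 0) if the `ss -ltn` output on stdin shows a
# listener on the given TCP port, fail (exit 1) otherwise.
# port_is_listening is a hypothetical helper, not part of foreman-maintain.
port_is_listening() {
    port="$1"
    awk -v port="$port" '
        # Column 1 is the socket state, column 4 is "Local Address:Port".
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }'
}

# Example usage while services are stopped for the offline backup
# (443 is an assumption about the client-facing port):
#   ss -ltn | port_is_listening 443 || echo "nothing is listening on 443"
```

If the check fails while services are stopped, client requests are refused at the socket level, which is the effect the firewall rules of maintenance mode would otherwise provide.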

jpasqualetto avatar Aug 02 '23 20:08 jpasqualetto

@evgeni As you are deep into refactoring backup/restore, what are your thoughts on whether we should do this or close it?

ehelms avatar Jun 13 '24 18:06 ehelms

I was looking at this the other day and I think it makes sense

evgeni avatar Jun 13 '24 18:06 evgeni

One rather interesting thing: I think today we disable sync plans before the offline backup, but never re-enable them after the restore :see_no_evil: (partially because we do not back up the foreman-maintain data, which contains the list of disabled plans)

[root@centos8-stream-katello-nightly ~]# hammer sync-plan create --name test --organization "Default Organization" --enabled true --sync-date "2024-06-15" --interval "daily"
Sync plan created.
[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE          | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1  | test | 2024/06/15 00:00:00 | daily    | yes     |                 | 4                 
---|------|---------------------|----------|---------|-----------------|-------------------


[root@centos8-stream-katello-nightly ~]# foreman-maintain backup offline /var/tmp/off
**** BACKUP Complete, contents can be found in: /var/tmp/off/katello-backup-2024-06-15-09-28-48 ****

[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE          | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1  | test | 2024/06/15 00:00:00 | daily    | yes     |                 | 4                 
---|------|---------------------|----------|---------|-----------------|-------------------

[root@centos8-stream-katello-nightly ~]# foreman-maintain restore /var/tmp/off/katello-backup-2024-06-15-09-28-48/

[root@centos8-stream-katello-nightly ~]# hammer sync-plan list --organization "Default Organization"
---|------|---------------------|----------|---------|-----------------|-------------------
ID | NAME | START DATE          | INTERVAL | ENABLED | CRON EXPRESSION | RECURRING LOGIC ID
---|------|---------------------|----------|---------|-----------------|-------------------
1  | test | 2024/06/15 00:00:00 | daily    | no      |                 | 4                 
---|------|---------------------|----------|---------|-----------------|-------------------
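To spot this after a restore without reading the table by eye, the listing can be filtered for plans whose ENABLED column is `no`. This is a minimal sketch that assumes the column layout shown in the transcript above (NAME second, ENABLED fifth); `disabled_sync_plans` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: read `hammer sync-plan list` table output on stdin and print
# the NAME of every plan whose ENABLED column is "no".
# Assumes the layout from the transcript: NAME is field 2, ENABLED is field 5.
# disabled_sync_plans is a hypothetical helper, not part of hammer.
disabled_sync_plans() {
    awk -F'|' '
        NF >= 5 {
            name = $2
            enabled = $5
            # Trim the padding around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", enabled)
            # Header and separator rows never equal "no", so they are skipped.
            if (enabled == "no") print name
        }'
}

# Example usage:
#   hammer sync-plan list --organization "Default Organization" | disabled_sync_plans
```

Running the filter before the backup and again after the restore, then comparing the two lists, would show which plans were left disabled by the procedure.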

evgeni avatar Jun 15 '24 08:06 evgeni