Implement backup persistent storage

Open ricardozanini opened this issue 5 years ago • 5 comments

Nexus has a built-in backup capability. It would be interesting to have this feature supported by the operator as well, by providing persistent storage for it. See: https://help.sonatype.com/repomanager3/backup-and-restore/configure-and-run-the-backup-task

Proposal

The Nexus CRD interface would have a switch to turn backup on/off. If "on", the operator would create a PVC for it, and call the internal Nexus API to create this task for the admin, setting the backup path to the volume mount.

Also, the "notification" e-mail should be added to the interface: an adminEmail attribute on the spec, and a backup structure with a notificationEmail attribute in it. If notificationEmail is empty, we would fall back to adminEmail.

Structure suggestion:

apiVersion: apps.m88i.io/v1alpha1
kind: Nexus
metadata:
  name: nexus3
spec:
  (...)
  adminEmail: [email protected]
  backup:
    enabled: true
    notificationEmail: [email protected]
    # ideally greater than the one set for the service
    volumeSize: 10Gi
  (...)
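
The fallback rule for the notification e-mail could be sketched in Go roughly as follows. This is only an illustration under assumptions: the type names `NexusSpec` and `BackupSpec`, the helper `EffectiveNotificationEmail`, and the example addresses are hypothetical and not taken from the operator's code; only the field names come from the YAML above.

```go
package main

import "fmt"

// BackupSpec mirrors the proposed `backup` structure.
// Hypothetical type: the real CRD types may look different.
type BackupSpec struct {
	Enabled           bool   `json:"enabled"`
	NotificationEmail string `json:"notificationEmail,omitempty"`
	VolumeSize        string `json:"volumeSize,omitempty"`
}

// NexusSpec holds only the fields relevant to this proposal (hypothetical).
type NexusSpec struct {
	AdminEmail string     `json:"adminEmail,omitempty"`
	Backup     BackupSpec `json:"backup,omitempty"`
}

// EffectiveNotificationEmail returns the backup notification address,
// falling back to the admin address when none is set on the backup.
func EffectiveNotificationEmail(spec NexusSpec) string {
	if spec.Backup.NotificationEmail != "" {
		return spec.Backup.NotificationEmail
	}
	return spec.AdminEmail
}

func main() {
	// Placeholder address, used for illustration only.
	spec := NexusSpec{
		AdminEmail: "admin@example.com",
		Backup:     BackupSpec{Enabled: true, VolumeSize: "10Gi"},
	}
	fmt.Println(EffectiveNotificationEmail(spec))
}
```

Keeping the fallback in one helper means the reconcile code and the code that creates the Nexus task would agree on which address is used.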

ricardozanini avatar May 13 '20 11:05 ricardozanini

That would be great, nice idea! I'm a bit worried about how we would report anything going wrong with this task. The pods would be in running state even if this failed and I assume we'd throw an event via the reconcile loop if there was an error, but it can take the admin a minute or two to realize they need to check the Nexus resource status even if all pods are running and healthy.

Not that I don't think that's reasonable, but Operators are a new framework, and this may confuse users who are not familiar with it. What do you think? Do you also see this as a potential problem? Any ideas to solve it?

We could put a warning in the documentation, but maybe you know of some mechanism I don't :crossed_fingers: :grin:

LCaparelli avatar May 13 '20 12:05 LCaparelli

If what fails? The API calls to create the backup task? That would be an event in the console. But for the operator side it's just an API call to the nexus3:8081 internal service. If that fails, we inform the admin in the event message and life goes on. Since it will be a goroutine (the calls won't be made during the reconcile loop), we would try again until the task is created. Once created, the admin/user would see its status at the Nexus Web Console. 👯

ricardozanini avatar May 13 '20 13:05 ricardozanini

> If what fails? The API calls to create the backup task?

Yup.

> If that fails, we inform the admin in the event message and life goes on.

Yeah, this is the part I am concerned with. Assuming the Kubernetes console is similar to Minikube's dashboard, we wouldn't see this event popping up, as the dashboard doesn't display the status of custom resources such as the Nexus CR. It would show an all-green overview for deployments, pods and replicasets, but there wouldn't be anything hinting at an issue with the backup activity. The user could assume all is fine and only realize there are no backups when it's too late.

I don't really see anything we can do other than what you're saying and documenting a friendly reminder on the section describing this feature along the lines of "Hey! Remember to check the state of your Nexus CR to make sure nothing went wrong when setting up your backup activity". What do you think?

LCaparelli avatar May 13 '20 14:05 LCaparelli

I think there are monitoring applications out there that watch the "Events" API, which an admin probably will/should look at. There's not much for us to do other than documenting, alerting and logging the problem. "All green" is correct since our pod would be correctly deployed. =D
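
To illustrate what such a monitoring tool would look for, here is a small sketch that filters a list of events down to warnings attached to a given Nexus resource. The `Event` struct is a trimmed-down, hypothetical stand-in for a Kubernetes Event (a real tool would use the Kubernetes client types), and the reason `BackupTaskFailed` is invented for the example.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a Kubernetes Event.
type Event struct {
	Type    string // "Normal" or "Warning"
	Reason  string
	Message string
	Kind    string // kind of the involved object
	Name    string // name of the involved object
}

// nexusWarnings keeps only Warning events attached to the named Nexus resource.
func nexusWarnings(events []Event, nexusName string) []Event {
	var out []Event
	for _, e := range events {
		if e.Type == "Warning" && e.Kind == "Nexus" && e.Name == nexusName {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	events := []Event{
		{Type: "Normal", Reason: "Created", Message: "Created pod", Kind: "Pod", Name: "nexus3-abc"},
		// "BackupTaskFailed" is a made-up reason for illustration.
		{Type: "Warning", Reason: "BackupTaskFailed", Message: "could not create backup task", Kind: "Nexus", Name: "nexus3"},
	}
	for _, e := range nexusWarnings(events, "nexus3") {
		fmt.Printf("%s %s/%s: %s\n", e.Reason, e.Kind, e.Name, e.Message)
	}
}
```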

ricardozanini avatar May 14 '20 11:05 ricardozanini

Depends on #5

LCaparelli avatar May 26 '20 19:05 LCaparelli