charts icon indicating copy to clipboard operation
charts copied to clipboard

feat(cluster): Add support for replication and import, volumeSnapshot recovery

Open dragoangel opened this issue 1 year ago • 6 comments

This PR adds functionality for:

Breaking changes:

  • removes .Values.backups.enabled and set .Values.backups.objectStorage.providers to "". Expect from users to set desired provider and configure it under .Values.backups.objectStorage.providersSettings, otherwise cloud backups will not work. Settings related to objectStorage was moved from .Values.backups under .Values.backups.objectStorage so they are not tangled with volumeSnapshot settings
  • changes how recovery is working. As we have a lot of recovery methods, user should choose first method and configure it under methodSettings.<chosen-method>.

Also this PR provides other functional improvements:

  • adds verifications checks to cover all mandatory logic. It will not allow user to deploy not fully configured backup, or enable restore mode without properly setting restore method and/or it's parameters (each method has own mandatory and options settings)
  • thoughtful division of many options and nested-dependent functions, which will make it possible to conveniently expand the existing functionality if necessary and not cause confusion. For example: we have a new method of data backups in CloudNative-PG CR, all we need to maintain it is to add it to the .Values.backups.<chosen-method>, describe the necessary checks in _helper.tpl and add a renderer in CloudNative-PG CR already reliably checked variables. Then we usually will need restore from this method, so we will add .Values.recover.methodSettings.<chosen-method> and this will automatically extend options for .Values.recover.method, as it not hardcoded, but checked dynamically from .Values.recover.methodSettings. We will also need to add necessary checks in _helper.tpl and add a renderer in CloudNative-PG CR for restore section.

dragoangel avatar Aug 26 '24 22:08 dragoangel

@itay-grudev I also have strong feeling that better put all cloud related settings under dedicated level, so everything starting from provider (224 line) and ending data.jobs (279 line) will be placed under objectStorage section. What do you think?

From my view it will simplify reading values file 😊 and will give understanding which settings belong to which logic.

Also have a question: rotation applies to snapshots as well, right? And if yes - then if we choose snapshotOwnerReference as none or cluster it will not work? If this the case I think better not user change it by hands.

dragoangel avatar Aug 27 '24 00:08 dragoangel

This PR needs

  • update to values in README.md
  • an example values file for backup and restore

Hi @cam-at-tactiq , yes I know, yesterday I was created couple of functional PRs and almost in every of them need to do it, thanks for heads up. I will add them today, yesterday was too much exchasted.

For this PR also I need a bit feedback on questions I commented :)

If we will decide move inside sub path of objectStorage I will also need to change all examples that use them and tests 🙂

One minus of my PR is breaking change, I think it will be merged not now, as it would require major version bump which @itay-grudev would like not to do right now.

The good thing about charts that you can maintain your own fork without even hosting own helm registry by using a dependency file://../cloudnative-pg-charts/charts/cluster

dragoangel avatar Aug 27 '24 06:08 dragoangel

@itay-grudev I also have strong feeling that better put all cloud related settings under dedicated level, so everything starting from provider (224 line) and ending data.jobs (279 line) will be placed under objectStorage section. What do you think?

I decided to move everything under dedicated section, it greatly improves readability, plus having providerSettings section allows to dynamically check which provider is supported. I will try align recovery to same approach.

dragoangel avatar Aug 27 '24 10:08 dragoangel

Hi @cam-at-tactiq, if you have some time - could you please test my PR? From my tests everything works good, but second-third view wouldn't be out of place. Thank you in advance.

Hi @itay-grudev, I not done tests via chainsaw stuff, they require some knowledge and secrets about infra it will be run on. If you not mind can you please add them? I provided all necessary examples, so I hope there should be no issues.

If you have any questions or propositions about this PR, do not hesitate ping me in Slack 😉. I understand it is big, and will require quite time to review, and yes it is BREAKING CHANGE

dragoangel avatar Aug 31 '24 01:08 dragoangel

Looks like this issue is stale for more than 2months now. I've been waiting for the external database support so I can start migrating old databases to CNPG.

What seems to be blocking the merge?

rc-networks avatar Nov 15 '24 08:11 rc-networks

No idea tbh, we have two decent PRs that handle this issue but neither have gained traction with maintainers :/

cam-at-tactiq avatar Nov 17 '24 03:11 cam-at-tactiq