Feature/ timeline id in clone section for Spilo

Open Sudeepta92 opened this issue 2 years ago • 5 comments

Modify the Spilo image to process the timeline ID in the clone section.

Sudeepta92 avatar Jul 19 '22 08:07 Sudeepta92

@Sudeepta92 let's step back: what problem are you trying to solve?

The thing is that recovery_target_timeline has only a very narrow use: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-TIMELINES

> The default behavior of recovery is to recover to the latest timeline found in the archive. If you wish to recover to the timeline that was current when the base backup was taken or into a specific child timeline (that is, you want to return to some state that was itself generated after a recovery attempt), you need to specify current or the target timeline ID in recovery_target_timeline. You cannot recover into timelines that branched off earlier than the base backup.
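
As a rough illustration of the three kinds of value mentioned in that passage (`latest`, `current`, or a numeric timeline ID), here is a minimal Python sketch that validates a target and formats the corresponding configuration line. The helper name `render_recovery_target_timeline` is hypothetical; it is not part of Spilo, Patroni, or PostgreSQL, and it only accepts decimal timeline IDs.

```python
def render_recovery_target_timeline(target="latest"):
    """Return a configuration line for recovery_target_timeline.

    Hypothetical helper for illustration only.
    """
    value = str(target).strip().lower()
    if value not in ("latest", "current"):
        # Anything else is expected to be a positive decimal timeline ID.
        if not value.isdecimal() or int(value) < 1:
            raise ValueError(f"invalid timeline target: {target!r}")
        value = str(int(value))
    return f"recovery_target_timeline = '{value}'"


print(render_recovery_target_timeline())           # follow the newest timeline
print(render_recovery_target_timeline("current"))  # stay on the base backup's timeline
print(render_recovery_target_timeline(3))          # recover into timeline 3
```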

This is exactly what postgres-operator avoids: every time you deploy a new cluster it gets a new and unique archive location. That is, we never have a situation where two recovery attempts are performed against the same archive location, which makes this feature effectively useless...

CyberDem0n avatar Aug 03 '22 12:08 CyberDem0n

> This is exactly what postgres-operator avoids: every time you deploy a new cluster it gets a new and unique archive location. That is, we never have a situation where two recovery attempts are performed against the same archive location, which makes this feature effectively useless...

@CyberDem0n I'm not sure I understand you correctly. It's not always true that a unique archive location is created for a database, for two reasons. First, you can specify that there should be no prefix and suffix within the archive location path (S3), which leads to a location that holds only the cluster name and version and no UID. Second, you can do an "in place restore" (described here). This leads to a new cluster UID, but in the case above, where the UID is not part of the archive location, the restored cluster still ends up using the same archive location.
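
To make the two cases concrete, here is a small Python sketch. The path template, the function name `archive_location`, the bucket and cluster names, and the UIDs are all made up for illustration; the real layout used by Spilo and postgres-operator may differ.

```python
def archive_location(bucket, cluster, pg_version, prefix="", suffix=""):
    """Build an S3 archive path (hypothetical layout, for illustration only)."""
    return f"s3://{bucket}/spilo/{prefix}{cluster}{suffix}/wal/{pg_version}"


# Two incarnations of the same cluster name, before and after an in-place
# restore. The UIDs below are invented placeholders.
old_uid = "11111111-aaaa"
new_uid = "22222222-bbbb"

# With the UID as a suffix, each incarnation gets its own location.
with_uid_old = archive_location("backups", "mydb", 14, suffix=f"/{old_uid}")
with_uid_new = archive_location("backups", "mydb", 14, suffix=f"/{new_uid}")
print(with_uid_old != with_uid_new)  # True

# Without prefix and suffix, both incarnations share one location.
no_uid_old = archive_location("backups", "mydb", 14)
no_uid_new = archive_location("backups", "mydb", 14)
print(no_uid_old == no_uid_new)  # True
```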

In the Operator repository there is also an open PR that adds the needed configuration to the postgresql manifest, and they seem willing to implement this (see here).

Having UIDs in the backup location path makes it nearly impossible to automate a restore process and make backups selectable through some kind of web interface, because you probably will not have a history of all the UIDs a database has had in the past. I think this is why one would disable the prefix and suffix.

Just my thoughts.

Philip

thedatabaseme avatar Aug 30 '22 10:08 thedatabaseme

> you can specify that there should be no prefix and suffix within the archive location path (S3), which leads to a location which only holds the cluster name and version and no UID.

Well, if you shoot your own foot, it hurts badly. Just don't do it. Two different clusters must not be writing to the same archive location.

CyberDem0n avatar Aug 30 '22 10:08 CyberDem0n

> you can specify that there should be no prefix and suffix within the archive location path (S3), which leads to a location which only holds the cluster name and version and no UID.
>
> Well, if you shoot your own foot, it hurts badly. Just don't do it. Two different clusters must not be writing to the same archive location.

They are not different clusters, just a restored one that takes the place of the original (in-place restore). Second, I don't understand why you use an offensive term like "shoot in the foot". This is the concept of timelines in Postgres, not a shady workaround.

Philip

thedatabaseme avatar Aug 30 '22 11:08 thedatabaseme

If the old cluster is still up and running and the new cluster is forked from the old one (with the new timeline), they are two different clusters. They just have two things in common:

  1. the cluster system identifier
  2. the ancestor

If you are unlucky enough, both clusters will end up on the same timeline and comparable LSNs, and there will be clashes in WAL file names.
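
The clash can be illustrated with a short Python sketch of how PostgreSQL names WAL segment files: eight hex digits of timeline ID followed by sixteen hex digits identifying the segment. The sketch assumes the default 16 MB segment size; `parse_lsn` and `wal_file_name` are illustrative helpers, not PostgreSQL functions (the server-side equivalent is `pg_walfile_name`).

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # PostgreSQL default, assumed here


def parse_lsn(text):
    """Convert an LSN written as 'X/Y' (two hex numbers) to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Return the 24-hex-digit name of the WAL segment containing lsn."""
    segments_per_xlog_id = 0x100000000 // segment_size
    segment_number = lsn // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_xlog_id,
        segment_number % segments_per_xlog_id,
    )


# Two clusters that both ended up on timeline 2 with comparable LSNs.
cluster_a = wal_file_name(2, parse_lsn("0/3000060"))
cluster_b = wal_file_name(2, parse_lsn("0/3A00000"))
print(cluster_a)               # 000000020000000000000003
print(cluster_a == cluster_b)  # True: both would archive under the same name
```

Because the file name carries only the timeline and the segment position, nothing in it distinguishes which cluster produced the segment, so a shared archive location cannot keep the two histories apart.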

Even if the old cluster is down, there will be a mess in the archive after PITR.

I don't understand what is offensive about "shoot your own foot". It is a well-known idiom, and in this context it means that we shouldn't create problems in the first place only to spend significant effort solving them later.

CyberDem0n avatar Aug 30 '22 12:08 CyberDem0n