Add target snapshot deletion.

mat813 opened this issue 4 years ago • 28 comments

This is the feature I was talking about in #488 and #477.

I am open to suggestions on the name of the option. (And on the code)

mat813 avatar Mar 13 '20 13:03 mat813

@mat813

can we change the option and description to something like:

 --delete-target-snasphots
    With this argument, snapshots that are missing on the source will be destroyed on the target. Use this if you only want to handle snapshots on the source.
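
For illustration, a pull-style run with the final spelling of the option might look like this (host and dataset names here are hypothetical):

    syncoid --recursive --delete-target-snapshots root@source:tank/data backup/tank/data

Snapshots that exist under backup/tank/data but are gone from tank/data would then be destroyed on the target.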

But it's up to you.

phreaker0 avatar May 01 '20 14:05 phreaker0

Fine by me.

mat813 avatar May 11 '20 14:05 mat813

  • Fix line 846: 'delete-target-snaspshots' should be 'delete-target-snapshots'
  • Fix line 1920: same typo
  • Fix line 217
  • Fix line 28

tonymmm1 avatar May 13 '20 23:05 tonymmm1

Oops, sorry for the typo.

mat813 avatar May 15 '20 15:05 mat813

Great! This is exactly what I was looking for (use case: offshore backup of a ZFS pool that uses sanoid for snapshots).

Thanks a lot for the work 👍

kevinvalk avatar Jun 17 '20 15:06 kevinvalk

@mat813, I just tried it out on my mirrored ZFS volume, which had accrued MANY snapshots that were purged a long time ago. However, during the sync process I got the following error (exactly 3 times, going from 5262 snapshots to 153):

could not find any snapshots to destroy; check snapshot names.

I think this is similar to https://github.com/jimsalterjrs/sanoid/issues/9, as I am also using the --recursive flag. I am not sure what the best way to handle this is. I guess it is "fine" that a snapshot cannot be destroyed on the other end when it does not exist; if it is gone, great, right?! Hence the question: could we just ignore such errors gracefully?
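
For reference, that message is what zfs itself prints when asked to destroy a snapshot that does not exist, e.g. (dataset name hypothetical):

    # zfs destroy backup/tank/data@already_gone
    could not find any snapshots to destroy; check snapshot names.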

kevinvalk avatar Aug 11 '20 22:08 kevinvalk

Mmmm, I do not think it is because of the --recursive flag; my code only deletes a single snapshot at a time, without doing recursive snapshot deletion (i.e., the -r flag is not used; in fact, no flags are used when deleting snapshots). Also, I, too, use the --recursive syncoid flag.
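
In other words, the deletion code runs one plain command per snapshot, something like (dataset and snapshot names hypothetical):

    zfs destroy backup/tank/data@autosnap_2020-08-01_00:00:00_daily

and never zfs destroy -r, which would also walk down into child datasets.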

I have not experienced this message yet, so I am unsure why it would happen.

mat813 avatar Aug 17 '20 14:08 mat813

Ok, reading syncoid's code, from what I can gather, this could happen if there was an obsolete syncoid_hostname_* snapshot that had not been removed for some reason.

Maybe, by chance, the sync was interrupted after the send|receive had finished but before the obsolete sync snapshots were removed.

I am adding >/dev/null 2>&1 to the commands to shut them up.
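
So the destroy calls effectively become (snapshot name hypothetical):

    zfs destroy backup/tank/data@syncoid_myhost_2020-08-17:10:00:00 >/dev/null 2>&1

and a failed destroy of an already-missing snapshot no longer produces any output.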

mat813 avatar Aug 17 '20 14:08 mat813

I am wondering whether I should remove the redirection that silences zfs destroy, now that there is a warning telling you which command failed.

mat813 avatar Aug 18 '20 12:08 mat813

Not sure if this helps your decision.

I have a cronjob running that synchronizes one ZFS pool to an offshore ZFS pool (hence the need to remove old snapshots, etc.). This is just insurance in case the original computer that hosts the ZFS pool is gone. So I run the job with the --quiet flag, as I only want to hear about errors and things that are actually going wrong through cron alerts. However, as soon as syncoid prints something like "already destroyed", I get the alert. Hence my original request :-)

It should take the --quiet flag into account. If the flag is present, it should print NO output about already-removed snapshots (per the help text: --quiet | Suppress non-error output). If the flag is not present, it should try to be helpful and give you insight into the warning (probably by showing the zfs destroy error output).
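
To illustrate the cron angle: cron mails the owner whenever a job produces any output at all, so anything syncoid prints on stdout becomes an alert. A typical crontab entry would be (schedule, hosts, and datasets hypothetical):

    0 3 * * * syncoid --recursive --quiet --delete-target-snapshots root@nas:tank/data backup/tank/data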

kevinvalk avatar Aug 18 '20 12:08 kevinvalk

Hey @mat813, I am getting a syntax error when I try to run syncoid (with/without any args) with this patch.

Error msg:

syntax error at /usr/sbin/syncoid line 852, near ""$targetsudocmd $zfscmd destroy $targetfsescaped@$snap failed: $?";"
Type of arg 1 to Capture::Tiny::capture must be block or sub {} (not reference constructor) at /usr/sbin/syncoid line 1175, near "};"
Execution of /usr/sbin/syncoid aborted due to compilation errors.

System: Ubuntu 20.04 with Perl v5.30.0

ruvi-d avatar Aug 24 '20 08:08 ruvi-d

Oops, I had a local uncommitted fix for that. I merged it into the commit, and then I rewrote the handling of these commands' output.

mat813 avatar Aug 24 '20 11:08 mat813

This will need rebasing, sorry folks. Too many PRs have stacked up; I've been badly overworked this year.

jimsalterjrs avatar Nov 01 '20 22:11 jimsalterjrs

rebased.

mat813 avatar Nov 02 '20 08:11 mat813

Ping?

mat813 avatar May 03 '21 14:05 mat813

Just an observation: you accidentally removed --preserve-recordsize from the README during the rebase.

stasjok avatar May 08 '21 07:05 stasjok

Ah, yes, rebased again.

mat813 avatar May 10 '21 11:05 mat813

This looks really useful, and I'm hoping to take advantage of it shortly. But after running it several times with nothing happening, I took a quick look at the code. I don't know Perl particularly well, but it looks like, the way this is structured, the snapshot delete will only happen after a successful sync; in my case, I have no newer snapshots on the source, so the delete doesn't happen.

And that got me thinking about what would happen if the destination was actually full and your syncs were failing because of that. On the face of it, you do a run with --delete-target-snapshots set and the destination is cleaned up to match the source. But would that actually happen, or would the sync process still be "blocked" by the fact that there is no space for a sync to work, so that without a successful sync the delete step never actually runs?

Assuming I've understood that logic correctly, maybe the README could be updated to explain the circumstances under which a delete cycle will actually run?

g-a-c avatar May 11 '21 17:05 g-a-c

Syncoid's job is to sync missing snapshots. It is not meant to clean up snapshots.

Sanoid's job is to take snapshots and clean them up.

The source locally runs Sanoid to take and clean up snapshots on the source. It can also run Syncoid to push backups to the destination.

The destination locally runs Sanoid to clean up snapshots on the destination. It can also run Syncoid to pull backups from the source.

So you need to configure Sanoid on the destination to clean up snapshots on the destination.
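
For example, a destination that only prunes (and never takes its own snapshots) could use a sanoid.conf sketch along these lines (dataset name and retention values hypothetical):

    [backup/tank/data]
        use_template = backups
        recursive = yes

    [template_backups]
        autosnap = no
        autoprune = yes
        hourly = 0
        daily = 30
        monthly = 6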

redmop avatar May 11 '21 19:05 redmop

My target box is a TrueNAS "appliance" so it does not include Sanoid and I don't want to have to do unsupported things to it if I don't have to. I don't use the built-in TrueNAS sync tool because it is unreliable in comparison to Syncoid.

Since this PR is marked as accepted by another maintainer besides yourself, I assumed that "the project" approved of the functionality and I just had one minor clarification which could have been included in the documentation before it was merged to avoid confusion. However if you have a different view then that's fine, but you should probably ask phreaker0 about removing their approval and closing the PR.

g-a-c avatar May 11 '21 19:05 g-a-c

I don't use sanoid on either the source or the remote. I use other tools: on most servers freebsd-snapshots, and on my workstations zfstools. So, unless I want to manually clean up snapshots on the remote servers that I use for backups, syncoid has to be able to clean things up, which is what the patch in this pull request does.

@g-a-c I usually let syncoid create its own snapshots (so, without --no-sync-snap) because otherwise it is always complaining that stuff is missing, and it ends up sending more data than it needs to. I could add to the README that the cleanup is done after synchronisation, though.

mat813 avatar May 17 '21 11:05 mat813

Is there any way this could be merged?

mat813 avatar Nov 04 '21 06:11 mat813

This is a really useful feature. Any chance it will be merged soon?

fnickander avatar Dec 03 '21 15:12 fnickander

I would need this feature too. Will this be merged?

crispyduck00 avatar Feb 19 '22 07:02 crispyduck00

Any chance this gets merged soon? Ever? I haven't tried @mat813's most recent branch yet, but I have been using one of his original branches for years for offsite replication.

veggiemike avatar Jul 20 '22 16:07 veggiemike

rebased.

mat813 avatar Jul 20 '22 18:07 mat813

@phreaker0 any chance that this could be merged soon? I can think of two situations where this would be very valuable:

  • when you can't install sanoid/syncoid on the destination, e.g. the example described above:

My target box is a TrueNAS "appliance" so it does not include Sanoid and I don't want to have to do unsupported things to it if I don't have to. I don't use the built-in TrueNAS sync tool because it is unreliable in comparison to Syncoid.

  • when the snapshots on the source are created by some other application (not sanoid) and you'd like to sync them with syncoid but can't use sanoid with --prune-snapshots on the destination because it doesn't understand the foreign naming scheme of the snapshots

asmartin avatar Sep 18 '22 04:09 asmartin

@jimsalterjrs wanna merge? looks good to me.

phreaker0 avatar Sep 18 '22 17:09 phreaker0

Awaiting this feature as well -- looks like exactly what I was looking for.

slynn1324 avatar Oct 20 '22 01:10 slynn1324

I updated the code to remove all snapshots at once, instead of running one zfs destroy for each snapshot.
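
For reference, zfs destroy accepts a comma-separated list of snapshots of the same dataset, so the batched form boils down to a single command like (names hypothetical):

    zfs destroy backup/tank/data@snap1,snap2,snap3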

mat813 avatar Dec 15 '22 09:12 mat813