
Cloned dataset wants to receive to original.

Open tschettervictor opened this issue 9 months ago • 9 comments

Greetings,

I just ran into this. Try this...

Create a dataset, run the script here a few times to verify it is backing up, then clone that dataset to another one with zfs send | zfs receive and attempt to run it again.
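
A minimal dry-run sketch of the copy step described above. It only prints the command instead of running it. The helper name `print_copy_cmd` is hypothetical, and the `-R` flag is an assumption (the report does not say which `zfs send` flags were used); `-R` would carry child datasets and existing snapshots along with the copy.

```shell
#!/usr/bin/env sh
# Dry-run sketch: print (do not run) a send/receive copy of a dataset.
# Assumption: "-R" is used, so children and existing snapshots come along.
print_copy_cmd() {
  snap=$1   # existing source snapshot, e.g. pool/original@name
  dst=$2    # new dataset to create, e.g. pool/copy
  printf 'zfs send -R %s | zfs receive %s\n' "$snap" "$dst"
}

print_copy_cmd \
  tank/extensions/bastille/jails/desktopstreaming@autorep-03032025_1741044911 \
  tank/extensions/bastille/jails/desk2
```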

What I found is that it replicated the old one just fine, but when it got to the cloned one, it tried to receive into the old dataset, which obviously fails.

Would this have anything to do with the way the script handles things?

Mar 04 08:41:15 zfs-replicate.sh[43190]: receiving incremental stream of tank/extensions/bastille/jails/desktopstreaming@autorep-03042025_1741102754 into backup/tank/extensions/bastille/jails/desktopstreaming@autorep-03042025_1741102754
Mar 04 08:41:16 zfs-replicate.sh[43190]: received 312B stream in 1 seconds (312B/sec)
...
Mar 04 08:41:23 zfs-replicate.sh[43190]: receiving incremental stream of tank/extensions/bastille/jails/desk2@autorep-03042025_1741102754 into backup/tank/extensions/bastille/jails/desktopstreaming@autorep-03042025_1741102754
Mar 04 08:41:23 zfs-replicate.sh[43190]: cannot restore to backup/tank/extensions/bastille/jails/desktopstreaming@autorep-03042025_1741102754: destination already exists
Mar 04 08:41:24 zfs-replicate.sh[43190]: destroying snapshot cmd=/usr/bin/ssh 192.168.1.132 /sbin/zfs destroy -r tank@autorep-03042025_1741102754
Mar 04 08:41:25 zfs-replicate.sh[43190]: deleting lockfile /tmp/.replicate.snapshot.lock
Mar 04 08:41:25 zfs-replicate.sh[43190]: deleting lockfile /tmp/.replicate.send.lock

Notice that it is trying to receive 'desk2' into 'desktopstreaming'.
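
For comparison, here is a small sketch of where the stream would normally be expected to land. With `zfs receive -d`, the pool name of the sent snapshot is dropped and the rest of the path is appended to the receive target. The helper `expected_dest` is hypothetical and not part of zfs-replicate; the target `backup/tank` is taken from the log lines above.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of zfs-replicate): compute where
# "zfs receive -d <target>" would place a sent snapshot.
expected_dest() {
  target=$1   # receive target, e.g. backup/tank
  sent=$2     # sent snapshot, e.g. tank/path/to/dataset@snap
  case $sent in
    # Drop the first path element (the source pool) and append the rest.
    */*) printf '%s/%s\n' "$target" "${sent#*/}" ;;
    # A snapshot of the pool root maps onto the target itself.
    *)   printf '%s@%s\n' "$target" "${sent#*@}" ;;
  esac
}

expected_dest backup/tank \
  tank/extensions/bastille/jails/desk2@autorep-03042025_1741102754
```

Under that rule the 'desk2' stream would be expected under backup/tank/extensions/bastille/jails/desk2, not under 'desktopstreaming' as the log shows.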

@aaronhurt

tschettervictor, Mar 04 '25 15:03

Can you also include your config there? If you cloned it and it still had the previous snapshot names I have a theory as to what's happening.

aaronhurt, Mar 04 '25 18:03

#!/usr/bin/env sh
## zfs-replicate configuration file
# shellcheck disable=SC2034

## Datasets to replicate. These must be zfs paths not mount points.
## The general format is "source:destination". The source is always
## considered authoritative. This holds true for reconciliation attempts with
## the "ALLOW_RECONCILIATION" option described below as well.
##
## Examples replicating a local source to a remote destination (PUSH):
##   - sourcePool/sourceDataset:destinationPool@host
##   - sourcePool/sourceDataset:destinationPool/destinationDataset@host
## Examples replicating from a remote source to a local destination (PULL):
##   - sourcePool/sourceDataset@host:destinationPool
##   - sourcePool/sourceDataset@host:destinationPool/destinationDataset
## Examples replicating a local source to a local destination:
##   - sourcePool/sourceDataset:destinationPool
##   - sourcePool/sourceDataset:destinationPool/destinationDataset
## Multiple space separated sets may be specified.
## Pools and dataset pairs must exist on the respective servers.
##
REPLICATE_SETS="[email protected]:backup/tank"

## Allow replication of root datasets.
## If "REPLICATE_SETS" contains root datasets and "ALLOW_ROOT_DATASETS" is
## NOT set to 1, root datasets will be skipped and a warning will be printed.
##
## 0 - disable (default)
## 1 - enable (use at your own risk)
##
ALLOW_ROOT_DATASETS=1

## Manual alteration of the source or destination datasets by removing
## snapshots often results in failure. It is expected that datasets configured
## for replication are a 1:1 copy of each other after the first script run.
## Setting this option to "1" allows the script to attempt reconciliation when
## source and destination datasets have diverged.
##
## NOTE: The source is always authoritative. Reconciliation will only
## affect the destination dataset. This script will NEVER modify the source
## as a means to reconcile divergence between datasets.
##
## Setting this option to "1" will result in the following potentially
## destructive behavior for the destination dataset.
##
## - If the script is unable to find the source base snapshot
##   in the destination dataset, the script will fall back to a full send.
##   When combined with the "-F" option in the destination receive pipe,
##   this option will force a reconciliation. ZFS will automatically remove
##   snapshots in the destination that do not exist within the source.
## - If the script determines that replication snapshots exist in the
##   destination dataset, and no base snapshot is present in the source,
##   the script will remove ALL destination snapshots that appear to have been
##   created by this script and instruct ZFS to do a full send of the source
##   to the destination.
##
## These scenarios should never happen under normal circumstances.
## Setting "ALLOW_RECONCILIATION" to "1" will allow the script to push
## past failures caused by divergent source and destination datasets to
## create a 1:1 copy of the source in the destination.
##
## 0 - disable (default)
## 1 - enable (use at your own risk)
##
ALLOW_RECONCILIATION=1

## Option to recursively snapshot children of datasets contained
## in the replication set.
##
## 0 - disable (default)
## 1 - enable
##
RECURSE_CHILDREN=1

## The number of snapshots to keep for each dataset.
## Older snapshots, by creation date, will be deleted.
## A minimum of 2 snapshots must be kept for replication to work.
## This defaults to 2 if not set.
##
SNAP_KEEP=2

## Option to write logs to syslog via the "logger" tool. This option
## may be enabled or disabled independently from log file settings.
##
## 0 - disable
## 1 - enable (default)
##
#SYSLOG=1

## Optional logging facility to use with syslog. The default facility
## is "user" unless changed below. Other common options include local
## facilities 0-7.
## Example: local0, local1, local2, local3, local4, local5, local6, or local7
##
#SYSLOG_FACILITY="user"

## The following substitutions for current date information
## may be used in the "TAG" setting below.
## These are evaluated at runtime.
##   - %DOW% = Day of Week (date "+%a")
##   - %MOY% = Month of Year (date "+%m")
##   - %DOM% = Day of Month (date "+%d")
##   - %CYR% = Current Year (date "+%Y")
##   - %NOW% = Current Unixtime (date "+%s")

## String used for snapshot names and log tags.
## Example: pool0/someplace@autorep-08242024_1724527527
## The default is "%MOY%%DOM%%CYR%_%NOW%"
##
#TAG="%MOY%%DOM%%CYR%_%NOW%"

## The log file name needs to start with "autorep-" for log cleanup to work;
## using the default below is strongly suggested. Leaving this commented out
## will disable the writing of the standalone log file. The "%TAG%" substitution
## and/or other date substitutions may be used. The default is "autorep-%TAG%.log".
## When enabled, logs will be placed under the "LOG_BASE" path set below.
##
#LOG_FILE="autorep-%TAG%.log"

## Number of log files to keep. Note, this is only used
## if "LOG_BASE" is set to a non-empty value below.
## Older logs, by creation date, will be deleted.
## This defaults to 5 if not set.
##
#LOG_KEEP=5

## Set the destination for physical log files to reside. By default
## logging is done via syslog. This setting will always be treated as a
## directory and not a file.
##
LOG_BASE="/mnt/backup/scripts/zfs-replicate/logs"

## Path to the system "logger" executable.
## The default uses the first "logger" executable found in $PATH.
##
#LOGGER=$(which logger)

## Path to GNU "find" binary. Solaris find does not support the "-maxdepth"
## option, which is required to rotate log files.
## On Solaris 11, GNU find is typically located at "/usr/bin/gfind".
## The default uses the first "find" executable in $PATH.
## This is NOT required when using syslog.
##
#FIND=$(which find)

## Path to the system "ssh" binary. You may also include custom arguments
## to SSH here or in the "DEST_PIPE_WITH_HOST" option below.
## Example: SSH="ssh -l root" to login as root to target host.
## The default uses the first "ssh" executable found in $PATH.
##
#SSH=$(which ssh)

## Path to the system "zfs" binary. The default uses the first "zfs"
## executable found in $PATH.
##
#ZFS=$(which zfs)

## Set the pipe to the destination pool, but DO NOT INCLUDE the pipe (|)
## character in this setting. Filesystem names from the source will be
## sent to the destination. For increased transfer speed to remote hosts you
## may want to customize ssh ciphers or include mbuffer.
## The macro %HOST% string will be substituted with the value of the "@host"
## target in the replication set.
## The default WITH a "@host" option is "ssh %HOST% zfs receive -vFd"
## The default WITHOUT a "@host" option is "zfs receive -vFd".
##
#DEST_PIPE_WITH_HOST="$SSH %HOST% $ZFS receive -vFd"
#DEST_PIPE_WITHOUT_HOST="$ZFS receive -vFd"

## Command to check the health of a source or destination host.
## A return code of 0 is considered OK/available.
## This is only used when a replicate set contains an "@host" option.
## The macro string "%HOST%" will be substituted with the value of
## the "@host" target in the replicate set.
## The default command is "ping -c1 -q -W2 %HOST%".
##
#HOST_CHECK="ping -c1 -q -W2 %HOST%"
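
As an illustration of the "TAG" macros described in the config, here is a minimal sketch of how the default template could expand. The helper `expand_tag` is hypothetical and is not the script's own code; the date parts are passed in as arguments so the result is reproducible. Only the four macros used by the default template are handled.

```shell
#!/usr/bin/env sh
# Hypothetical helper: expand the date macros of a TAG template.
# Arguments: template, month, day of month, year, unix time.
expand_tag() {
  template=$1 moy=$2 dom=$3 cyr=$4 now=$5
  printf '%s\n' "$template" | sed \
    -e "s/%MOY%/$moy/g" \
    -e "s/%DOM%/$dom/g" \
    -e "s/%CYR%/$cyr/g" \
    -e "s/%NOW%/$now/g"
}

# At runtime the values would come from date(1), as listed in the config:
expand_tag "%MOY%%DOM%%CYR%_%NOW%" \
  "$(date "+%m")" "$(date "+%d")" "$(date "+%Y")" "$(date "+%s")"
```

With the values 03, 04, 2025 and 1741102754 the default template yields 03042025_1741102754, which matches the part after "autorep-" in the snapshot names from the log.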

tschettervictor, Mar 04 '25 18:03

Okay, thanks. So there is nothing fancy in the replicate sets. My guess is that when you cloned the dataset on the source side, it also cloned the snapshots attached to that dataset, meaning it was a full/deep clone and not just a shallow clone of the top-level dataset. You could look at zfs list -t snapshot and see whether there are duplicate-named snapshots under the original and the newly cloned dataset.
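
A sketch of one way to run that check. Recursive snapshots share the same name across child datasets, so a name match alone may not say much; this version compares the snapshot `guid` property instead, on the assumption that snapshots copied with zfs send | zfs receive keep the guid of the original. The helper `dup_guids` is hypothetical and expects tab-separated "name, guid" lines on stdin.

```shell
#!/usr/bin/env sh
# Hypothetical helper: read "name<TAB>guid" lines on stdin and print every
# guid that is shared by more than one snapshot, followed by the snapshot names.
dup_guids() {
  awk -F'\t' 'NF == 2 { n[$2]++; names[$2] = names[$2] " " $1 }
    END { for (g in n) if (n[g] > 1) print g ":" names[g] }' | sort
}

# On a live system (assumes zfs is in PATH and the pool is named tank):
#   zfs list -H -o name,guid -t snapshot -r tank | dup_guids
```

Any guid listed would point at a pair of snapshots that are copies of each other, for example one under the original dataset and one under the clone.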

aaronhurt, Mar 04 '25 18:03

tank/extensions/bastille/jails/desk2@autorep-03032025_1741044911                    88K      -   120K  -
tank/extensions/bastille/jails/desk2/root@autorep-03032025_1741044911             6.84M      -  6.35G  -
tank/extensions/bastille/jails/desktopstreaming@autorep-03032025_1741044911          0B      -   120K  -
tank/extensions/bastille/jails/desktopstreaming/root@autorep-03032025_1741044911  13.6M      -  6.35G  -

These are the only ones I have for those two jails.

tschettervictor, Mar 04 '25 19:03

Just an idea here. It seems that when the dataset gets cloned, the snapshots of it get cloned, and therefore do not exist on the receiving side.

This would require a full send to take place to populate the receiving side with the snapshot of the new dataset, but since it is already detecting an existing snapshot for the config, it doesn't do that and throws an error.

tschettervictor, Mar 07 '25 13:03

Yes that does seem to be the issue. When cloning the dataset, the snapshots also get cloned, and zfs-replicate thinks it needs to do an incremental replication.

Deleting the snapshots for the newly cloned dataset resolves it by doing the full send first for that dataset.
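
The workaround above can be sketched as a dry run that only prints the destroy commands, so they can be reviewed before anything is removed. The helper `plan_clone_cleanup` is hypothetical; the "autorep-" prefix is taken from the snapshot names shown earlier in this thread.

```shell
#!/usr/bin/env sh
# Hypothetical helper: read snapshot names on stdin and print (not run) a
# "zfs destroy" command for each autorep- snapshot of the given dataset
# or of its children. Snapshots of other datasets are ignored.
plan_clone_cleanup() {
  clone=$1
  while IFS= read -r snap; do
    case $snap in
      "$clone"@autorep-*|"$clone"/*@autorep-*)
        printf 'zfs destroy %s\n' "$snap" ;;
    esac
  done
}

# On a live system (assumes zfs is in PATH):
#   zfs list -H -o name -t snapshot -r tank/extensions/bastille/jails/desk2 \
#     | plan_clone_cleanup tank/extensions/bastille/jails/desk2
```

Only the clone's own snapshots are listed, so the original dataset and its replication history stay untouched.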

Would there be any way to mitigate this besides deleting the snapshots for the cloned dataset?

tschettervictor, Mar 08 '25 16:03

@aaronhurt Do we want to address this or should I close this issue?

tschettervictor, Mar 22 '25 02:03

Hrm, that is kind of what ALLOW_RECONCILIATION does today. It checks if the local/source snapshots exist on the remote/destination ... and if not, it falls back to a full send. The current documented policy is that the local/source is always authoritative and only the remote/destination will ever be modified. I don't think I would want to introduce any options that would do potentially destructive things to the source.
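
The fallback described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the script's actual code: `choose_send_mode` and its arguments are hypothetical, and the check is done by snapshot name only.

```shell
#!/usr/bin/env sh
# Hypothetical decision helper (not the script's actual logic).
# $1 = source base snapshot name, e.g. autorep-03032025_1741044911
# $2 = newline-separated snapshot names found on the destination
# $3 = value of ALLOW_RECONCILIATION (0 or 1)
choose_send_mode() {
  base=$1 dest_snaps=$2 allow=$3
  if printf '%s\n' "$dest_snaps" | grep -Fqx -e "$base"; then
    echo incremental   # base snapshot exists on the destination
  elif [ "$allow" -eq 1 ]; then
    echo full          # reconciliation allowed: fall back to a full send
  else
    echo fail          # diverged and reconciliation not allowed
  fi
}

choose_send_mode autorep-2 "$(printf '%s\n' autorep-1 autorep-2)" 1
```

If the check is made by name across the whole replicate set, a clone that still carries the original's snapshot names could be treated as an incremental case even though its dataset does not yet exist on the destination.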

There could be additional logging added though to list some troubleshooting steps such as removing local snapshots if a dataset was cloned.

aaronhurt, Mar 24 '25 19:03

ALLOW_RECONCILIATION only works if ALL of the source snapshots or ALL of the destination snapshots are missing.

But because this is a cloned dataset, and I'm doing RECURSE_CHILDREN, it doesn't activate the RECONCILIATION mode.

Please just add a note that after cloning a dataset, the snapshots on the cloned dataset that were created by zfs-replicate should be deleted. Doing that fixed it for me.

tschettervictor, Mar 24 '25 20:03