
Issues syncing

jbouter opened this issue 3 years ago · 5 comments

Hi, I'm having issues synchronising my data. I've nuked and paved the sync three times now, but I keep running into the same issue.

Initial sync (which I initiated without mbuffer installed, canceled, installed mbuffer, and started over):

% syncoid --no-privilege-elevation --sendoptions="w" tank/home nas.example.nl:backup/employees/jeffrey/home
Resuming interrupted zfs send/receive from tank/home to backup/employees/jeffrey/home (~ 71.2 GB remaining):
71,3GiB 5:59:55 [3,38MiB/s] [===================================================================================================================================>] 100%            
cannot receive: failed to read from stream
cannot receive resume stream: dataset does not exist
CRITICAL ERROR:  zfs send -w  -t 1-11f1de51f2-110-789c636064000310a500c4ec50360710e72765a526973030582f85a8c1904f4b2b4e2d01c9f0c1e5d990e4932a4b528b81b4ce32e749d8f497e4a79766a630303c9eb744f5e3c117cf3c90e439c1f27989b9a9403a312f5b3f233f37d521b1b424bf382fb120dec8c0c850d7c052d7d038dec8c8cac0c0cad0203e373faf2423a712620f3703c25fc9f9b90545a9c5c5f9d9103109a83b61f24589e53029060019962b42 | pv -p -t -e -r -b -s 76398461468 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/syncoid-nas.example.nl-1632688920 nas.example.nl ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive  -s -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1' failed: 256 at /usr/bin/syncoid line 580.
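
In case it's relevant: leftover partial-receive state on the target can be checked and, if present, discarded with standard OpenZFS commands run on the target:

jeffrey@nas:~$ zfs get -H receive_resume_token backup/employees/jeffrey/home     # shows "-" when no partial receive is pending
jeffrey@nas:~$ sudo zfs receive -A backup/employees/jeffrey/home                 # aborts an interrupted receive, deleting its saved state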

A new attempt yields the same error:

% syncoid --no-privilege-elevation --sendoptions="w" tank/home nas.example.nl:backup/employees/jeffrey/home
Sending incremental tank/home@autosnap_2021-09-13_22:00:10_monthly ... syncoid_soyuz_2021-09-27:09:11:29-GMT02:00 (~ 19.5 GB):
cannot receive incremental stream: most recent snapshot of backup/employees/jeffrey/home does not
match incremental source
mbuffer: error: outputThread: error writing to <stdout> at offset 0x1aee8000: Broken pipe
mbuffer: warning: error during output to <stdout>: Broken pipe
 456MiB 0:02:03 [3,69MiB/s] [=>                                                                                                                                   ]  2%            
CRITICAL ERROR:  zfs send -w  -I 'tank/home'@'autosnap_2021-09-13_22:00:10_monthly' 'tank/home'@'syncoid_soyuz_2021-09-27:09:11:29-GMT02:00' | pv -p -t -e -r -b -s 20892065912 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/syncoid-nas.example.nl-1632726688 nas.example.nl ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive  -s -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1' failed: 256 at /usr/bin/syncoid line 817.

I've also tried to do the same with the --no-resume flag, but keep running into issues:

% syncoid --no-resume --no-privilege-elevation --sendoptions="w" tank/home nas.example.nl:backup/employees/jeffrey/home
Sending incremental tank/home@autosnap_2021-09-14_07:27:24_daily ... syncoid_soyuz_2021-09-27:12:33:13-GMT02:00 (~ 19.2 GB):
cannot receive incremental stream: most recent snapshot of backup/employees/jeffrey/home does not
match incremental source
mbuffer: error: outputThread: error writing to <stdout> at offset 0xd5900000: Broken pipe
mbuffer: warning: error during output to <stdout>: Broken pipe
3,39GiB 0:16:24 [3,53MiB/s] [=====================>                                                                                                               ] 17%            
CRITICAL ERROR:  zfs send -w  -I 'tank/home'@'autosnap_2021-09-14_07:27:24_daily' 'tank/home'@'syncoid_soyuz_2021-09-27:12:33:13-GMT02:00' | pv -p -t -e -r -b -s 20605962208 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/syncoid-nas.example.nl-1632738793 nas.example.nl ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive   -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1' failed: 256 at /usr/bin/syncoid line 817.

Destination host: Ubuntu 20.04, using kernel 5.11

Source host: Arch Linux, using Linux kernel 5.10 (LTS) with a ZFS DKMS installation

Syncoid version information:

% syncoid --version
/usr/bin/syncoid version 2.1.0
(Getopt::Long::GetOptions version 2.52; Perl version 5.34.0)

jbouter · Sep 27 '21

Trying an incremental sync after a new clean sync that I did overnight, this time with --no-resume from the start:

~ % syncoid --no-resume --no-privilege-elevation --sendoptions="w" tank/home nas.example.nl:backup/employees/jeffrey/home
Sending incremental tank/home@autosnap_2021-09-13_22:00:10_monthly ... syncoid_soyuz_2021-09-28:09:29:46-GMT02:00 (~ 19.9 GB):
cannot receive incremental stream: most recent snapshot of backup/employees/jeffrey/home does not
match incremental source
mbuffer: error: outputThread: error writing to <stdout> at offset 0x1aee8000: Broken pipe
mbuffer: warning: error during output to <stdout>: Broken pipe
 456MiB 0:02:04 [3,67MiB/s] [=>                                                                                                                     ]  2%            
CRITICAL ERROR:  zfs send -w  -I 'tank/home'@'autosnap_2021-09-13_22:00:10_monthly' 'tank/home'@'syncoid_soyuz_2021-09-28:09:29:46-GMT02:00' | pv -p -t -e -r -b -s 21347828872 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/syncoid-nas.example.nl-1632814184 nas.example.nl ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive   -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1' failed: 256 at /usr/bin/syncoid line 817.

jbouter · Sep 28 '21

Thinking it may have been my source (Arch laptop), I've moved back to Ubuntu LTS (KDE Neon, but that shouldn't matter) so that both the client and server sides run the same version. One thing has changed: I can no longer receive as my own user (??), but the error remains:

~ % syncoid --debug --no-privilege-elevation --sendoptions="w" tank/home [email protected]:backup/employees/jeffrey/home
DEBUG: checking availability of lzop on source...
DEBUG: checking availability of lzop on target...
DEBUG: checking availability of lzop on local machine...
DEBUG: checking availability of mbuffer on source...
DEBUG: checking availability of mbuffer on target...
DEBUG: checking availability of pv on local machine...
DEBUG: checking availability of zfs resume feature on source...
DEBUG: checking availability of zfs resume feature on target...
DEBUG: syncing source tank/home to target backup/employees/jeffrey/home.
DEBUG: getting current value of syncoid:sync on tank/home...
  zfs get -H syncoid:sync 'tank/home'
DEBUG: checking to see if backup/employees/jeffrey/home on ssh     -S /tmp/[email protected] [email protected] is already in zfs receive using ssh     -S /tmp/[email protected] [email protected] ps -Ao args= ...
DEBUG: checking to see if target filesystem exists using "ssh     -S /tmp/[email protected] [email protected]  zfs get -H name ''"'"'backup/employees/jeffrey/home'"'"'' 2>&1 |"...
DEBUG: getting current value of receive_resume_token on backup/employees/jeffrey/home...
ssh     -S /tmp/[email protected] [email protected]  zfs get -H receive_resume_token ''"'"'backup/employees/jeffrey/home'"'"''
DEBUG: no receive token found 
DEBUG: getting list of snapshots on tank/home using   zfs get -Hpd 1 -t snapshot guid,creation 'tank/home' |...
DEBUG: getting list of snapshots on backup/employees/jeffrey/home using ssh     -S /tmp/[email protected] [email protected]  zfs get -Hpd 1 -t snapshot guid,creation ''"'"'backup/employees/jeffrey/home'"'"'' |...
DEBUG: creating sync snapshot using "  zfs snapshot 'tank/home'@syncoid_soyuz_2021-10-17:14:11:17
"...
DEBUG: getting current value of -p used on backup/employees/jeffrey/home...
ssh     -S /tmp/[email protected] [email protected]  zfs get -H -p used ''"'"'backup/employees/jeffrey/home'"'"''
DEBUG: checking to see if backup/employees/jeffrey/home on ssh     -S /tmp/[email protected] [email protected] is already in zfs receive using ssh     -S /tmp/[email protected] [email protected] ps -Ao args= ...
DEBUG: rolling back target to backup/employees/jeffrey/home@autosnap_2021-10-15_15:45:01_monthly...
ssh     -S /tmp/[email protected] [email protected]  zfs rollback -R 'backup/employees/jeffrey/home'@'autosnap_2021-10-15_15:45:01_monthly'
DEBUG: getting estimated transfer size from source  using "  zfs send -w  -nP -I 'tank/home@autosnap_2021-10-15_15:45:01_monthly' 'tank/home@syncoid_soyuz_2021-10-17:14:11:17' 2>&1 |"...
DEBUG: sendsize = 3724507808
Sending incremental tank/home@autosnap_2021-10-15_15:45:01_monthly ... syncoid_soyuz_2021-10-17:14:11:17 (~ 3.5 GB):
DEBUG:  zfs send -w  -I 'tank/home'@'autosnap_2021-10-15_15:45:01_monthly' 'tank/home'@'syncoid_soyuz_2021-10-17:14:11:17' | pv -s 3724507808 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/[email protected] [email protected] ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive  -s -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1'
cannot receive incremental stream: most recent snapshot of backup/employees/jeffrey/home does not
match incremental source
mbuffer: error: outputThread: error writing to <stdout> at offset 0x4ec000: Broken pipe
mbuffer: warning: error during output to <stdout>: Broken pipe
21,6MiB 0:00:01 [20,0MiB/s] [>                                                                                                                              ]  0%            
CRITICAL ERROR:  zfs send -w  -I 'tank/home'@'autosnap_2021-10-15_15:45:01_monthly' 'tank/home'@'syncoid_soyuz_2021-10-17:14:11:17' | pv -s 3724507808 | lzop  | mbuffer  -q -s 128k -m 16M 2>/dev/null | ssh     -S /tmp/[email protected] [email protected] ' mbuffer  -q -s 128k -m 16M 2>/dev/null | lzop -dfc |  zfs receive  -s -F '"'"'backup/employees/jeffrey/home'"'"' 2>&1' failed: 256 at /usr/sbin/syncoid line 786.

jbouter · Oct 17 '21

@jimsalterjrs Since you know ZFS so much better than I do, is this something you can easily pinpoint and give directions for fixing?

jbouter · Oct 17 '21

Are you taking snapshots on the target locally, e.g. by running sanoid on the target with a production template / with take_snapshots enabled?

If so, you're most likely creating snapshots with matching names on both sides, which enormously confuses attempts to replicate. Snapshots should only be created directly on the source and replicated from there to the target; the target shouldn't be creating its own.
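
An easy way to check is to compare snapshot GUIDs on both sides (this is the same zfs get call syncoid itself uses to list snapshots, as your debug output shows); a snapshot that exists under the same name on both ends but with a different GUID was created locally on the target instead of being received:

% zfs get -Hpd 1 -t snapshot guid tank/home
% ssh nas.example.nl zfs get -Hpd 1 -t snapshot guid backup/employees/jeffrey/home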

jimsalterjrs · Oct 17 '21

The target doesn't have sanoid installed, and it's not making any snapshots of backup/employees. All snapshots taken there come from custom scripts I wrote just to sync between two servers, before I knew of sanoid's existence. I did set it up according to "FreeBSD Mastery: Advanced ZFS" best practices, though.

The only snapshots present are the ones it has tried to sync:

jeffrey@nas:~$ sudo zfs list -rt snapshot backup/employees
NAME                                                                 USED  AVAIL     REFER  MOUNTPOINT
backup/employees/jeffrey/home@autosnap_2021-10-15_15:45:01_monthly     8K      -     30.3G  -
backup/employees/jeffrey/home@autosnap_2021-10-15_15:45:01_daily       0B      -     30.3G  -

Could it be that the backup zpool was created before there was encryption support (on 18.04, though the host has since been upgraded to 20.04, which does support encryption), and I am sending over an encrypted dataset?
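
If it helps narrow that down, the encryption-related properties can be compared on both ends with standard OpenZFS property queries:

% zfs get encryption,encryptionroot tank/home
% ssh nas.example.nl zfs get encryption,encryptionroot backup/employees/jeffrey/home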

jbouter · Oct 17 '21