[backup] Restore lost some amount of values

Open pilshchikov opened this issue 3 years ago • 3 comments

Jira Link: DB-3065

Description

Case:

  1. Load data: sample-apps workload SqlDataLoad, 8-minute load, 10 tables, all varchar.
  2. During the last minute of the load, restart a random VM.
  3. Create a backup.
  4. Restore into a different namespace.
  5. Compare the restored tables against the old namespace.
  6. Drop the old namespace.
  7. Repeat from step 1, 4 times.

Found that one table has a different number of rows in the two namespaces:

> /home/yugabyte/tserver/bin/ysqlsh --c "select count(*) from backup_2460ec_wsqldataload_c11" -h <ip> -d backup_68562e
< count
 12675798
 (1 row) 
> /home/yugabyte/tserver/bin/ysqlsh --c "select count(*) from backup_2460ec_wsqldataload_c11" -h <ip> -d backup_fd2843
< count
 12675711
 (1 row)
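
For anyone repeating this check across all 10 tables, here is a minimal Python sketch of the comparison. It is not the script used in this report. The helper names are hypothetical, and the `-t` / `-A` flags are the usual psql-style "tuples only" and "unaligned" options, assumed here to behave the same way in ysqlsh.

```python
import subprocess

# Path taken from the commands above; adjust for your install.
YSQLSH = "/home/yugabyte/tserver/bin/ysqlsh"


def count_command(host, database, table):
    """Build a ysqlsh argv that prints the row count for one table."""
    # -t / -A are assumed to work as in psql (tuples only, unaligned output).
    return [YSQLSH, "-h", host, "-d", database, "-t", "-A",
            "-c", f"select count(*) from {table}"]


def parse_count(output):
    """Extract the integer from tuples-only output, or from the full
    'count / N / (1 row)' form shown in this issue."""
    for line in output.splitlines():
        token = line.strip()
        if token.isdigit():
            return int(token)
    raise ValueError(f"no row count found in {output!r}")


def compare_counts(source, restored):
    """Return {table: (source_count, restored_count)} for tables that differ."""
    return {table: (source[table], restored.get(table))
            for table in source
            if source[table] != restored.get(table)}


def run_count(host, database, table):
    """Run the query against a live cluster (needs ysqlsh on this machine)."""
    result = subprocess.run(count_command(host, database, table),
                            capture_output=True, text=True, check=True)
    return parse_count(result.stdout)
```

Collecting `run_count(...)` per table into one dict per namespace and passing both to `compare_counts` lists only the tables that disagree. For the two counts reported above, the difference is 12675798 - 12675711 = 87 rows.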

Logs: logs.tar.gz

Version: 2.15.2.0-b40

pilshchikov • Jul 27 '22 08:07

Is it possible that some of the last values had not been flushed yet and so were not backed up properly? Flushing manually before the backup and checking whether the counts then match would show whether that is the cause. I hit another bug involving flushing and missing rows here: https://github.com/yugabyte/yugabyte-db/issues/12684
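
A rough sketch of what "flushing manually" could look like, written as a Python helper that only builds the command line. It assumes the `yb-admin flush_table` subcommand with a `ysql.<database>` namespace argument and an optional timeout in seconds; confirm the exact syntax against `yb-admin --help` for your version. The yb-admin path and master addresses in the usage note are placeholders, not values from this issue.

```python
def flush_table_command(yb_admin, master_addresses, database, table,
                        timeout_s=None):
    """Build a yb-admin argv that asks for one YSQL table to be flushed.

    Assumed syntax (verify for your version):
      yb-admin -master_addresses <addrs> flush_table ysql.<db> <table> [timeout]
    """
    argv = [yb_admin, "-master_addresses", master_addresses,
            "flush_table", f"ysql.{database}", table]
    if timeout_s is not None:
        argv.append(str(timeout_s))
    return argv
```

Running the resulting argv (for example with `subprocess.run(argv, check=True)`) between the end of the load and the backup step, then repeating the row count comparison, would test the flush theory: if the counts match only when the flush is done, unflushed writes are the likely cause.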

def- • Jul 27 '22 10:07

@def- no, this one is different. I checked manually several times, starting an hour after the restore, and the results are still the same. You can check for yourself; this universe is still running.

pilshchikov • Jul 27 '22 11:07

@pilshchikov do you have a script that repros this, or can you easily extract one? I'd like to take a look at the sst files in the snapshot, the sst files in the backup, and the sst files in the cluster for the source table to sanity check the theory that the last few writes aren't making it to the backup. I can't confirm that through the logs.
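
One way to do that sanity check, sketched in Python under the assumption that the snapshot, the backup, and the live tablet data have each been copied or mounted under a local directory (the directory layout is hypothetical). It inventories every file with `.sst` in its name and reports files present on only one side or with different sizes.

```python
import os


def sst_inventory(root):
    """Map each SST-related file (path relative to root) to its size in bytes."""
    inventory = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Matches anything with ".sst" in the name, so companion
            # files (if any) are included alongside the main SST files.
            if ".sst" in name:
                full = os.path.join(dirpath, name)
                inventory[os.path.relpath(full, root)] = os.path.getsize(full)
    return inventory


def diff_inventories(left, right):
    """Return (only_in_left, only_in_right, size_mismatch) as sorted lists."""
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    size_mismatch = sorted(path for path in set(left) & set(right)
                           if left[path] != right[path])
    return only_left, only_right, size_mismatch
```

Comparing the snapshot inventory with the backup inventory, and then with the source table's live files, would show whether any SST file went missing or was truncated along the way. Matching names and sizes does not prove the contents are identical, so a checksum pass would be the stricter follow-up.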

druzac • Aug 02 '22 21:08