scylla-manager
scylla-manager copied to clipboard
Do not compare SSTable checksums for 2024.1 backups
Scylla Enterprise 2024.1 introduced as a default support for UUIDs in SSTable names. https://github.com/scylladb/scylladb/pull/13932
Scylla manager no longer needs to compare checksums when copying files from local to storage, as the SSTable uniqueness is guaranteed by the UUID and the situation where SSTable have the same name and size, but different content is mitigated.
The above is valid for backups made on >= 2024.1 only. We still need to support old approach for older Scyllas.
Issue on QA side https://github.com/scylladb/qa-tasks/issues/1397
More details of why we have to add this support is here https://github.com/scylladb/scylla-enterprise/issues/4126#issuecomment-2075049181
grooming notes
We must check the version of Scylla that we perform the backup task against. If the version uses UUID in the SSTable names, (>= 2024.1) then we must disable checksum comparision during the move operation performed by RClone.
(per @Michal-Leszczynski) We can check at runtime if the SSTable name includes the UUID or not. If it's not, then we must enable checksum, if it is, we must disable.
Update:
SSTable specification includes .crc32 file which is actually the checksum created by scylla server. It's enough to compare the content of .crc32 inlcuded into created snapshot for every SSTable to the content of its remote equivalent. If they are equal, then just remove the SSTable from snapshot dir.
The solution above doesn't require to have Scylla supporting UUIDs in the SSTable names. It's just a general solution.
SSTable specification https://opensource.docs.scylladb.com/stable/architecture/sstable/sstable3/sstables-3-data-file-format.html
We need to test the scenario described in the previous comment against the snapshot that contains hundreds of SSTables to see the performance of this operation. It includes reading content of every crc32 file that is part of the current snapshot.
RFC for the new deduplication stage: https://docs.google.com/document/d/1EtGlF6UGNy34D_7QsnCheaukp3UwVObZU56PBdd0CQ8/edit?usp=sharing