scylla-tools-java

sstableloader --ignore-missing-columns doesn't work with snapshot

Open amoskong opened this issue 4 years ago • 9 comments

Version

$ rpm -qa |grep scylla
scylla-kernel-conf-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-jmx-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-conf-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-tools-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-tools-core-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-python3-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-server-4.4.dev-0.20201117.df197e36fb8.x86_64

Description

I'm trying to verify the fix for https://github.com/scylladb/scylla/issues/6990. The backup sstable files in /var/lib/scylla/data.backup/ can be loaded successfully with 'sstableloader -g v2 /var/lib/scylla/data.backup/'.

I tried to recover the data from a snapshot in the following scenario, but it failed, and sstableloader didn't raise any error. There is a workaround for loading the snapshot data: copy the snapshot sstables to a fake table directory (sstableloader treats that directory as a real table directory).

Test scenario

  1. Create keyspace & table
CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}  AND durable_writes = true;

CREATE TABLE ks.tb (
    key int PRIMARY KEY,
    v1 int,
    v2 int,
    v3 int
);
  2. Insert data
INSERT INTO ks.tb (key, v1, v2, v3) VALUES (1,1,1,1);
INSERT INTO ks.tb (key, v1, v2, v3) VALUES (2,1,1,1);
INSERT INTO ks.tb (key, v1, v2, v3) VALUES (3,1,1,1);
  3. nodetool flush

  4. Create snapshot

$ nodetool snapshot
Using /etc/scylla/scylla.yaml as the config file
Requested creating snapshot(s) for [all keyspaces] with snapshot name [1605767537777] and options {skipFlush=false}
Snapshot directory: 1605767537777
  5. Drop column v2
cqlsh> alter table ks.tb drop v2;
cqlsh> select * from ks.tb ;

 key | v1 | v3
-----+----+----
   1 |  1 |  1
   2 |  1 |  1
   3 |  1 |  1

(3 rows)
  6. Clean the test data with TRUNCATE ks.tb
  7. Try to load the snapshot with sstableloader
$ sstableloader -d localhost --ignore-missing-columns v2 /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/
Using /etc/scylla/scylla.yaml as the config file
===== Using optimized driver!!! =====
  0% done.        0 statements sent (in        0 batches,        0 failed).
       0 statements generated.
       0 cql rows processed in        0 partitions.
       0 cql rows and        0 partitions deleted.
       0 local and        0 remote counter shards where skipped.

Expected result

The snapshot can be loaded by sstableloader



Actual result

Nothing is loaded.

Workaround

$ mkdir -p /tmp/ks/tb/
$ cp -r /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/* /tmp/ks/tb/
$ sstableloader -d localhost --ignore-missing-columns v2 /tmp/ks/tb/
Using /etc/scylla/scylla.yaml as the config file
===== Using optimized driver!!! =====
100% done. 3 statements sent (in 2 batches, 0 failed).
3 statements generated.
6 cql rows processed in 3 partitions.
0 cql rows and 0 partitions deleted.
0 local and 0 remote counter shards where skipped.
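
The workaround can be wrapped in a small helper so the staging directory is always laid out as <keyspace>/<table>. This is only a sketch: `stage_snapshot`, its argument order, and the default staging root `/tmp` are hypothetical and not part of scylla-tools-java. It only copies files and prints the staged directory; running sstableloader on the result is left to the caller.

```shell
# Hypothetical helper (not part of scylla-tools-java): copy the contents of a
# snapshot directory into <staging_root>/<keyspace>/<table>/ and print that path.
# The function body runs in a subshell, so its variables do not leak.
stage_snapshot() (
    snapshot_dir="${1%/}"        # snapshot directory, trailing slash removed
    keyspace="$2"                # keyspace name the loader should see
    table="$3"                   # table name the loader should see
    staging_root="${4:-/tmp}"    # assumed default; pass another root if needed
    target="$staging_root/$keyspace/$table"

    mkdir -p "$target" || exit 1
    # "dir/." copies the directory contents rather than the directory itself.
    cp -r "$snapshot_dir"/. "$target"/ || exit 1
    printf '%s\n' "$target"
)

# Example use, mirroring the commands above (left commented out on purpose):
# staged="$(stage_snapshot /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777 ks tb)"
# sstableloader -d localhost --ignore-missing-columns v2 "$staged"
```

Passing the keyspace and table explicitly avoids guessing them from the `tb-<uuid>` data directory name.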


/CC @juliayakovlev @elcallio @roydahan 

amoskong Nov 19 '20 06:11

@slivne can you please assign?

roydahan Nov 23 '20 20:11

Wait, is the issue here that files in "snapshot" cannot be loaded? This is expected, as directory structure is used to determine keyspace and cf.
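
To illustrate, here is a rough sketch of that inference. It assumes the loader takes the table name from the last path component and the keyspace from the one before it, which matches the workaround path /tmp/ks/tb/; `infer_ks_table` is a hypothetical helper, not the loader's actual code.

```shell
# Hypothetical helper: print "<keyspace> <table>" as they would be inferred
# if only the last two components of the given directory path were used.
infer_ks_table() (
    dir="${1%/}"                                # drop one trailing slash
    table="$(basename "$dir")"                  # last component
    keyspace="$(basename "$(dirname "$dir")")"  # second-to-last component
    printf '%s %s\n' "$keyspace" "$table"
)

infer_ks_table /tmp/ks/tb/
infer_ks_table /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/
```

Under that assumption the first call prints `ks tb`, while the snapshot path yields `snapshots 1605767537777`, which is neither the real keyspace nor the real table.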

elcallio Jan 20 '21 09:01

> Wait, is the issue here that files in "snapshot" cannot be loaded?

Yes.

> This is expected, as directory structure is used to determine keyspace and cf.

To be honest, the UX is not good.

amoskong Jan 20 '21 09:01

That is very true, but it is an unfortunate result of how sstables work. Maybe newer formats contain more metadata (such as the keyspace), but if not (or for older-version sstables) there is not much one can do beyond awkward guessing. I guess we could add -ks and -cf switches to avoid having to look at the directory structure, though.

elcallio Jan 20 '21 12:01

No help from md format...

elcallio Jan 20 '21 12:01

@elcallio please close it if it's not relevant anymore

DoronArazii Nov 24 '22 16:11

I'm reopening this for the moment because closing this issue has activated some broken dtests (marked with @pytest.mark.require("scylladb/scylla-tools-java#216")) and broke CI.

michoecho Aug 19 '24 17:08

Doh. Sorry about that. Will closing as "not planned" work?

elcallio Aug 19 '24 18:08

> Doh. Sorry about that. Will closing as "not planned" work?

No. It has to stay open until someone fixes or deletes the broken dtests.

michoecho Aug 19 '24 19:08