scylla-tools-java
sstableloader --ignore-missing-columns doesn't work with snapshot
### Version
$ rpm -qa |grep scylla
scylla-kernel-conf-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-jmx-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-conf-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-tools-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-tools-core-4.4.dev-0.20201117.df197e36fb8.noarch
scylla-python3-4.4.dev-0.20201117.df197e36fb8.x86_64
scylla-server-4.4.dev-0.20201117.df197e36fb8.x86_64
### Description
I'm trying to verify the fix for https://github.com/scylladb/scylla/issues/6990. The backup sstable files in /var/lib/scylla/data.backup/ load successfully with 'sstableloader -g v2 /var/lib/scylla/data.backup/'.
I then tried to recover the data from a snapshot in the following scenario. Nothing was loaded, and sstableloader didn't raise any error. There is a workaround: copy the snapshot sstables into a fake table directory, which sstableloader treats as a real table directory.
### Test scenario
- create keyspace & table
CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
CREATE TABLE ks.tb (
key int PRIMARY KEY,
v1 int,
v2 int,
v3 int
);
- insert data
insert INTO ks.tb (key, v1, v2, v3 ) VALUES (1,1,1,1);
insert INTO ks.tb (key, v1, v2, v3 ) VALUES (2,1,1,1);
insert INTO ks.tb (key, v1, v2, v3 ) VALUES (3,1,1,1);
- nodetool flush
- create snapshot
$ nodetool snapshot
Using /etc/scylla/scylla.yaml as the config file
Requested creating snapshot(s) for [all keyspaces] with snapshot name [1605767537777] and options {skipFlush=false}
Snapshot directory: 1605767537777
- drop column v2
cqlsh> alter table ks.tb drop v2;
cqlsh> select * from ks.tb ;
key | v1 | v3
-----+----+----
1 | 1 | 1
2 | 1 | 1
3 | 1 | 1
(3 rows)
- clean the test data by truncating the table
TRUNCATE ks.tb;
- try to load the snapshot with sstableloader
$ sstableloader -d localhost --ignore-missing-columns v2 /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/
Using /etc/scylla/scylla.yaml as the config file
===== Using optimized driver!!! =====
0% done. 0 statements sent (in 0 batches, 0 failed).
0 statements generated.
0 cql rows processed in 0 partitions.
0 cql rows and 0 partitions deleted.
0 local and 0 remote counter shards where skipped.
### Expected result
The snapshot can be loaded by sstableloader.
### Actual result
Nothing is loaded.
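Because the loader exits without an error in this case, a wrapper script can at least detect the silent no-op. The sketch below is only an illustration: `check_loader_output` is a hypothetical helper, and it assumes the summary line `0 statements generated.` looks exactly like the output pasted in this report.

```shell
# Hypothetical helper: reads sstableloader output on stdin and fails
# when the summary reports that no statements were generated.
# Assumes the summary line format shown in this issue.
check_loader_output() {
    local output
    output="$(cat)"
    if printf '%s\n' "$output" | grep -q '^0 statements generated\.'; then
        echo "sstableloader generated no statements; check the directory layout" >&2
        return 1
    fi
    return 0
}
```

It could be used as `sstableloader -d localhost ... 2>&1 | check_loader_output`, so that an empty load fails the calling script instead of passing quietly.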
### Workaround
$ mkdir -p /tmp/ks/tb/
$ cp -r /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/* /tmp/ks/tb/
$ sstableloader -d localhost --ignore-missing-columns v2 /tmp/ks/tb/
Using /etc/scylla/scylla.yaml as the config file
===== Using optimized driver!!! =====
100% done. 3 statements sent (in 2 batches, 0 failed).
3 statements generated.
6 cql rows processed in 3 partitions.
0 cql rows and 0 partitions deleted.
0 local and 0 remote counter shards where skipped.
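The workaround can be wrapped in a small script so the keyspace and table names do not have to be typed by hand. This is a rough sketch, not part of Scylla's tooling: `stage_snapshot` is a hypothetical helper, and it assumes the snapshot lives under `<data>/<keyspace>/<table>-<uuid>/snapshots/<tag>` as in the paths above.

```shell
# Hypothetical helper: copy a snapshot into <staging_root>/<keyspace>/<table>
# and print the resulting directory, which can then be passed to sstableloader.
# Assumes the layout <data>/<keyspace>/<table>-<uuid>/snapshots/<tag>.
stage_snapshot() {
    local snapshot_dir="${1%/}"   # drop one trailing slash, if present
    local staging_root="$2"
    local table_dir keyspace table target

    # Walk up: <tag> -> snapshots -> <table>-<uuid>
    table_dir="$(dirname "$(dirname "$snapshot_dir")")"
    keyspace="$(basename "$(dirname "$table_dir")")"
    table="$(basename "$table_dir")"
    table="${table%-*}"           # strip the "-<uuid>" suffix

    target="$staging_root/$keyspace/$table"
    mkdir -p "$target" || return 1
    # Copy the directory contents, not the directory itself.
    cp -r "$snapshot_dir"/. "$target"/ || return 1
    printf '%s\n' "$target"
}
```

For example, `sstableloader -d localhost --ignore-missing-columns v2 "$(stage_snapshot /var/lib/scylla/data/ks/tb-e9ae67902a3011eb85ea000000000000/snapshots/1605767537777/ /tmp)"` would reproduce the manual steps above.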
/CC @juliayakovlev @elcallio @roydahan
@slivne can you please assign?
Wait, is the issue here that files in "snapshot" cannot be loaded? This is expected, as directory structure is used to determine keyspace and cf.
> Wait, is the issue here that files in "snapshot" cannot be loaded?
Yes.
> This is expected, as directory structure is used to determine keyspace and cf.
To be honest, the UX is not good.
That is very true. But an unfortunate result of how sstables work. Maybe newer formats contain more metadata (such as keyspace etc), but if not (or older version sstables) there is not all that much one can do beyond awkward guessing. I guess we could add -ks and -cf switches to avoid having to look at dir structure though.
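To illustrate why the snapshot path yields nothing, here is a small sketch of the kind of inference described above. It assumes, without having checked the loader source, that the keyspace and table are taken from the last two path components; `infer_ks_cf` is a hypothetical name used only for this example.

```shell
# Hypothetical helper: print the keyspace and table that would be inferred
# if the last two path components are used (an assumption, see above).
infer_ks_cf() {
    local dir="${1%/}"            # drop one trailing slash, if present
    local ks cf
    cf="$(basename "$dir")"
    ks="$(basename "$(dirname "$dir")")"
    printf 'keyspace=%s table=%s\n' "$ks" "$cf"
}
```

Under that assumption, `/tmp/ks/tb/` maps to keyspace `ks` and table `tb`, while the snapshot directory maps to keyspace `snapshots` and table `1605767537777`, neither of which exists in the cluster.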
No help from md format...
@elcallio please close it if it's not relevant anymore
I'm reopening this for the moment because closing this issue has activated some broken dtests (marked with @pytest.mark.require("scylladb/scylla-tools-java#216")) and broke CI.
Doh. Sorry about that. Will closing as "not planned" work?
> Doh. Sorry about that. Will closing as "not planned" work?
No. It has to stay open until someone fixes or deletes the broken dtests.