scylla-tools-java icon indicating copy to clipboard operation
scylla-tools-java copied to clipboard

backport uuid identifier related changes to support uuid based sstable identifiers

Open tchaikov opened this issue 1 year ago • 12 comments

uuid-based sstable identifier was introduced by Cassandra upstream in https://github.com/apache/cassandra/commit/0040fea3797ea3e497691e9d1e2660711c60ac4d. but our fork was based 3.x. so to support this feature, we need to merge the upstream changes or rebase on top of it. or, as put by Avi in https://github.com/scylladb/scylladb/pull/13932#issuecomment-1582277800, quoted here as well

We should probably drop sstableloader. We might develop a standalone format converter from Cassandra formats to our format, based on the Cassandra codebase. So migration would use either the Spark migrator, or (Cassandra sstables -> Scylla sstables) -> load'n'stream. This gives us one less reason to keep having tools/java/.

Note that Cassandra is now at "oa" format: https://github.com/apache/cassandra/commit/f16fb6765b8a3ff8f49accf61c908791520c0d6e.

without this change following test would be failing after enabling the uuid_sstable_identifier_enabled option introduced by https://github.com/scylladb/scylladb/pull/13932

the tests above are only a subset. following tools should be updated

  • sstabledump
  • sstableloader
  • sstableverify
  • sstablemetadata

tchaikov avatar Jun 08 '23 09:06 tchaikov

see also https://github.com/scylladb/scylladb/pull/13932

tchaikov avatar Jun 08 '23 09:06 tchaikov

I want to patch out sstabledump from all dtests. But I don't know when I will get around to do that, and certainly don't want to block the uuid migration on this.

denesb avatar Jun 14 '23 09:06 denesb

@denesb hi Botond, thank you for the remarks! the recent changes of

  • https://github.com/scylladb/scylla-dtest/pull/3220
  • https://github.com/scylladb/scylla-dtest/pull/3232

can enable us to proceed without being blocked by this issue or your initiative to ditch sstabledump. and your change will allow revert some of these dtest changes.

tchaikov avatar Jun 14 '23 09:06 tchaikov

@denesb hi Botond, i am going to add a wrapper around scylla dump-data to replace sstabledump, what do you think? i think once we have https://github.com/scylladb/scylladb/pull/14726 . we will be able to drop sstabledump.

tchaikov avatar Jul 24 '23 08:07 tchaikov

sstablemetadata also fails on uuid-based sstable identifier

juliayakovlev avatar Jul 26 '23 12:07 juliayakovlev

sstablemetadata also fails on uuid-based sstable identifier

thank you. added to the list.

tchaikov avatar Jul 27 '23 09:07 tchaikov

an alternative of backport uuid identifier changes is to implement it right in scylla. see https://github.com/scylladb/scylladb/issues/14856

tchaikov avatar Jul 27 '23 09:07 tchaikov

@tchaikov

this is breaking rolling upgrades tests as well https://github.com/scylladb/scylla-cluster-tests/blob/6e3d34cd1b9fc6533a4e5a3874fd36ad401b9a81/upgrade_test.py#L698

and a test for GC of tomestones: https://github.com/scylladb/scylla-cluster-tests/blob/6e3d34cd1b9fc6533a4e5a3874fd36ad401b9a81/longevity_tombstone_gc_test.py#L42

fruch avatar Jul 30 '23 08:07 fruch

@denesb hi Botond, i am going to add a wrapper around scylla dump-data to replace sstabledump, what do you think? i think once we have scylladb/scylladb#14726 . we will be able to drop sstabledump.

@tchaikov - is this still the plan?

mykaul avatar Nov 23 '23 14:11 mykaul

@mykaul no, in an offline discussion with Botond, he warned me that the output format of scylla dump-data was different from that of sstabledump. so unless we translate the former into the latter in the wrapper i imagined, it's a no-go. so we are actively replacing sstabledump in our tests. and mark it deprecated, and then plan to remove it in favor of scylla sstable after a grace period.

tchaikov avatar Nov 23 '23 14:11 tchaikov

@mykaul no, in an offline discussion with Botond, he warned me that the output format of scylla dump-data was different from that of sstabledump. so unless we translate the former into the latter in the wrapper i imagined, it's a no-go. so we are actively replacing sstabledump in our tests. and mark it deprecated, and then plan to remove it in favor of scylla sstable after a grace period.

I don't mind deprecating it, but it has docs implications. https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/sstabledump.html for example.

mykaul avatar Nov 23 '23 14:11 mykaul

https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/sstabledump.html

@mykaul hi Yaniv, please take a look at the master version https://opensource.docs.scylladb.com/master/operating-scylla/admin-tools/sstabledump.html . we are deprecating it.

tchaikov avatar Nov 23 '23 14:11 tchaikov