scylladb icon indicating copy to clipboard operation
scylladb copied to clipboard

Preliminary changes for multiple Compaction Groups

Open raphaelsc opened this issue 1 year ago • 5 comments

What's contained in this series:

  • Refactored compaction tests (and utilities) for integration with multiple groups
    • The idea is to write a new class of tests that will stress multiple groups, whereas the existing ones will still stress a single group.
  • Fixed a problem when cloning compound sstable set (cannot be triggered today so I didn't open a GH issue)
  • Many changes in replica::table for allowing integration with multiple groups

Next:

  • Introduce for_each_compaction_group() for iterating over groups wherever needed.
  • Use for_each_compaction_group() in replica::table operations spanning all groups (API, readers, etc).
  • Decouple backlog tracker from compaction strategy, to allow for backlog isolation across groups
  • Introduce static option for defining number of compaction groups and implement function to map a token to its respective group.
  • Testing infrastructure for multiple compaction groups (helpful when testing the dynamic behavior: i.e. merging / splitting).

raphaelsc avatar Sep 20 '22 22:09 raphaelsc

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2328/

scylladb-promoter avatar Sep 21 '22 01:09 scylladb-promoter

Looks good, just one question. Maybe we should rename column_family_for_tests to table_for_tests. Can be a follow-up.

Agree 100%. column_family is deprecated now and it's confusing that some places still use column_family whereas others table. Look at replica/database.hh and you'll see classes like column_family_stats and table_stats, and both being used by replica::table :-(

raphaelsc avatar Sep 21 '22 12:09 raphaelsc

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2328/

There are some tests failing, please look into it.

denesb avatar Sep 22 '22 03:09 denesb

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2328/

There are some tests failing, please look into it.

Will fix and resend. thanks

raphaelsc avatar Sep 22 '22 13:09 raphaelsc

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2357/

scylladb-promoter avatar Sep 22 '22 19:09 scylladb-promoter

CI state SUCCESS - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2358/

scylladb-promoter avatar Sep 22 '22 23:09 scylladb-promoter

Where changelog?

denesb avatar Sep 23 '22 05:09 denesb

Where changelog?

v2:

raphaelsc avatar Sep 23 '22 12:09 raphaelsc

Looks good, I left a couple of questions.

What about backlog computations? I think they need to be done per compaction group now (and summed to get the overall backlog), but the memory input needs to be divided by the number of compaction groups to account for the fact that each LSM tree is smaller by that factor.

avikivity avatar Sep 25 '22 14:09 avikivity

Looks good, I left a couple of questions.

What about backlog computations? I think they need to be done per compaction group now (and summed to get the overall backlog), but the memory input needs to be divided by the number of compaction groups to account for the fact that each LSM tree is smaller by that factor.

Agree. I said in the cover letter that backlog calculation per group will come next. The series is more than 10 patches already so I decided to split even the preliminary work. The next series should contain the backlog work in addition to introducing static groups via config option.

raphaelsc avatar Sep 25 '22 14:09 raphaelsc

Where's the extracted fix? I assume we should merge it first?

avikivity avatar Sep 28 '22 14:09 avikivity

Where's the extracted fix? I assume we should merge it first?

Oops, missed ur question. Yes, it should be merged first. Extracting it now and will open a GH issue too.

raphaelsc avatar Oct 01 '22 14:10 raphaelsc

Looks good, I left a couple of questions.

What about backlog computations? I think they need to be done per compaction group now (and summed to get the overall backlog), but the memory input needs to be divided by the number of compaction groups to account for the fact that each LSM tree is smaller by that factor.

With a single group, a table produces backlog B

If a table has N groups, each group's LSM tree will be smaller by N (with efficient partitioning, I'd expect that L0 sstables will be smaller by N as memtable size for each group will be smaller by N too).

So the backlog for each group is B / N

But when summing the backlog of all groups, we have B / N * N = B.

The normalization happens on the sum of backlog of all trackers, so I don't think we have to account for number of groups when normalizing the backlog.

But, my intention (as documented in the cover letter) is to decouple the backlog tracker from compaction_strategy, allowing a given table to have multiple trackers (one for each group)

raphaelsc avatar Oct 01 '22 17:10 raphaelsc

Where's the extracted fix? I assume we should merge it first?

Oops, missed ur question. Yes, it should be merged first. Extracting it now and will open a GH issue too.

Done: https://github.com/scylladb/scylladb/pull/11682

raphaelsc avatar Oct 01 '22 17:10 raphaelsc

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2570/

scylladb-promoter avatar Oct 05 '22 20:10 scylladb-promoter

v3: rebased

raphaelsc avatar Oct 06 '22 00:10 raphaelsc

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2571/

scylladb-promoter avatar Oct 06 '22 03:10 scylladb-promoter

Test Result (2 failures / ±0) test_topology.test_remove_node_add_column non-boost tests.test_topology.release.2415

@kbr- does it ring a bell to you?

raphaelsc avatar Oct 06 '22 13:10 raphaelsc

We suspect this PR will fix it: https://github.com/scylladb/scylladb/pull/11691 It's currently in review.

kbr- avatar Oct 06 '22 13:10 kbr-

We suspect this PR will fix it: #11691 It's currently in review.

thanks.

raphaelsc avatar Oct 06 '22 13:10 raphaelsc