scylla-ccm icon indicating copy to clipboard operation
scylla-ccm copied to clipboard

node: stress_process: specifiy -schema replication(strategy=NetworkTopologyStrategy) by default

Open bhalevy opened this issue 1 year ago • 10 comments

Currently stress creates the keyspace with SimpleReplication which is deprecated and inhibits tablets.

This change passes the strategy=NetworkTopologyStrategy replication option by default, if it is not otherwise specified.

bhalevy avatar Jul 03 '24 13:07 bhalevy

For example, see the test_refresh_and_restart_after_compaction_strategy_change and test_reshard_after_compaction_strategy_and_smp_change dtests:

        node1.stress(["write", "n=0", "no-warmup", "-schema", "replication(factor=1)", "-rate", "threads=1"])

bhalevy avatar Jul 03 '24 13:07 bhalevy

@fruch what python version do we use ccm ci? So we need backward compatibility?

bhalevy avatar Jul 03 '24 18:07 bhalevy

@fruch what puthin version do we use ccm ci? So we need backward compatibility?

as the CI says, we support python3.8-3.12 (same compatibility of the python-driver that use ccm in it's tests)

so no usage of walrus operator

fruch avatar Jul 03 '24 20:07 fruch

@bhalevy

few comments:

  1. soon we'll need to extract the c-s usage out of ccm - i.e. once c-s won't be part of unified package - and it would be packaged on it's own
  2. lets raise an issue in java-tools repo (until c-s would be on it's own repo) to do this on it's own by default

fruch avatar Jul 03 '24 20:07 fruch

@bhalevy

also, why aren't we fixing the dtest to do what we expect ? wouldn't it be easier to reason about ?

fruch avatar Jul 03 '24 20:07 fruch

@bhalevy

also, why aren't we fixing the dtest to do what we expect ? wouldn't it be easier to reason about ?

Not all dtest specify the schema for c-stress as they don't care. We can do the wrapping in dtest rather than in ccm, but currently we still call node.stress(), that already manipulates the stress command line args, so it seems like the right place to do so.

bhalevy avatar Jul 04 '24 09:07 bhalevy

  • fixed the match and replace algorithm
  • and searching for "NetworkTopologyStrategy" in opt
  • refrain from using the walrus operator

bhalevy avatar Jul 04 '24 09:07 bhalevy

@bhalevy

few comments:

  1. soon we'll need to extract the c-s usage out of ccm - i.e. once c-s won't be part of unified package - and it would be packaged on it's own
  2. lets raise an issue in java-tools repo (until c-s would be on it's own repo) to do this on it's own by default

https://github.com/scylladb/scylla-tools-java/issues/400

bhalevy avatar Jul 04 '24 10:07 bhalevy

BTW, for dtest I'm not sure we need to use cassandra-stress at all. We could other tools, or just provide a simple load-generator in python that would use the python driver. I don't think there's anything in particular in scylla-stress that we need in dtest...

bhalevy avatar Jul 04 '24 10:07 bhalevy

Seen a problem in e.g. test_nodetool_snapshot_during_major_compaction. Adding another -schema option for the replication strategy is not possible.

13:10:19,742 1838328 ccm                            DEBUG    cluster.py          :754  | test_nodetool_snapshot_during_major_compaction: node1: Executing ['/home/bhalevy/.ccm/scylla-repository/local_tarball/scylla-core-package/scylla-tools/tools/bin/cassandra-stress', 'write', 'n=1000000', '-rate', 'threads=10', '-schema', 'compaction(strategy=SizeTieredCompactionStrategy,enabled=false)', '-node', '127.0.86.1', '-port', 'jmx=7199', '-schema', 'replication(strategy=NetworkTopologyStrategy,factor=1)']
13:10:20,275 1838328 errors                         ERROR    conftest.py         :202  | test_nodetool_snapshot_during_major_compaction: test failed: 
self = <snapshot_test.TestSnapshot object at 0x7fac7c3df800>

    def test_nodetool_snapshot_during_major_compaction(self):
        def run_compaction(node):
            logger.info("Start compaction by command")
            node.compact()
            logger.info("Compaction done")
    
        cluster = self.cluster
        cluster.populate(1).start()
        node1 = cluster.nodelist()[0]
    
        ks_name = "keyspace1"
        table_name = "standard1"
    
        logger.info("Run stress command")
        num_keys = 1000000 if self.cluster.scylla_mode != "debug" else 10000
>       results = node1.stress(["write", f"n={num_keys}", "-rate", "threads=10", "-schema", "compaction(strategy=SizeTieredCompactionStrategy,enabled=false)"])

snapshot_test.py:431: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../.pyenv/versions/3.12.3/envs/dtest-3.12.3/lib/python3.12/site-packages/ccmlib/node.py:1342: in stress
    return handle_external_tool_process(p, ['stress'] + stress_options)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

process = <Popen: returncode: 1 args: ['/home/bhalevy/.ccm/scylla-repository/local_tar...>
cmd_args = ['stress', 'write', 'n=1000000', '-rate', 'threads=10', '-schema', ...]

    def handle_external_tool_process(process, cmd_args):
        out, err = process.communicate()
        if (out is not None) and isinstance(out, bytes):
            out = out.decode()
        if (err is not None) and isinstance(err, bytes):
            err = err.decode()
        rc = process.returncode
    
        if rc != 0:
>           raise ToolError(cmd_args, rc, out, err)
E           ccmlib.node.ToolError: Subprocess ['stress', 'write', 'n=1000000', '-rate', 'threads=10', '-schema', 'compaction(strategy=SizeTieredCompactionStrategy,enabled=false)'] exited with non-zero status; exit status: 1; 
E           stdout: -schema is defined multiple times. Each option/command can be specified at most once.

bhalevy avatar Jul 07 '24 10:07 bhalevy

Closing now that it's the default in cassandra-stress

bhalevy avatar Aug 18 '24 09:08 bhalevy