
[x86_64, debug] topology_custom/test_topology_failure_recovery failed with ConfigurationException

Open scylladb-promoter opened this issue 1 year ago • 24 comments

Although the test fails with ConfigurationException, the real error is that the node on which the DROP KEYSPACE statement is executed crashes, due to on_internal_error - compaction not disabled on table test.test during TRUNCATE - which causes the driver to retry the statement on another node. Whether the retry itself is a bug is a good question, but this issue will be dedicated to the truncate problem. The truncate error happens because a tablet migration races with the truncate; see https://github.com/scylladb/scylladb/issues/20060#issuecomment-2310220041

I will leave the title unchanged because it makes this issue easy to find, for people wanting to annotate CI failures.


https://jenkins.scylladb.com/job/scylla-master/job/next/8046/ failed with the following error:


=================================== FAILURES ===================================
_______________ test_tablet_drain_failure_during_decommission.4 ________________

manager = <test.pylib.manager_client.ManagerClient object at 0x7f2a607b0560>

    @pytest.mark.asyncio
    @skip_mode('release', 'error injections are not supported in release mode')
    async def test_tablet_drain_failure_during_decommission(manager: ManagerClient):
        cfg = {'enable_user_defined_functions': False, 'enable_tablets': True}
        servers = [await manager.server_add(config=cfg) for _ in range(3)]
    
        logs = [await manager.server_open_log(srv.server_id) for srv in servers]
        marks = [await log.mark() for log in logs]
    
        cql = manager.get_cql()
        await cql.run_async("CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1} AND tablets = {'initial': 32};")
        await cql.run_async("CREATE TABLE test.test (pk int PRIMARY KEY, c int);")
    
        logger.info("Populating table")
    
        keys = range(256)
        await asyncio.gather(*[cql.run_async(f"INSERT INTO test.test (pk, c) VALUES ({k}, {k});") for k in keys])
    
        await inject_error_on(manager, "stream_tablet_fail_on_drain", servers)
    
        await manager.decommission_node(servers[2].server_id, expected_error="Decommission failed. See earlier errors")
    
        matches = [await log.grep("raft_topology - rollback.*after decommissioning failure, moving transition state to rollback to normal",
                   from_mark=mark) for log, mark in zip(logs, marks)]
        assert sum(len(x) for x in matches) == 1
    
>       await cql.run_async("DROP KEYSPACE test;")
E       cassandra.protocol.ConfigurationException: <Error from server: code=2300 [Query invalid because of configuration issue] message="Cannot drop non existing keyspace 'test'.">

test/topology_custom/test_topology_failure_recovery.py:46: ConfigurationException
------------------------------ Captured log setup ------------------------------

scylladb-promoter avatar Aug 07 '24 13:08 scylladb-promoter

Dup of https://github.com/scylladb/scylladb/issues/19978 ?

mykaul avatar Aug 08 '24 10:08 mykaul

Dup of #19978 ?

i think so.

tchaikov avatar Aug 15 '24 05:08 tchaikov

Dup of #19978 ?

i think so.

This is not a duplicate of #19978: that one is related to truncate, while here we fail to drop the test keyspace, probably because we somehow attempt to drop it more than once.

denesb avatar Aug 21 '24 06:08 denesb

Seen again in https://jenkins.scylladb.com/job/scylla-6.1/job/scylla-ci/35/testReport/junit/(root)/test_topology_failure_recovery/Tests___Unit_Tests___test_tablet_drain_failure_during_decommission_dev_6/.

test_tablet_drain_failure_during_decommission.6.zip

denesb avatar Aug 21 '24 07:08 denesb

seen again on https://jenkins.scylladb.com/job/scylla-6.1/job/next/64/testReport/junit/(root)/test_topology_failure_recovery/Build___aarch64___Unit_Tests_aarch64___test_tablet_drain_failure_during_decommission_dev_2/

Annamikhlin avatar Aug 25 '24 09:08 Annamikhlin

I'm investigating this based on https://github.com/scylladb/scylladb/issues/20060#issuecomment-2301282313.

Just as I suspected, we apparently attempt to drop the keyspace twice. We can see the successful (first) attempt here (node 2714):

937  INFO  2024-08-19 17:01:45,497 [shard 0:stmt] migration_manager - Drop Keyspace 'test'                                                                                                                                                                     
[...]
947  INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping keyspace test                                                                                                                                                                       
   1 INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping test.test id=912181d0-5e33-11ef-835e-b9cdd1dc93ca version=912181d1-5e33-11ef-835e-b9cdd1dc93ca                                                                                      
   2 INFO  2024-08-19 17:01:45,499 [shard 0: gms] database - Dropping test.test without snapshot

Other nodes also receive the update and execute the keyspace and table drop. node-2707:

1652 INFO  2024-08-19 17:01:45,498 [shard 0: gms] schema_tables - Dropping keyspace test                                                                                                                                                                       
   1 INFO  2024-08-19 17:01:45,498 [shard 0: gms] schema_tables - Dropping test.test id=912181d0-5e33-11ef-835e-b9cdd1dc93ca version=912181d1-5e33-11ef-835e-b9cdd1dc93ca                                                                                      
   2 INFO  2024-08-19 17:01:45,498 [shard 0: gms] database - Dropping test.test without snapshot                                                                                                                                                               
   3 INFO  2024-08-19 17:01:45,498 [shard 0: gms] database - Truncating test.test without snapshot  

node-2708:

996  INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping keyspace test                                                                                                                                                                       
   1 INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping test.test id=912181d0-5e33-11ef-835e-b9cdd1dc93ca version=912181d1-5e33-11ef-835e-b9cdd1dc93ca                                                                                      
   2 INFO  2024-08-19 17:01:45,499 [shard 0: gms] database - Dropping test.test without snapshot                                                                                                                                                               
   3 INFO  2024-08-19 17:01:45,499 [shard 0: gms] database - Truncating test.test without snapshot  

The driver is notified as well:

177 17:01:45.512 DEBUG> Message pushed from server: <EventMessage(event_type='SCHEMA_CHANGE', event_args={'target_type': 'TABLE', 'change_type': 'DROPPED', 'keyspace': 'test', 'table': 'test'}, stream_id=-1, trace_id=None)>                                
  1 17:01:45.512 DEBUG> Message pushed from server: <EventMessage(event_type='SCHEMA_CHANGE', event_args={'target_type': 'KEYSPACE', 'change_type': 'DROPPED', 'keyspace': 'test'}, stream_id=-1, trace_id=None)>  

Despite all this, apparently there is a second attempt to drop this keyspace. The exception we see on the driver side is:

29  >       await cql.run_async("DROP KEYSPACE test;")                                                                                                                                                                                                         
  1 E       cassandra.protocol.ConfigurationException: <Error from server: code=2300 [Query invalid because of configuration issue] message="Cannot drop non existing keyspace 'test'.">                                                                       
  2                                                                                                                                                                                                                                                            
  3 test/topology_custom/test_topology_failure_recovery.py:46: ConfigurationException

The exception comes from prepare_keyspace_drop_announcement() in service/migration_manager.cc, which is called from the DROP KEYSPACE CQL statement.

On the driver side, I can see this, right after the keyspace/table drop events:

179 17:01:45.699 DEBUG> Connection <CustomConnection(140256467195648) 127.74.236.29:9042> closed by server                                                                                                                                                     
  1 17:01:45.699 DEBUG> Closing connection (140256467195648) to 127.74.236.29:9042                                                                                                                                                                             
  2 17:01:45.699 DEBUG> Closed socket to 127.74.236.29:9042                                                                                                                                                                                                    
  3 17:01:45.699 DEBUG> Defunct or closed connection (140256467195648) returned to pool, potentially marking host 127.74.236.29:9042 as down                                                                                                                   
  4 17:01:45.699 DEBUG> Shutting down connections to 127.74.236.29:9042                                                                                                                                                                                        
  5 17:01:45.700 DEBUG> Closing connection (140256476855184) to 127.74.236.29:9042                                                                                                                                                                             
  6 17:01:45.700 DEBUG> Closing connection (140256476855184) to 127.74.236.29:9042                                                                                                                                                                             
  7 17:01:45.700 DEBUG> Closed socket to 127.74.236.29:9042                                                                                                                                                                                                    
  8 17:01:45.700 DEBUG> Closing connection (140256467195648) to 127.74.236.29:9042                                                                                                                                                                             
  9 17:01:45.700 DEBUG> Closing pending connection (140256467195648) to 127.74.236.29:9042                                                                                                                                                                     
 10 17:01:45.700 WARNING> Host 127.74.236.29:9042 has been marked down                                                                                                                                                                                         
 11 17:01:45.700 DEBUG> Removed connection pool for <Host: 127.74.236.29:9042 datacenter1>                                                                                                                                                                     
 12 17:01:45.700 DEBUG> Starting reconnector for host 127.74.236.29:9042                                                                                                                                                                                       
 13 17:01:45.700 DEBUG> Shutting down connections to 127.74.236.29:9042 

Looks like the server closed the connection. Indeed (facepalm), node 2714 crashes with on_internal_error():

966  ERROR 2024-08-19 17:01:45,507 [shard 0: gms] on_internal_error - compaction not disabled on table test.test during TRUNCATE, at: /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54f6ce /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54fcf0 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54fff8 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x39dde7 0x1601310 0x137e48d 0x12dad5e 0x12f6e31 0x1042530 0x10423e6 0x10422fa 0x104224b /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fc7d /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fbdf 0x10421a9 0x130df6a 0x1012c9a /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3e923f /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3ea5fa /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3eb697 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3eab98 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x35e8a6 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x35dc89 0xfb8401 0xff5fe0 0xfb71af /lib64/libc.so.6+0x2a087 /lib64/libc.so.6+0x2a14a 0xfb5ca4
   1    --------                                                                                                                                                                                                                                               
   2    seastar::internal::coroutine_traits_base<void>::promise_type                                                                                                                                                                                           
   3    --------                                                                                                                                                                                                                                               
   4    seastar::internal::coroutine_traits_base<void>::promise_type                                                                                                                                                                                           
   5    --------                                                                                                                                                                                                                                               
   6    seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> >(seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
   7 Aborting on shard 0, in scheduling group gossip.                                                                                                                                                                                                          
   8 Backtrace:                                                                                                                                                                                                                                                
   9   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x412864                                                                                                                                                                 
  10   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3dbca5                                                                                                                                                                 
  11   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x406ac6                                                                                                                                                                 
  12   /lib64/libc.so.6+0x40cff                                                                                                                                                                                                                                
  13   /lib64/libc.so.6+0x994a3                                                                                                                                                                                                                                
  14   /lib64/libc.so.6+0x40c4d                                                                                                                                                                                                                                
  15   /lib64/libc.so.6+0x28901                                                                                                                                                                                                                                
  16   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x39de6a                                                                                                                                                                 
  17   0x1601310                                                                                                                                                                                                                                               
  18   0x137e48d                                                                                                                                                                                                                                               
  19   0x12dad5e                                                                                                                                                                                                                                               
  20   0x12f6e31                                                                                                                                                                                                                                               
  21   0x1042530                                                                                                                                                                                                                                               
  22   0x10423e6                                                                                                                                                                                                                                               
  23   0x10422fa                                                                                                                                                                                                                                               
  24   0x104224b                                                                                                                                                                                                                                               
  25   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fc7d                                                                                                                                                                 
  26   /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fbdf

So this is a duplicate of https://github.com/scylladb/scylladb/issues/19978 after all. That said, the driver re-sending the keyspace drop does smell like a bug to me.
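
To make the retry scenario concrete, here is a minimal, illustrative Python sketch (the `Cluster` and `run_with_retry` names are made up, not the actual python-driver or Scylla code): the first DROP takes effect cluster-wide, the coordinator dies before replying, and the retry on another node surfaces the ConfigurationException.

```python
# Illustrative sketch only: these classes are stand-ins, not the actual
# python-driver or Scylla implementation.

class ConfigurationException(Exception):
    pass

class Cluster:
    def __init__(self):
        self.keyspaces = {"test"}

    def drop_keyspace(self, name, crash_before_reply=False):
        if name not in self.keyspaces:
            raise ConfigurationException(f"Cannot drop non existing keyspace '{name}'.")
        self.keyspaces.discard(name)  # the drop takes effect cluster-wide
        if crash_before_reply:
            # the coordinator crashes (on_internal_error) before the client
            # sees a response, so the connection is closed
            raise ConnectionError("connection closed by server")

def run_with_retry(cluster, name):
    try:
        cluster.drop_keyspace(name, crash_before_reply=True)
    except ConnectionError:
        # the driver re-sends the non-idempotent DDL to another node;
        # the keyspace is already gone, so the retry fails
        cluster.drop_keyspace(name)

err = None
cluster = Cluster()
try:
    run_with_retry(cluster, "test")
except ConfigurationException as exc:
    err = str(exc)
print(err)  # Cannot drop non existing keyspace 'test'.
```

This is why drivers usually only auto-retry statements marked idempotent; a DDL retried after a connection drop can observe the effect of its own first attempt.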

denesb avatar Aug 26 '24 13:08 denesb

Looking at where this error comes from (replica/table.cc, around line 2759):

future<db::replay_position> table::discard_sstables(db_clock::time_point truncated_at) {
    // truncate_table_on_all_shards() disables compaction for the truncated
    // tables and views, so we normally expect compaction to be disabled on
    // this table. But as shown in issue #17543, it is possible that a new
    // materialized view was created right after truncation started, and it
    // would not have compaction disabled when this function is called on it.
    if (!schema()->is_view()) {
        auto compaction_disabled = std::ranges::all_of(storage_groups() | boost::adaptors::map_values,
                                                       std::mem_fn(&storage_group::compaction_disabled));
        if (!compaction_disabled) {
            utils::on_internal_error(fmt::format("compaction not disabled on table {}.{} during TRUNCATE",
                schema()->ks_name(), schema()->cf_name()));
        }
    }

We explode because at least one storage group has compaction enabled when discard_sstables() is called. This method is called from replica::database::truncate(), which is the last thing we do in database::truncate_table_on_all_shards(). Earlier in truncate_table_on_all_shards() (around line 2471), we disabled compaction on all storage groups:

            co_await cf.parallel_foreach_table_state([&cm, &st] (compaction::table_state& ts) -> future<> {
                st->cres.emplace_back(co_await cm.stop_and_disable_compaction(ts));
            });

However, looking at the logs, we can see that there is a tablet migration going on, concurrent to the truncate:

947  INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping keyspace test
   1 INFO  2024-08-19 17:01:45,499 [shard 0: gms] schema_tables - Dropping test.test id=912181d0-5e33-11ef-835e-b9cdd1dc93ca version=912181d1-5e33-11ef-835e-b9cdd1dc93ca
   2 INFO  2024-08-19 17:01:45,499 [shard 0: gms] database - Dropping test.test without snapshot
   3 INFO  2024-08-19 17:01:45,499 [shard 1:strm] stream_session - [Stream #91d7b271-5e33-11ef-aa06-a6598c27c17c] Write to sstable for ks=test, cf=test, estimated_partitions=128, received_partitions=4
   4 INFO  2024-08-19 17:01:45,499 [shard 0:strm] stream_session - [Stream #91d7b271-5e33-11ef-aa06-a6598c27c17c] Streaming plan for Tablet migration-test-index-0 succeeded, peers={127.74.236.14}, tx=0 KiB, 0.00 KiB/s, rx=0 KiB, 197.10 KiB/s
   5 INFO  2024-08-19 17:01:45,499 [shard 0:strm] range_streamer - Finished 1 out of 1 ranges for tablet migration, finished percentage=1
   6 INFO  2024-08-19 17:01:45,499 [shard 0:strm] range_streamer - Tablet migration with 127.74.236.14 for keyspace=test succeeded, took 0.00415845 seconds
   7 INFO  2024-08-19 17:01:45,499 [shard 0:strm] range_streamer - Tablet migration succeeded, took 0 seconds, nr_ranges_remaining=0
   8 WARN  2024-08-19 17:01:45,499 [shard 0:strm] raft_topology - Streaming for tablet migration of 912181d0-5e33-11ef-835e-b9cdd1dc93ca:22 failed: data_dictionary::no_such_column_family (Can't find a column family with UUID 912181d0-5e33-11ef-835e-b9cdd1dc93ca)
   9 INFO  2024-08-19 17:01:45,500 [shard 0:strm] sstable - Rebuilding bloom filter /scylladir/testlog/x86_64/dev/scylla-2714/data/test/test-912181d05e3311ef835eb9cdd1dc93ca/me-3git_12yx_2yb5s2l2yrz6lzwjgc-big-Filter.db: resizing bitset from 168 bytes to 16 bytes. sstable origin: streaming
  10 INFO  2024-08-19 17:01:45,504 [shard 0:comp] compaction - [Compact system.cdc_local 91d78b60-5e33-11ef-aa06-a6598c27c17c] Compacted 3 sstables to [/scylladir/testlog/x86_64/dev/scylla-2714/data/system/cdc_local-0bcaffd40c833eadad13dc1d5015b77c/me-3git_12yx_2xvq92l2yrz6lzwjgc-big-Data.db:level=0]. 16kB to 5345 bytes (~33% of original) in 7ms = 2MB/s. ~384 total partitions merged to 1.
  11 INFO  2024-08-19 17:01:45,504 [shard 0:comp] compaction - [Compact system.tablets 91d91200-5e33-11ef-aa06-a6598c27c17c] Compacting [/scylladir/testlog/x86_64/dev/scylla-2714/data/system/tablets-fd4f7a4696bd3e7391bf99eb77e82a5c/me-3git_12yx_2dipc2l2yrz6lzwjgc-big-Data.db:level=0:origin=memtable,/scylladir/testlog/x86_64/dev/scylla-2714/data/system/tablets-fd4f7a4696bd3e7391bf99eb77e82a5c/me-3git_12yx_2d39s2l2yrz6lzwjgc-big-Data.db:level=0:origin=memtable,/scylladir/testlog/x86_64/dev/scylla-2714/data/system/tablets-fd4f7a4696bd3e7391bf99eb77e82a5c/me-3git_12yx_2e5uq2l2yrz6lzwjgc-big-Data.db:level=0:origin=memtable,/scylladir/testlog/x86_64/dev/scylla-2714/data/system/tablets-fd4f7a4696bd3e7391bf99eb77e82a5c/me-3git_12yx_2gb0g2l2yrz6lzwjgc-big-Data.db:level=0:origin=memtable,/scylladir/testlog/x86_64/dev/scylla-2714/data/system/tablets-fd4f7a4696bd3e7391bf99eb77e82a5c/me-3git_12yw_1tde82l2yrz6lzwjgc-big-Data.db:level=0:origin=memtable]
  12 INFO  2024-08-19 17:01:45,506 [shard 1:strm] stream_session - [Stream #91d7b270-5e33-11ef-aa06-a6598c27c17c] Write to sstable for ks=test, cf=test, estimated_partitions=128, received_partitions=8
  13 INFO  2024-08-19 17:01:45,506 [shard 0: gms] database - Truncating test.test without snapshot
  14 INFO  2024-08-19 17:01:45,506 [shard 0:strm] stream_session - [Stream #91d7b270-5e33-11ef-aa06-a6598c27c17c] Streaming plan for Tablet migration-test-index-0 succeeded, peers={127.74.236.17}, tx=0 KiB, 0.00 KiB/s, rx=1 KiB, 140.52 KiB/s
  15 INFO  2024-08-19 17:01:45,506 [shard 0:strm] range_streamer - Finished 1 out of 1 ranges for tablet migration, finished percentage=1
  16 INFO  2024-08-19 17:01:45,506 [shard 0:strm] range_streamer - Tablet migration with 127.74.236.17 for keyspace=test succeeded, took 0.011757634 seconds
  17 INFO  2024-08-19 17:01:45,506 [shard 0:strm] range_streamer - Tablet migration succeeded, took 0 seconds, nr_ranges_remaining=0
  18 WARN  2024-08-19 17:01:45,507 [shard 0:strm] raft_topology - Streaming for tablet migration of 912181d0-5e33-11ef-835e-b9cdd1dc93ca:27 failed: data_dictionary::no_such_column_family (Can't find a column family with UUID 912181d0-5e33-11ef-835e-b9cdd1dc93ca)
  19 ERROR 2024-08-19 17:01:45,507 [shard 0: gms] on_internal_error - compaction not disabled on table test.test during TRUNCATE, at: /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54f6ce /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54fcf0 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x54fff8 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x39dde7 0x1601310 0x137e48d 0x12dad5e 0x12f6e31 0x1042530 0x10423e6 0x10422fa 0x104224b /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fc7d /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x43fbdf 0x10421a9 0x130df6a 0x1012c9a /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3e923f /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3ea5fa /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3eb697 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x3eab98 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x35e8a6 /jenkins/workspace/scylla-6.1/scylla-ci/scylla/build/dev/seastar/libseastar.so+0x35dc89 0xfb8401 0xff5fe0 0xfb71af /lib64/libc.so.6+0x2a087 /lib64/libc.so.6+0x2a14a 0xfb5ca4

Apparently the migration finished between the call to stop compaction on all storage groups and the call to database::truncate(). The new storage group, which belongs to the migrated-in tablet, has compaction enabled.
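
The window can be sketched with a toy asyncio model (all names below are illustrative stand-ins, not Scylla's actual classes or APIs): compaction is disabled on the storage groups that exist when truncate starts, a concurrent migration then adds a fresh storage group, and the truncate-time check fires.

```python
# Toy model of the race; StorageGroup/Table are illustrative, not Scylla code.
import asyncio

class StorageGroup:
    def __init__(self):
        self.compaction_disabled = False

class Table:
    def __init__(self):
        self.storage_groups = {0: StorageGroup()}

    async def disable_compaction(self):
        # only the groups that exist *now* are covered; anything migrated
        # in after this point is missed
        for sg in list(self.storage_groups.values()):
            sg.compaction_disabled = True
        await asyncio.sleep(0)  # yield: this is the race window

    def discard_sstables(self):
        # mirrors the check in table::discard_sstables()
        if not all(sg.compaction_disabled for sg in self.storage_groups.values()):
            raise RuntimeError("compaction not disabled on table test.test during TRUNCATE")

async def migrate_in(table):
    await asyncio.sleep(0)
    # the migrated-in tablet's storage group has compaction enabled
    table.storage_groups[1] = StorageGroup()

async def main():
    table = Table()
    mig = asyncio.create_task(migrate_in(table))
    await table.disable_compaction()
    await mig                    # migration lands inside the window
    table.discard_sstables()     # raises, mirroring on_internal_error()

raced = False
try:
    asyncio.run(main())
except RuntimeError as exc:
    raced = True
    print(exc)
```

The toy `discard_sstables()` raises exactly because a storage group appeared after the disable-compaction pass, which is the shape of the failure in the logs above.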

@bhalevy

denesb avatar Aug 26 '24 13:08 denesb

https://github.com/scylladb/scylladb/pull/19789 should fix this race, so I'm assigning this issue to @nodep.

denesb avatar Aug 27 '24 03:08 denesb

Seen again https://jenkins.scylladb.com/job/scylla-6.0/job/next/210/

yaronkaikov avatar Sep 01 '24 06:09 yaronkaikov

@nodep - what's the status of this issue? It's hitting us hard in CI.

mykaul avatar Sep 10 '24 12:09 mykaul

@mykaul The PR #19789 and the design for making TRUNCATE TABLE safe with tablets was rejected. I am currently implementing TRUNCATE as a topology operation. It will not be ready for a few more weeks.

nodep avatar Sep 10 '24 12:09 nodep

> @mykaul The PR #19789 and the design for making TRUNCATE TABLE safe with tablets was rejected. I am currently implementing TRUNCATE as a topology operation. It will not be ready for a few more weeks.

I don't think it will solve this issue, unless you also make "DROP KEYSPACE" a topology operation. Here, truncate is initiated locally in reaction to processing schema change locally.

Maybe the race could be solved in the table layer, by reordering truncation and stopping so that we first stop and then truncate. Stopping will cause the migration to fail; the migration will then notice that the table was dropped and won't be retried.

tgrabiec avatar Sep 17 '24 15:09 tgrabiec

seen again on https://jenkins.scylladb.com/job/scylla-6.0/job/next/233/testReport/junit/(root)/test_topology_failure_recovery/Build___x86___Unit_Tests_x86___test_tablet_drain_failure_during_decommission_2/

Annamikhlin avatar Sep 18 '24 07:09 Annamikhlin

Seen again on https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/11694/.

denesb avatar Sep 19 '24 10:09 denesb

Wow, the bot likes opening issues.

(Screenshot from 2024-09-27 12-38-11)

kbr-scylla avatar Sep 27 '24 10:09 kbr-scylla

Seen again in https://jenkins.scylladb.com//job/scylla-master/job/scylla-ci/11956/testReport/junit/%28root%29/test_topology_failure_recovery/Tests___Unit_Tests___test_tablet_drain_failure_during_decommission_dev_5.

denesb avatar Sep 30 '24 08:09 denesb

Hit in https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/12017/testReport/junit/(root)/test_topology_failure_recovery/test_tablet_drain_failure_during_decommission_dev_2/

bhalevy avatar Sep 30 '24 11:09 bhalevy

> https://github.com/scylladb/scylladb/pull/19789 should fix this race, so I'm assigning this issue to @nodep.

@denesb @paszkow @nodep: as @tgrabiec mentioned to me (could be related to a different issue, I think with the failed assert regarding replay position), the truncate that happens here is invoked internally, in https://github.com/scylladb/scylladb/blob/5a470b2bfbe6fdb371bbde2a2f8987d6618dd265/replica/database.cc#L1037

So https://github.com/scylladb/scylladb/pull/19789 won't help much IIUC, since it deals with CQL TRUNCATE, not with this path.

bhalevy avatar Sep 30 '24 11:09 bhalevy

We may want to serialize DROP TABLE / KEYSPACE with load balancing in a similar way.

bhalevy avatar Sep 30 '24 11:09 bhalevy

Seen again:

https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/12026/testReport/junit/(root)/test_topology_failure_recovery/Tests___Unit_Tests___test_tablet_drain_failure_during_decommission_dev_4/

https://github.com/scylladb/scylladb/pull/20883

avikivity avatar Sep 30 '24 14:09 avikivity

> @mykaul The PR #19789 and the design for making TRUNCATE TABLE safe with tablets was rejected. I am currently implementing TRUNCATE as a topology operation. It will not be ready for a few more weeks.
>
> I don't think it will solve this issue, unless you also make "DROP KEYSPACE" a topology operation. Here, truncate is initiated locally in reaction to processing schema change locally.
>
> Maybe the race could be solved in the table layer, by reordering truncation and stopping so that we first stop and then truncate. Stopping will cause migration to fail which will then notice that table is dropped and won't be retried.

Good idea.

https://github.com/scylladb/scylladb/pull/20910 should fix this issue by stopping the table inside truncate_on_all_shards, after it enters the table's async_gate and before it disables compaction (which it no longer needs to do after stopping all compaction groups).

It also changes the internal error check to verify that the truncated storage groups are either stopped or have compaction disabled.

bhalevy avatar Oct 01 '24 09:10 bhalevy

seen also on https://jenkins.scylladb.com/job/scylla-enterprise/job/next/3183/testReport/junit/(root)/test_tablets/Build___x86___Unit_Tests_x86___test_topology_changes_dev_5/

Annamikhlin avatar Oct 20 '24 07:10 Annamikhlin

Apparently, the root cause for this issue is that cleanup_tablet is not synchronized with erm changes, and it calls:

```cpp
co_await stop_compaction_groups(sg);
```

Which in turn calls storage_group::stop -> compaction_group::stop, which calls:

```cpp
co_await _t._compaction_manager.remove(as_table_state(), reason);
```

```cpp
future<> compaction_manager::remove(table_state& t, sstring reason) noexcept {
    auto& c_state = get_compaction_state(&t);
    auto erase_state = defer([&t, &c_state, this] () noexcept {
        c_state.backlog_tracker->disable();
        _compaction_state.erase(&t);
    });
```

After _compaction_state.erase(&t):

```cpp
bool compaction_manager::compaction_disabled(table_state& t) const {
    return _compaction_state.contains(&t) && _compaction_state.at(&t).compaction_disabled();
}
```

compaction_disabled erroneously returns false: if the table was removed from _compaction_state, compaction is certainly not running on it, so it should be considered disabled. Anyhow, this appears to trigger the internal error in:

```cpp
future<db::replay_position> table::discard_sstables(db_clock::time_point truncated_at) {
...
        auto compaction_disabled = std::ranges::all_of(storage_groups() | std::views::values,
                                                       std::mem_fn(&storage_group::compaction_disabled));
        if (!compaction_disabled) {
            utils::on_internal_error(fmt::format("compaction not disabled on table {}.{} during TRUNCATE",
                schema()->ks_name(), schema()->cf_name()));
        }
```

bhalevy avatar Oct 28 '24 11:10 bhalevy

Failed on 2024.2: https://jenkins.scylladb.com/job/enterprise-2024.2/job/next/177/testReport/junit/(root)/test_topology_failure_recovery/Build___aarch64___Unit_Tests_aarch64___test_tablet_drain_failure_during_decommission/ - needs backport to 2024.2

Annamikhlin avatar Nov 06 '24 08:11 Annamikhlin