Julia Yakovlev
https://github.com/scylladb/scylla-bench/issues/30
longevity-twcs-3h-test, 2022.1.rc5 http://13.48.103.68/test_run/4f585541-af89-4da9-bd54-edf413637a9b

Two scylla-bench threads reported many timeout errors while compactions were running:

```
2022/05/22 16:59:18 gocql: no response received from cassandra within timeout period...
```
The issue is reproduced with @aleksbykov's fix https://github.com/scylladb/scylla-cluster-tests/pull/4834. It happened after the major_compaction nemesis: writes almost stopped and reads got more resources. Same error:

```
...
```
@roydahan @aleksbykov I suggest opening an issue in Scylla. In both runs the write load failed AFTER the major compaction nemesis. I didn't find a similar existing issue.
@roydahan @aleksbykov I have now found that the write load finished with status 0 at this time (which is what we think the problem is):

```
< t:2022-05-22 18:51:57,373 f:base.py l:228 c:RemoteLibSSH2CmdRunner p:DEBUG...
```
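To clarify what "finished with status 0" tells us: the remote runner only checks the process exit code, which stays 0 even if individual operations logged errors along the way. A hypothetical stdlib sketch of that kind of check (the `exitStatus` helper is illustrative, not SCT code):

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// exitStatus runs a command and returns its exit code — the way a test
// runner decides whether a load command "finished with status 0".
func exitStatus(name string, args ...string) int {
	cmd := exec.Command(name, args...)
	if err := cmd.Run(); err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			return exitErr.ExitCode()
		}
		return -1 // command could not be started at all
	}
	return 0
}

func main() {
	fmt.Println(exitStatus("true"))  // 0
	fmt.Println(exitStatus("false")) // 1
}
```

So a clean exit code and "the write load failed" are not contradictory: the process can exit 0 after its worker threads gave up.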
Reproducer of the twcs 3h test with @aleksbykov's recommendations:
db-cluster - https://cloudius-jenkins-test.s3.amazonaws.com/ad6a5407-459d-4141-904c-ddbaf51f70f1/20220607_113342/db-cluster-ad6a5407.tar.gz
sct - https://cloudius-jenkins-test.s3.amazonaws.com/ad6a5407-459d-4141-904c-ddbaf51f70f1/20220607_113342/sct-runner-ad6a5407.tar.gz
Job: https://jenkins.scylladb.com/job/scylla-staging/job/yulia/job/repr-longevity-twcs-3h-test/3/

I see that after the write load finished, 2 read threads continued to run, but...
> @juliayakovlev I don't understand this part:
>
> > I see that after the write load finished, 2 read threads continued to run, but no read load.
>
> How...
c-s load failed during cluster rolling restart - failed to get QUORUM, not enough replicas available
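For context on the QUORUM failure: QUORUM requires a majority of replicas, ⌊RF/2⌋ + 1, to respond. During a rolling restart one node is already down, so with RF=3 losing one more replica is enough to fail the request. A small sketch of the arithmetic (illustrative only, not ScyllaDB code):

```go
package main

import "fmt"

// quorum returns the number of replicas that must respond for a
// QUORUM read/write with the given replication factor.
func quorum(rf int) int {
	return rf/2 + 1
}

// quorumAvailable reports whether enough replicas are alive to serve QUORUM.
func quorumAvailable(rf, alive int) bool {
	return alive >= quorum(rf)
}

func main() {
	// RF=3: quorum is 2. With one node restarting and one more replica
	// unavailable, only 1 answers -> "not enough replicas available".
	fmt.Println(quorum(3))             // 2
	fmt.Println(quorumAvailable(3, 2)) // true: survives one node down
	fmt.Println(quorumAvailable(3, 1)) // false: QUORUM fails
}
```

This is why a rolling restart alone should not break QUORUM at RF=3; the error implies a second replica was simultaneously unavailable or unresponsive.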
> > Reactor stalls (32ms) and kernel callstacks
>
> @juliayakovlev - where's the kernel stack?

If you mean the original file - it's in the node logs (https://cloudius-jenkins-test.s3.amazonaws.com/a6bbb535-3cf6-4f8b-b742-40ef856170ea/20240512_082401/db-cluster-a6bbb535.tar.gz). If...
While `longevity-tls-50gb-3d-master-db-node-9adcc62d-5` was being decommissioned (`decommission_with_repair`), once the repair completed, `longevity-tls-50gb-3d-master-db-node-9adcc62d-7` got a segmentation fault and coredumped.

```
2024-05-14T04:33:13.489+00:00 longevity-tls-50gb-3d-master-db-node-9adcc62d-7 !INFO | scylla[19079]: Segmentation fault on shard 2.
2024-05-14T04:33:13.489+00:00 longevity-tls-50gb-3d-master-db-node-9adcc62d-7 !INFO | scylla[19079]: Backtrace:...
```
> @juliayakovlev - what encryption was configured here, btw? Client-server? Server-server? Both?

Both.