Add LegacySSTableTest method testVerifyOldTupleSSTables for frozen tuples

Open michaelsembwever opened this issue 1 year ago • 10 comments

What is the issue

DSP-24600 Unreadable SSTables upgrading from DSE 5.1.x to DSE 6.8.36 and up

All SSTables for a table that has a tuple (with something freezable inside it) are unreadable with newer versions. The newer version may report these SSTables as corrupt, but they are not: they have not been altered in any way, and a rollback to the previous version makes them readable again. In an online upgrade we expect this problem to be identified once the first node has been upgraded, and definitely before the whole cluster has finished upgrading; rolling back an in-progress upgrade is a normal (though not ideal) operation.

What does this PR fix and why was it fixed

Adds a unit test to reproduce the issue. The fix is @roxananeo's commit: formats me and earlier are marked as having implicitly frozen tuples.
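As a rough illustration of the idea only (not the code in the commit): Cassandra's two-letter sstable format versions order lexicographically, so a "this version and earlier" rule can be written as a string comparison. The class name, method name, and cutoff constant below are hypothetical; the cutoff "me" is taken from the description above.

```java
// Hypothetical sketch: names here are illustrative, not the actual Cassandra API.
final class ImplicitFrozenTuplesSketch {
    // Assumed cutoff, from the PR description ("formats me and earlier").
    private static final String LAST_IMPLICITLY_FROZEN = "me";

    // Two-letter format versions ("mc", "md", "me", "na", ...) sort
    // lexicographically, so "me and earlier" is a compareTo check.
    static boolean hasImplicitlyFrozenTuples(String version) {
        return version.compareTo(LAST_IMPLICITLY_FROZEN) <= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"mc", "md", "me", "na", "nb"}) {
            System.out.println(v + " -> " + hasImplicitlyFrozenTuples(v));
        }
    }
}
```

With such a flag, a reader of an old sstable would treat tuple types in its header as frozen even when the header does not say so, rather than rejecting the file as corrupt.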

Checklist before you submit for review

  • [ ] Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • [ ] Use NoSpamLogger for log lines that may appear frequently in the logs
  • [ ] Verify test results on Butler
  • [ ] Test coverage for new/modified code is > 80%
  • [ ] Proper code formatting
  • [ ] Proper title for each commit starting with the project-issue number, like CNDB-1234
  • [ ] Each commit has a meaningful description
  • [ ] Each commit is not very long and contains related changes
  • [ ] Renames, moves and reformatting are in distinct commits

michaelsembwever avatar Nov 26 '24 13:11 michaelsembwever

test failures do not look related: https://jenkins-stargazer.aws.dsinternal.org/job/ds-cassandra-pr-gate/view/change-requests/job/PR-1440/

( and butler does not appear to be working: http://butler-stargazer.aws.dsinternal.org/#/ci/upstream/compare/ds-cassandra-pr-gate/mck/DSP-24600/main )

michaelsembwever avatar Dec 02 '24 20:12 michaelsembwever

Will porting this to main-5.0 require special consideration?

djatnieks avatar Dec 06 '24 18:12 djatnieks

Will porting this to main-5.0 require special consideration?

No, main-5.0 only further raises online compatibility to big format na. And for HCD, we're not initially planning to support upgrades from C* 3.x or DSE 5.1.

Furthermore, SSTableHeaderFix has been removed altogether there, so we're converging. I need to investigate whether anything in the tests warrants upstreaming… (I would say we should be using LegacySSTableTest a lot more, given how efficient it is at catching upgrade sstable issues!)

michaelsembwever avatar Dec 06 '24 22:12 michaelsembwever

There's a few test failures here, so still wip…

michaelsembwever avatar Dec 06 '24 22:12 michaelsembwever

Quality Gate failed

Failed conditions
41.7% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

sonarqubecloud[bot] avatar Dec 07 '24 08:12 sonarqubecloud[bot]

There's a few test failures here, so still wip…

most of the failures (maybe all) are fixed…

michaelsembwever avatar Dec 07 '24 13:12 michaelsembwever

@jacek-lewandowski , given jvm-dtest-upgrades don't work from 3.x to 4.0 (because of a lack of common jvm to run on), i'm removing your last commit: https://github.com/datastax/cassandra/pull/1440/commits/745a0f06a6941470319a51feaa3a92198ea0621b

i'm going to create the OSS patch, and use the test there instead.

michaelsembwever avatar Feb 25 '25 21:02 michaelsembwever

Quality Gate failed

Failed conditions
61.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

sonarqubecloud[bot] avatar Mar 02 '25 23:03 sonarqubecloud[bot]

I cannot reproduce the BinLogTest failure locally. It is known to be flakey.

michaelsembwever avatar Mar 03 '25 14:03 michaelsembwever

Is this still planned?

Asking b/c I was hoping that when these changes made it to main-5.0 it would (help) fix the currently failing LegacySSTableTest.testVerifyOldDroppedTupleSSTables test mentioned in comment https://github.com/riptano/cndb/issues/14195#issuecomment-2902633193

djatnieks avatar Jul 01 '25 14:07 djatnieks

@djatnieks yes it is still planned. it is waiting for @jacek-lewandowski (who was on leave)

michaelsembwever avatar Jul 18 '25 08:07 michaelsembwever

force-pushed the mck/DSP-24600/main branch from ea85c04 to 5884d61

was a mistake.

force-pushed the mck/DSP-24600/main branch from 5884d61 to e81d639

returns this PR to what it was, plus a rebase off main

michaelsembwever avatar Aug 22 '25 16:08 michaelsembwever

:x: Build ds-cassandra-pr-gate/PR-1440 rejected by Butler


3 regressions found See build details here


Found 3 new test failures

| Test | Explanation | Runs | Upstream |
|---|---|---|---|
| o.a.c.db.compaction.unified.ShardedMultiWriterTest.testShardedCompactionWriter_threeShard[isReplicaAware=true] (compression) | NEW | :red_circle::large_blue_circle: | 0 / 19 |
| o.a.c.index.sai.cql.LuceneUpdateDeleteTest.testOverwriteWithTTL[ed] (compression) | NEW | :red_circle::large_blue_circle: | 0 / 19 |
| o.a.c.metrics.TrieMemtableMetricsTest.testContentionMetrics (compression) | NEW | :large_blue_circle::red_circle: | 0 / 19 |

No known test failures found

cassci-bot avatar Aug 27 '25 09:08 cassci-bot

failures not related.

michaelsembwever avatar Aug 27 '25 10:08 michaelsembwever