cassandra icon indicating copy to clipboard operation
cassandra copied to clipboard

CASSANDRA-19365: fix races in DecayingEstimatedHistogramReservoir

Open jakubzytka opened this issue 1 year ago • 2 comments

The patch introduces a DecayingBuckets class that encapsulates decayingBuckets and the now-immutable decayLandmark. DecayingBuckets are never rescaled in place, so we can now ensure no updates with wrong weight and no snapshots of a half-rescaled reservoir.

The patch retains the original update performance.

The patch contains a disabled (@Ignore) test demonstrating the update vs rescale race condition in practice.

jakubzytka avatar Feb 14 '24 12:02 jakubzytka

LGTM. The change doesn't incur performance penalty and should fix the correctness problem. Skipping a very tiny fraction of data is much smaller problem than inserting vastly incorrect data into histogram due to a race.

WRT performance. The results of LatencyTrackingBench have a lot of uncertainty, but if anything, they show a possible improvement in update throughput. It makes sense, as the number of volatile reads is reduced to 1 (using decayLandmark is no longer a volatile read)

jakubzytka avatar Feb 15 '24 13:02 jakubzytka

Can we also add a benchmark that emulates rescaling during updates? e.g. LatencyTrackingBench

Mmuzaf avatar Aug 13 '24 13:08 Mmuzaf