semantic-metrics icon indicating copy to clipboard operation
semantic-metrics copied to clipboard

Race condition in LockFreeExponentiallyDecayingReservoir

Open 98amp2 opened this issue 5 years ago • 5 comments

While using the version 1.0.7 in a backend, we noticed this error crop up

java.util.NoSuchElementException: null
at java.util.concurrent.ConcurrentSkipListMap.firstKey (ConcurrentSkipListMap.java:1858)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.update (LockFreeExponentiallyDecayingReservoir.java:207)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.update (LockFreeExponentiallyDecayingReservoir.java:198)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.access$200 (LockFreeExponentiallyDecayingReservoir.java:160)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir.update (LockFreeExponentiallyDecayingReservoir.java:100)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir.update (LockFreeExponentiallyDecayingReservoir.java:88)
at com.spotify.metrics.core.ReservoirWithTtl.update (ReservoirWithTtl.java:123)
at com.codahale.metrics.Histogram.update (Histogram.java:41)

It looks like this can occur if the States ConcurrentSkipListMap is reset via a backfill in one thread at the same time an update on that state happens in another thread. Granted it's a small window but it looks to be possible with the current implementation.

98amp2 avatar Sep 10 '20 14:09 98amp2

hi @98amp2 thanks for reporting. We will looking into this issue.

malish8632 avatar Sep 10 '20 15:09 malish8632

Sorry for introducing this regression!

I think the best fix would be to switch to this implementation instead: https://github.com/dropwizard/metrics/pull/1656 We can either wait for it to be merged and released or we can inline it directly and replace this broken implementation.

@carterkozak would you be ok with us inlining that in this repo until it's released in dropwizard/metrics?

spkrka avatar Oct 12 '20 09:10 spkrka

@carterkozak would you be ok with us inlining that in this repo until it's released in dropwizard/metrics?

Of course :-) I’ve done the same here: https://github.com/palantir/tritium/blob/develop/tritium-registry/src/main/java/com/palantir/tritium/metrics/registry/LockFreeExponentiallyDecayingReservoir.java You could add a dependency on the tritium library if you prefer, however it may pull in additional transitive dependencies.

carterkozak avatar Oct 12 '20 12:10 carterkozak

@malish8632 I've made this PR now to address it: https://github.com/spotify/semantic-metrics/pull/87

spkrka avatar Oct 12 '20 13:10 spkrka

I think this can be closed now since #87 was merged?

mattnworb avatar Feb 11 '21 22:02 mattnworb