Sending to full Channel<Unit>(CONFLATED) is ~10x slower than sending to full Channel<Unit>(1)
Channel<Unit>(CONFLATED) is a pattern frequently used for signalling in coroutine-based code. When the channel already contains an element (e.g. the receiver is delayed), calling send makes the channel drop the previously buffered element and replace it with the new one. This behavior is expected and matches the documentation of Channel.CONFLATED, but it is significantly slower than Channel<Unit>(1), which simply keeps the original element when the buffer is full. When the channel can only ever carry one and the same object (Unit === Unit), the two are observably equivalent, so the default CONFLATED behavior could be optimized to be just as efficient.
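For illustration only, here is a minimal sketch of that signalling pattern (the signal name, the consumer loop and the timings are made up for this example, not taken from the report):

import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

// Minimal sketch of the "conflated channel of Unit" signalling pattern:
// signals raised while the consumer is busy collapse into a single buffered Unit.
fun main() = runBlocking {
    val signal = Channel<Unit>(Channel.CONFLATED)

    val consumer = launch {
        for (unused in signal) {
            println("handling (possibly coalesced) signal")
            delay(10) // simulate a slow receiver
        }
    }

    repeat(3) { signal.trySend(Unit) } // hot path: only one signal ends up buffered
    delay(100)
    consumer.cancel()

    // Since the payload is always the same Unit, Channel<Unit>(1) + trySend is
    // observably equivalent: a failed trySend just leaves the buffered Unit in place.
}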
Running the benchmarks on Pixel 5:
// make sure we measure sending to an already full channel
channel.trySend(Unit)
benchmarkRule.measureRepeated {
    channel.trySend(Unit)
}
Produces the following results:
// Executed on Pixel 5 (Android 13)
652 ns ChannelBenchmark.conflatedChannelSend
70.8 ns ChannelBenchmark.normalChannelSend
559 ns ChannelBenchmark.conflatedChannelSend_empty
576 ns ChannelBenchmark.normalChannelSend_empty
From the results above, sending to an empty channel costs roughly the same for Channel(CONFLATED) and Channel(1), but sending to an already full channel is ~10x slower for the conflated one.
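The snippet above only exercises the full-channel case; the *_empty variants were presumably measured against a channel that is drained between iterations, roughly along these lines (the tryReceive drain inside runWithTimingDisabled is my assumption, not the reporter's actual code):

benchmarkRule.measureRepeated {
    channel.trySend(Unit)
    // drain outside the timed region so the next iteration sees an empty channel
    runWithTimingDisabled { channel.tryReceive() }
}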
FTR, the benchmark:
import java.util.concurrent.TimeUnit
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.runBlocking
import org.openjdk.jmh.annotations.*

@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 5, time = 1)
@Fork(value = 1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
open class ChannelBenchmark {
    /**
     * Old channels:
     * Benchmark                 Mode  Cnt   Score   Error  Units
     * ChannelBenchmark.send     avgt    5   4.469 ± 0.085  us/op
     * ChannelBenchmark.trySend  avgt    5   4.785 ± 0.050  us/op
     *
     * New channels:
     * Benchmark                 Mode  Cnt   Score   Error  Units
     * ChannelBenchmark.send     avgt    5  17.290 ± 0.116  us/op
     * ChannelBenchmark.trySend  avgt    5   2.752 ± 0.036  us/op
     */
    private val conflated = Channel<Unit>(Channel.CONFLATED)
    private val buffered = Channel<Unit>(1)
    // Neither channel is ever drained, so after the first iteration every
    // send/trySend below targets an already full channel.

    @Benchmark
    fun trySend() = runBlocking {
        repeat(1000) {
            buffered.trySend(Unit)
        }
    }

    @Benchmark
    fun send() = runBlocking {
        repeat(1000) {
            conflated.send(Unit)
        }
    }
}
The root cause is clear (in fact, it is properly highlighted in the original report), but the fix is not. To address it, we would have to repeat what we did in the old channels and provide a dedicated "conflated with buffer = 1" implementation of the channel, which is an additional maintenance burden and adds to the library size.
Could you please elaborate on how frequently the "conflated channel of Unit to notify the other party" use case comes up? How performance-sensitive is it?
This happens quite often; e.g. we were using this pattern to delay some work until the end of the frame on each state write (see the workaround). I'd argue it is quite performance-sensitive, since the send can land on a hot path quite frequently.
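For context, a hedged sketch of what such an end-of-frame coalescing pattern can look like (FrameFlusher, awaitFrame and flush are illustrative names, not taken from the actual workaround; Channel<Unit>(1) + trySend is used as the observably equivalent, currently faster alternative to CONFLATED):

import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch

// Coalesce many state writes into a single flush per frame.
class FrameFlusher(
    scope: CoroutineScope,
    private val awaitFrame: suspend () -> Unit, // end-of-frame suspension provided by the platform
    private val flush: () -> Unit,              // applies the accumulated state changes
) {
    private val invalidations = Channel<Unit>(1)

    init {
        scope.launch {
            for (unused in invalidations) {
                awaitFrame() // all writes made before the frame boundary coalesce into this one pass
                flush()
            }
        }
    }

    // Hot path: called on every state write; effectively a no-op when a signal is already pending.
    fun invalidate() {
        invalidations.trySend(Unit)
    }
}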