RxJava
RxJava copied to clipboard
`Flowable#groupBy` race leads to a back-pressure issue
Hi!
While debugging https://github.com/reactor/reactor-core/issues/2352 we wanted to check whether RxJava has the same issue since, given the history of both projects :)
Apparently, with 3.0.7
, the same construction in RxJava fails with a very similar issue (although the failure is different):
final int total = 100;
Long count = Flowable.range(0, total)
.groupBy(i -> (i / 2) * 2)
.flatMapMaybe(Flowable::firstElement, false, 1)
.observeOn(Schedulers.io())
.count()
.blockingGet();
assertThat(total - count).as("count").isZero();
Gives (not 100% reliably, consider running in "rerun until failure" mode):
io.reactivex.rxjava3.exceptions.MissingBackpressureException: Unable to emit a new group (#97) due to lack of requests. Please make sure the downstream can always accept a new group as well as each group is consumed in order for the whole operator to be able to proceed.
at io.reactivex.rxjava3.internal.operators.flowable.FlowableGroupBy$GroupBySubscriber.onNext(FlowableGroupBy.java:197)
A few interesting observations:
- Changing
observeOn
's buffer size to131
and higher makes it always pass -
130
would sometimes fail withUnable to emit a new group (#99) due to lack of requests
-
129
would sometimes fail withUnable to emit a new group (#98) due to lack of requests
-
128
would sometimes fail withUnable to emit a new group (#97) due to lack of requests
- etc etc
So it looks like there is a race between cancellation of the group and starting a new one, although I haven't investigated RxJava's issue much.
There is no ordering between group emission and cancellation and the JavaDocs warns about MBEs due to lack of requesting more groups. http://reactivex.io/RxJava/3.x/javadoc/io/reactivex/rxjava3/core/Flowable.html#groupBy-io.reactivex.rxjava3.functions.Function-
Backpressure: The consumer of the returned Flowable has to be ready to receive new GroupedFlowables or else this operator will signal MissingBackpressureException. To avoid this exception, make sure a combining operator (such as flatMap) has adequate amount of buffering/prefetch configured.
In the example, the observeOn
introduces just enough asynchrony via its requesting that the range
ends up creating more groups for which flatMapMaybe
is not ready.
It's more complicated than that.
The request/delivery in the inner groups still feed back to the main source because they have to trigger progress if they become ready. Consequently, since the next upstream value may be creating a new group or would go to an existing group, the main group output may hold back the inner groups. You could buffer these new groups but then they can't get consumed and can clog up the main queue.
Yeah. Noticed that. I guess I found another fix. Let me try it quickly for the Reactor to see if it works. Apologies for confusing with the first message
Fix -> https://github.com/reactor/reactor-core/pull/2450