spring-data-redis HashOperations MultiGet Probably shouldn't return null elements in the collection

Hi,

There was an issue raised a while ago which supported null values in the collection that is returned from the ReactiveHashOperations multiGet method. Whilst this fixes the original NPE, it creates other NPEs for clients using this code. For example, if the client code translates the returned collection to a Flux using Flux.fromIterable this will throw an NPE. When this happens the NPE that is throw is quite tricky to isolate as the stack trace that comes out often has no reference to the client code for example:

j.l.NullPointerException: The iterator returned a null value at java.util.Objects.requireNonNull(Unknown Source) at r.c.p.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:254) at r.c.p.FluxIterable$IterableSubscription.request(FluxIterable.java:225) at r.c.p.MonoFlatMapMany$FlatMapManyMain.onSubscribeInner(MonoFlatMapMany.java:143) at r.c.p.MonoFlatMapMany$FlatMapManyInner.onSubscribe(MonoFlatMapMany.java:237) at r.c.p.FluxIterable.subscribe(FluxIterable.java:161) at r.c.p.FluxIterable.subscribe(FluxIterable.java:86) at r.c.publisher.Flux.subscribe(Flux.java:8361) at r.c.p.MonoFlatMapMany$FlatMapManyMain.onNext(MonoFlatMapMany.java:188) at r.c.p.FluxMap$MapSubscriber.onNext(FluxMap.java:114) at r.c.p.MonoNext$NextSubscriber.onNext(MonoNext.java:76) at r.c.p.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:73) at r.c.p.MonoFlatMapMany$FlatMapManyInner.onNext(MonoFlatMapMany.java:242) at r.c.p.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:274) at r.c.p.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:851) at r.c.p.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:121) at r.c.p.Operators$MonoSubscriber.complete(Operators.java:1812) at r.c.p.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:121) at r.c.p.MonoFlatMapMany$FlatMapManyInner.onComplete(MonoFlatMapMany.java:252) at i.l.c.RedisPublisher$OnComplete.run(RedisPublisher.java:1052) at i.n.u.c.DefaultEventExecutor.run(DefaultEventExecutor.java:66) at i.n.u.c.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at i.n.u.i.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at i.n.u.c.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Unknown Source)

There is a test which covers the returning of null values in the collection, which was created from the referenced issue. So I'm not sure if this behaviour is intentional.

	@ParameterizedRedisTest // DATAREDIS-824
	void multiGetAbsentKeys() {

		assumeThat(hashKeyFactory instanceof StringObjectFactory && hashValueFactory instanceof StringObjectFactory)
				.isTrue();

		hashOperations.multiGet(keyFactory.instance(), Arrays.asList(hashKeyFactory.instance(), hashKeyFactory.instance()))
				.as(StepVerifier::create) //
				.consumeNextWith(actual -> {
					assertThat(actual).hasSize(2).containsSequence(null, null);
				}) //
				.verifyComplete();
	}

I can't think of any good reason why we would want to return null in the collection (very happy to hear arguments in favour though). This behaviour is, I think, inconsistent with get() which would just return an empty publisher if a value wasn't found.

If there is a reason for keeping null, then I think it might be worthwhile adding a method like multiGetNotNull which could return either a Flux<T> or Mono<Collection<T> so clients can avoid this behaviour.

Happy to put in a PR if the general consensus is to do this.

Thanks

Shane

Apr 22 '22 18:04 ShaneLee

Consider the following Redis key arrangement:

Hash: foo
Fields:

one: 1
two: 2
four: 4

If you issue a command HMGET foo zero two three Redis responds with:

(nil) 2 (nil)

If we would filter out absent elements, the resulting collection would look like: [2] instead of [null, 2, null]. There would be no way to know which fields were present or absent.

Does that make sense to you?

In contrast to HGET, the HGET command returns a single response, so the correlation between the requested hash field and the response is much simpler.

Aug 08 '22 07:08 mp911de

Yes that makes sense, however it feels clunky for two reasons. As mentioned calling Flux.fromIterable from the result of multiGet leads to NPEs which are very difficult to diagnose. The second reason is that it's not clear that clients can expect the behaviour that you describe, i.e I pass it an ordered collection and it returns an ordered collection which I can then map by index to the original collection. If this was made clearer, it still seems suboptimal from a client's perspective.

I think one of the solutions is to add mulitGetNonNull which would return the [2] in your example.

Another possible solution is to make multiGet return a Map as opposed to a list as this would fit better with the example you describe.

Aug 08 '22 08:08 ShaneLee

As mentioned calling Flux.fromIterable

Indeed, that isn't going to work. We could improve our documentation to mention that you should expect absent object markers in the form of null values. The signature already indicates a return type that doesn't follow the typical reactive patterns as most multi-responses return a Flux, not Mono.

Aug 08 '22 08:08 mp911de