
Investigate reactive lazy DBRef resolution [DATAMONGO-1583]

Open spring-projects-issues opened this issue 8 years ago • 2 comments

Mark Paluch opened DATAMONGO-1583 and commented

We should investigate a way to resolve Mongo database references (@DBRef) lazily while retaining reactive semantics.

Currently, lazy @DBRef resolution incurs blocking data access because it retrieves the underlying reference(s) when methods of the reference are accessed.

Applying the same resolution in a reactive context easily leads to unwanted blocking access. A reactive reference type should reflect

  • reactivity
  • multiplicity
  • possibly access to the underlying reference ids

Reactive @DBRefs should allow multiple reads of the referenced objects.
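
A minimal sketch of what such a reference type could look like. Everything here is hypothetical: LazyRefs and LazyRefsDemo are illustrative names that are not part of Spring Data MongoDB, and CompletableFuture from the JDK stands in for a reactive type such as Flux so that the example stays self-contained.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical lazy reference type; not part of Spring Data MongoDB. */
final class LazyRefs<ID, T> {

    private final List<ID> ids;
    private final Function<List<ID>, CompletableFuture<List<T>>> loader;

    LazyRefs(List<ID> ids, Function<List<ID>, CompletableFuture<List<T>>> loader) {
        this.ids = List.copyOf(ids);
        this.loader = loader;
    }

    /** Reference ids are available without any data access. */
    List<ID> ids() {
        return ids;
    }

    /** Covers zero, one, or many targets; every call starts a new asynchronous load. */
    CompletableFuture<List<T>> load() {
        return loader.apply(ids);
    }
}

final class LazyRefsDemo {

    /** Counts loader invocations so that repeated reads can be observed. */
    static final AtomicInteger LOADS = new AtomicInteger();

    /** Builds a reference to two documents backed by a fake asynchronous loader. */
    static LazyRefs<Integer, String> sample() {
        return new LazyRefs<>(List.of(1, 2), ids -> CompletableFuture.supplyAsync(() -> {
            LOADS.incrementAndGet();
            return ids.stream().map(id -> "doc-" + id).toList();
        }));
    }
}
```

Reading ids() never touches the loader, while each load() call triggers a fresh lookup, which mirrors the requirement that referenced objects can be read multiple times.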


Affects: 2.0 M1 (Kay)

spring-projects-issues • Jan 09 '17 08:01

The current implementation of the MappingMongoConverter does not allow using the reactive infrastructure to resolve DBRefs and set them via the accessors. A potential solution would be to identify each $dbref when loading the document and transform it into a Publisher that loads the referenced document on subscribe, replacing the $dbref with the retrieved (resolved) value before handing the document over to the converter.

christophstrobl • Jan 26 '23 13:01

It is actually possible to identify and load DBRefs in a reactive flow by inspecting the Document and replacing values within it once a combined stage completes. This leads to a recursive flow similar to what Flux.expand can do.

The code below outlines how this could be solved.

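// Types used: java.util.Map.Entry, org.bson.Document, com.mongodb.DBRef, reactor.core.publisher.Mono.
// ReactiveDbRefResolver is not defined in this outline; its fetch(DBRef) is assumed to return Mono<Document>.
Mono<Document> prepareDbRefResolution(Mono<Document> root, ReactiveDbRefResolver dbRefResolver) {

    return root.flatMap(source -> {

        // check each element in the document
        for (Entry<String, Object> entry : source.entrySet()) {

            // fetch the referenced document, resolve references within it, then replace the DBRef
            if (entry.getValue() instanceof DBRef dbRef) {
                return prepareDbRefResolution(dbRefResolver.fetch(dbRef).defaultIfEmpty(new Document())
                        .flatMap(it -> prepareDbRefResolution(Mono.just(it), dbRefResolver)).map(resolved -> {
                            source.put(entry.getKey(), resolved.isEmpty() ? null : resolved);
                            return source;
                        }), dbRefResolver);
            }

            // traverse nested documents
            if (entry.getValue() instanceof Document nested) {
                return prepareDbRefResolution(Mono.just(nested), dbRefResolver).map(it -> {
                    source.put(entry.getKey(), it);
                    return source;
                });
            }

            // traverse elements in Map & List (not shown in this outline)
        }

        // nothing left to resolve
        return Mono.just(source);
    });
}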

Adding a cache to the above could reduce server round trips for already known/loaded reference values. Additionally, references in Lists and Maps should be collected and batch loaded (restoring the original order) to minimize server interaction.
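
The caching and batching idea can be sketched as follows. This is only an illustration under stated assumptions: BatchRefLoader is a hypothetical name, plain Maps stand in for org.bson.Document, and the batchFetch function is a synchronous placeholder for a single query that loads many ids at once (in a reactive flow it would return a Publisher instead).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper; a synchronous stand-in for a reactive batch lookup. */
final class BatchRefLoader {

    private final Map<Object, Map<String, Object>> cache = new HashMap<>();
    private final Function<Set<Object>, Map<Object, Map<String, Object>>> batchFetch;
    private int roundTrips;

    BatchRefLoader(Function<Set<Object>, Map<Object, Map<String, Object>>> batchFetch) {
        this.batchFetch = batchFetch;
    }

    /** Resolves ids in their original order; unknown ids resolve to null. */
    List<Map<String, Object>> resolve(List<?> ids) {

        // de-duplicate and skip everything the cache already knows
        Set<Object> missing = new LinkedHashSet<>(ids);
        missing.removeAll(cache.keySet());

        if (!missing.isEmpty()) {
            roundTrips++; // one simulated server round trip per batch
            cache.putAll(batchFetch.apply(missing));
        }

        // the batch result is keyed by id, so the requested order is restored here
        List<Map<String, Object>> ordered = new ArrayList<>(ids.size());
        for (Object id : ids) {
            ordered.add(cache.get(id));
        }
        return ordered;
    }

    /** Number of batch fetches issued so far. */
    int roundTrips() {
        return roundTrips;
    }
}
```

Resolving the ids 3, 1, 2, 1 issues one batch for the three distinct ids and returns four entries in the requested order, with both occurrences of id 1 served by the same cached instance.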

Cyclic references, however, turn out to be quite an issue when trying to fully resolve documents eagerly at this level. First, the cache becomes mandatory: it allows returning early for already known documents by serving the very same instance multiple times.

However, doing so creates a cycle within the document itself.


{
    "_id" : 1,
    "value" : { "$ref" : { "$id" : 2 }}
}

{
    "_id" : 2,
    "value" : { "$ref" : { "$id" : 1 }}
}

+-> {              
|       _id : 1     
|       value: {     
|          _id : 2,  
|          value : --+
|       }            |
|    }               |
|                    |
+--------------------+

Those cycles lead to infinite loops when calling certain methods (like toString()) on the org.bson.Document itself, e.g. while computing a trivial log statement. Additionally, the mapping layer would not know where to stop within the loop and would have to check the path of each traversed element to avoid getting stuck in one of the loops.

Potential solutions could be:

  • track the path and use proxies or a dedicated subclass of org.bson.Document to indicate back references.
  • shift the problem into one of the DocumentAccessors, which have more insight into the actual model.
  • throw an exception when detecting an attempt to load a cycle.
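
The path-tracking and exception options could look roughly like this. The sketch rests on several assumptions: CycleAwareResolver, Ref, and BackRef are hypothetical names, an in-memory Map stands in for the database, plain Maps stand in for org.bson.Document, and the traversal is synchronous rather than reactive.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical resolver over an in-memory store; a synchronous stand-in for the reactive flow. */
final class CycleAwareResolver {

    /** Stand-in for a DBRef that only carries the target id. */
    record Ref(Object id) {}

    /** Marker that replaces a reference pointing back to a document on the current path. */
    record BackRef(Object id) {}

    private final Map<Object, Map<String, Object>> store;
    private final boolean failOnCycle;

    CycleAwareResolver(Map<Object, Map<String, Object>> store, boolean failOnCycle) {
        this.store = store;
        this.failOnCycle = failOnCycle;
    }

    Map<String, Object> resolve(Object id) {
        return resolve(id, new ArrayDeque<>());
    }

    private Map<String, Object> resolve(Object id, Deque<Object> path) {
        Map<String, Object> source = store.get(id);
        if (source == null) {
            return null; // dangling reference
        }
        path.push(id); // remember which documents are currently being resolved
        Map<String, Object> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Ref ref) {
                value = path.contains(ref.id()) ? backReference(ref, path) : resolve(ref.id(), path);
            }
            resolved.put(entry.getKey(), value);
        }
        path.pop();
        return resolved;
    }

    private Object backReference(Ref ref, Deque<Object> path) {
        if (failOnCycle) {
            throw new IllegalStateException("Cyclic reference to " + ref.id() + " via " + path);
        }
        return new BackRef(ref.id());
    }
}
```

With failOnCycle set to false, the two documents from the example above resolve to a finite tree in which the inner value is a BackRef to id 1, so toString() terminates. With failOnCycle set to true, the same input raises an IllegalStateException.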

christophstrobl • Oct 23 '23 08:10