spring-data-mongodb
Investigate reactive lazy DBRef resolution [DATAMONGO-1583]
Mark Paluch opened DATAMONGO-1583 and commented
We should investigate a way to resolve MongoDB database references (@DBRef) lazily while retaining reactive semantics.
Currently, lazy @DBRef resolution incurs blocking data access because the underlying reference(s) are fetched as soon as a method of the reference is invoked.
Applying the same resolution mechanism in a reactive context easily leads to unwanted, blocking access. A reactive reference type should reflect
- reactivity
- multiplicity
- possibly access to the underlying reference ids
Reactive @DBRefs should allow multiple reads of the referenced objects.
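As a rough illustration of the properties listed above, a reference type could hold only the ids and defer loading until the caller asks for the targets. The sketch below is hypothetical: LazyReference and its loader function are invented names, not part of Spring Data MongoDB, and CompletableFuture stands in for a Reactor Mono/Flux only to keep the example dependency-free.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical type for illustration, not part of Spring Data MongoDB.
final class LazyReference<ID, T> {

    // Multiplicity: a reference may point to one or many targets.
    private final List<ID> ids;

    // Assumed non-blocking loader that turns ids into referenced objects.
    private final Function<List<ID>, CompletableFuture<List<T>>> loader;

    LazyReference(List<ID> ids, Function<List<ID>, CompletableFuture<List<T>>> loader) {
        this.ids = List.copyOf(ids);
        this.loader = loader;
    }

    // Access to the underlying reference ids without touching the database.
    List<ID> getIds() {
        return ids;
    }

    // Reactivity: nothing is loaded until this method is called, and the caller is not blocked.
    // Each call triggers a fresh load, so the reference can be read multiple times.
    CompletableFuture<List<T>> resolve() {
        return loader.apply(ids);
    }
}
```

With Reactor on the classpath, resolve() would more naturally return a Flux<T> that performs the lookup on subscribe.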
Affects: 2.0 M1 (Kay)
Issue Links:
- DATAMONGO-1584 Support DBRef in Spring Data Mongo Reactive
The current implementation of the MappingMongoConverter does not allow using the reactive infrastructure to resolve DBRefs and set them via the accessors.
A potential solution would be to identify each $dbref already when loading the document and transform it into a Publisher that loads the referenced document on subscribe, replacing the $dbref with the retrieved (resolved) value before handing the document over to the converter.
It is actually possible to identify and load DBRefs in a reactive flow by inspecting the Document and replacing values within it on completion of a combined stage, which leads to a recursive, reflective flow similar to what Flux.expand can do.
The code below outlines how this could be solved.
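import java.util.Map.Entry;

import org.bson.Document;
import com.mongodb.DBRef;
import reactor.core.publisher.Mono;

// ReactiveDbRefResolver is the proposed abstraction; fetch(DBRef) is assumed to return a Mono<Document>.
Mono<Document> prepareDbRefResolution(Mono<Document> root, ReactiveDbRefResolver dbRefResolver) {
    return root.flatMap(source -> {
        // check each element in the document
        for (Entry<String, Object> entry : source.entrySet()) {
            Object value = entry.getValue();
            // resolve the reference (recursively), replace it, then re-scan the source document
            if (value instanceof DBRef dbRef) {
                return prepareDbRefResolution(dbRefResolver.fetch(dbRef).defaultIfEmpty(new Document())
                        .flatMap(it -> prepareDbRefResolution(Mono.just(it), dbRefResolver)).map(resolved -> {
                            source.put(entry.getKey(), resolved.isEmpty() ? null : resolved);
                            return source;
                        }), dbRefResolver);
            }
            // traverse nested documents
            if (value instanceof Document nested) {
                return prepareDbRefResolution(Mono.just(nested), dbRefResolver).map(it -> {
                    source.put(entry.getKey(), it);
                    return source;
                });
            }
            // traverse elements in Map & List (not shown in this outline)
        }
        // nothing left to resolve in this document
        return Mono.just(source);
    });
}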
Adding a cache to the above could reduce server round trips for reference values that are already known or loaded.
Additionally, references in Lists and Maps should be collected and batch loaded (restoring the original order afterwards) to minimize server interaction.
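The caching and batching idea could look roughly like the following sketch. BatchingReferenceLoader and its bulkFetch function are hypothetical names; bulkFetch is assumed to issue a single query for all requested ids and to return only the documents it found. CompletableFuture again stands in for a Reactor type to keep the example dependency-free.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper for illustration, not part of Spring Data MongoDB.
final class BatchingReferenceLoader<ID, T> {

    private final Map<ID, T> cache = new ConcurrentHashMap<>();

    // Assumed to perform one round trip and return id -> document for the ids that exist.
    private final Function<Set<ID>, CompletableFuture<Map<ID, T>>> bulkFetch;

    BatchingReferenceLoader(Function<Set<ID>, CompletableFuture<Map<ID, T>>> bulkFetch) {
        this.bulkFetch = bulkFetch;
    }

    CompletableFuture<List<T>> loadAll(List<ID> ids) {
        // Collect the ids that are not cached yet, without duplicates.
        Set<ID> missing = new LinkedHashSet<>();
        for (ID id : ids) {
            if (!cache.containsKey(id)) {
                missing.add(id);
            }
        }
        // Skip the server round trip entirely if everything is already cached.
        CompletableFuture<Map<ID, T>> fetched = missing.isEmpty()
                ? CompletableFuture.completedFuture(Map.<ID, T>of())
                : bulkFetch.apply(missing);

        return fetched.thenApply(loaded -> {
            cache.putAll(loaded);
            // Restore the original order of the reference list; unresolved ids yield null.
            List<T> ordered = new ArrayList<>(ids.size());
            for (ID id : ids) {
                ordered.add(cache.get(id));
            }
            return ordered;
        });
    }
}
```

The cache here is unbounded and never invalidated, which is only reasonable if its lifetime is limited to a single load operation.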
Cyclic references, however, turn out to be quite an issue when trying to fully resolve documents eagerly at this level. First, the cache becomes mandatory: it allows returning early for already known references and serves the very same instance multiple times.
However, doing so creates a cycle within the document itself.
{
"_id" : 1,
"value" : { "$ref" : { "$id" : 2 }}
}
{
"_id" : 2,
"value" : { "$ref" : { "$id" : 1 }}
}
+-> {
|     _id : 1
|     value: {
|       _id : 2,
|       value : --+
|     }           |
|   }             |
|                 |
+-----------------+
Those cycles lead to infinite recursion when calling certain methods (like toString()) on the org.bson.Document itself, e.g. while computing a trivial log statement.
Additionally, the mapping layer would not know where to stop within the loop and would have to check the path of each traversed element to avoid getting stuck in one of the loops.
Potential solutions could be:
- tracking the path and using Proxies or a dedicated subclass of org.bson.Document to indicate back references.
- shift the problem into one of the DocumentAccessors having more insight into the actual model.
- throw an exception when detecting load attempts of cycles.
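To make the path tracking and exception options more concrete, here is a minimal sketch that fails as soon as a reference points back to a document on the current resolution path. All names (CycleAwareResolver, Ref, fetch) are hypothetical. To stay short and dependency-free, it uses plain Maps and a synchronous lookup instead of org.bson.Document and a reactive fetch; in a reactive flow the path would have to be passed along the chain in a similar way.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration, not part of Spring Data MongoDB.
final class CycleAwareResolver {

    // Stand-in for a DBRef value: it only carries the id of the referenced document.
    static final class Ref {
        final Object id;

        Ref(Object id) {
            this.id = id;
        }
    }

    // Assumed lookup from id to the raw (unresolved) document.
    private final Function<Object, Map<String, Object>> fetch;

    CycleAwareResolver(Function<Object, Map<String, Object>> fetch) {
        this.fetch = fetch;
    }

    Map<String, Object> resolve(Object rootId) {
        return resolve(rootId, new ArrayDeque<>());
    }

    private Map<String, Object> resolve(Object id, Deque<Object> path) {
        if (path.contains(id)) {
            // The id is already on the current resolution path, so this is a back reference.
            throw new IllegalStateException("Cyclic reference detected: " + path + " -> " + id);
        }
        path.addLast(id);
        Map<String, Object> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fetch.apply(id).entrySet()) {
            Object value = entry.getValue();
            // Replace references with their resolved target, keep all other values as they are.
            resolved.put(entry.getKey(), value instanceof Ref ? resolve(((Ref) value).id, path) : value);
        }
        path.removeLast();
        return resolved;
    }
}
```

Instead of throwing, the same check could insert a proxy or marker value for the back reference, which corresponds to the first option in the list.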