R6 icon indicating copy to clipboard operation
R6 copied to clipboard

Cloning of classes with objects that reference each other

Open mb706 opened this issue 5 years ago • 4 comments

I already commented on this in #110, but I'd like to get an "official" opinion on this (and I could make a PR if this is something desirable). Currently it is very difficult to clone(deep = TRUE) an object that has member objects that reference each other.

An example would be the following: r2 contains object y, which references object x. Changing x through the reference in y is possible in the original object obj1, but not the clone obj2:

rr = R6::R6Class("test",
  public = list(
    ref = NULL,
    val = 0,
    initialize = function(ref) {
      self$ref <- ref
    }
  )
)

r2 = R6::R6Class("test",
  public = list(
    x = NULL,
    y = NULL,
    initialize = function() {
      self$x <- rr$new(NULL)
      self$y <- rr$new(self$x)
    }
  )
)

obj1 <- r2$new()
obj2 <- obj1$clone(deep = TRUE)

obj1$y$ref$val <- 100
print(obj1$x$val)  # prints '100', as it should

obj2$y$ref$val <- 200
print(obj2$x$val)  # prints '0'

A workaround was posted by @dfalbel, but there are two drawbacks to this: (1) overwriting the clone method completely means having to construct the objects from scratch, and (2) the method is less than well-supported and actually seems to be working around the restriction built into R6::R6Class which seems to try to prevent custom clone methods.

It would be useful to have the option to let some code execute after most of the clone() work was done, to do some post-processing on the cloned object. In the example above it would then be possible to repair the reference in the y$ref object, for example. An API that I suggest is to have give the user the option to define a private$post_clone(old_self) method, that gets called at the end of clone(). Changes required would probably be inserting the lines

if (deep && has_private && is.function(new[[1]]$private$post_clone)) {
  new[[1]]$private$post_clone(old_1_binding)
}
new_1_binding

at the very end of he clone function.

I would be interested in hearing your opinion on this.

mb706 avatar Feb 07 '19 17:02 mb706

I like the idea of a post_clone() method. Another idea, as was suggested in #110, is to rename the current clone method to .clone, and then define clone = function() self$.clone() by default and allow users to override it.

One advantage that post_clone() would have over .clone() is that, because it is run from the new object, it would allow access to the new object's private members; the .clone method, since it is run from the old object, would only allow access to the new object's public members. But I can imagine that on the original object, you'd want to do some stuff pre and/or post cloning. So it might make sense to do both.

This would also allow users to make deep_clone=TRUE the default for a class.

I think that when it comes to replacing the built-in clone method entirely with a custom version (the workaround suggested in #110), it is basically impossible to expect a class authors to get it right in the general case, so it would be good to allow them to call R6's built-in cloning code and then massage the result afterward. (Getting cloning right for simple classes is not too hard, but when inheritance and active bindings come into play, it becomes very, very hard. The are currently almost 1000 lines of code for the cloning tests.)

wch avatar Feb 15 '19 15:02 wch

I am mostly using R6 to wrap C++ classes. In my use case, I have a self$pointer and all methods are sending this pointer to cpp make the calcs. When cloning this class, I just need to create a new object and a new pointer in the cpp side and then create a Class$new(cpp_ptr) object. In this case it seems safe to override the clone method behavior.

Also, not modifying the clone behavior here is just wrong. Using self$clone would not clone the class at all.

I like the post_clone idea and I think it covers most use cases, but IMHO the .clone method would be a better approach.

dfalbel avatar Feb 15 '19 17:02 dfalbel

I also really like the idea of a either a post_clone or .clone method. I have an issue pegeler/eddington2#4 that is almost identical to #178.

Currently I have set cloneable = FALSE and plan to try to create a custom clone method. However, as @wch points out, this will likely end up being pretty difficult given the way I've set up my class so far. Luckily, with some modifications to my approach, I can take a few shortcuts to make things easier. :smiley:

Has any more thought been given to implementing one of these methods in a future release of this package?

pegeler avatar Mar 29 '20 03:03 pegeler

A lot of the examples of cloning challenges with nested R6 objects have used the relatively simple o1 -> o2 -> o1 reference cycle (e.g. the copy solutions provided in #110). I wonder if there's been conversation/planning (or appetite at all) to try to tackle the more general problem of cycles in the reference graph while deep-cloning (i.e. o1 -> o2 -> o3 -> ... -> o1). In other reference-based languages, most often these cloning operations require some state-management during the recursive cloning process (i.e. state shared through the recursive call stack).

Have such methods been discussed (or are in the works) for R6 classes? I'm considering implementing my own (very simplified & specific) version of this for a few of my use cases, and so I'm wondering if I should (i) perhaps just wait, (ii) join/listen-in to ongoing conversations about similar efforts, or (iii) consider generalizing my solutions a bit more to share with others.

mmuurr avatar Aug 30 '21 16:08 mmuurr