R6 icon indicating copy to clipboard operation
R6 copied to clipboard

loss of class on S4 dispatch error

Open mmuurr opened this issue 3 years ago • 2 comments

Edited for simplicity with the root issue. Here I create a R6 class (generator) and an object instance. Then call structure(...) on that instance, which apparently modifies the original object in place:

MyClass <- R6::R6Class("MyClass",
  private = list(
    .field1 = "foo"
  ),
  active = list(
    field1 = function() .field1
  )
)
x <- MyClass$new()
x; class(x)

Output:

 # <MyClass>
 #   Public:
 #     clone: function (deep = FALSE)
 #     field1: active binding
 #   Private:
 #     .field1: foo
 # [1] "MyClass" "R6"

Now, structure it:

y <- structure(x, class = "foo")

And let's examine x:

x; class(x)

Output:

# <environment: 0x7ffa37f77030>
# attr(,"class")
# [1] "foo"

In spirit I understand why this is happening, as it appears structure modifies x in place (i.e. with reference semantics). structure's man page is (to me?) a bit ambiguous about this, with:

structure returns the given object with further attributes set.

And I can't find anything the R6 docs issuing a warning about this. It manifested with jsonlite's asJSON class-based delegation here. With a non-R6 object the original object remains unchanged after walking through that recursive conditional dispatching code with repeated failures and the incremental class 'stripping' (here):

 x <- structure(1:10, class = c("foo", "bar", "baz"))
jsonlite::toJSON(x)
x

Output:

# Error: No method asJSON S3 class: baz
# and x remains unchanged:
#  [1]  1  2  3  4  5  6  7  8  9 10
# attr(,"class")
# [1] "foo" "bar" "baz"

Is this worth a mention/warning in the R6 documentation? I do realize it's up to the class author to carefully manage to which functions she might be passing an R6 object, but given that asJSON had no immediate knowledge of the inner structure of this R6 object (i.e. asJSON wasn't calling any active or public R6 method), it was quite surprising to find the object had been mutated.

mmuurr avatar Sep 17 '20 06:09 mmuurr

It looks like the structure() function will alter reference objects in place. Here's an example with an environment:

e <- new.env()
class(e) <- c("a", "b")
structure(e, class = "c")
e
#> <environment: 0x7fd48b6a8560>
#> attr(,"class")
#> [1] "c"

The particular issue you're running into with jsonlite is because asJSON is making the assumption that the object is a typical non-reference object, but that assumption isn't always correct. That particular line to simulate S3 dispatch is not a standard way of doing things. But I wouldn't classify this as a bug in jsonlite, since all the input types that asJSON is designed to handle are normal non-reference types (I believe).

I don't think this particular issue with structure() merits special mention in the R6 docs, because it's just like any other operation inside a function that modifies an object in place. For most R objects, they have copy-on-write semantics so the operation won't modify the object that's visible to the outside world, but for reference objects, it can do that.

In the example below, suppose one person writes inc_val, assuming that that the input is a non-reference object, and someone else uses it the function but passes in a reference object. Then the result is something that neither of them would expect.

inc_val <- function(x) {
  x$val <- x$val + 1
  x
}

a <- new.env()
a$val <- 1
inc_val(a)

a$val
#> [1] 2

wch avatar Sep 17 '20 13:09 wch

I guess this is simply a risk with reference classes (and similars), but I wonder if there's a way to detect these unintended/unexpected consequences of some how some functions are implemented. Like a pre- and post-call dput comparison as part of a wrapper around a function call where you don't expect any change to your object ... something like:

result <- expect_no_change(x, f, ...)

... where f is called on x and if x changes as a result, a condition is signaled ... or something along those rough lines. This would probably be awfully expensive, but perhaps the wrapper can be enabled/controlled by some option(), like a debug flag.

More generally, I wonder how many such functions are out there, especially how many might conditionally modify an argument in place, which would be harder to reliably detect ahead of time. I suppose it's not too common the case that a users' R6 objects will be passed as args into someone else's functions, with the big exception being generics. In my case, I detected this because of my attempt to build a superclass generic-handler for when my R6 objects are nested in some other structure (e.g. a list) and I want to toJSON() that entire list.

Parent <- R6::R6Class("Parent",
  public = list(
    asJSON = function(...) { do_some_class_specific_JSON_encoding_here }
  )
)
setMethod(getGeneric("asJSON", package = "jsonlite"), "Parent", function(x, ...) x$asJSON(...))

Child <- R6::R6Class("Child",
  inherit = Parent
)

x <- Child$new()
jsonlite::toJSON(x)

That pattern above works (i.e. the Parent's asJSON method is ultimately dispatched), but x loses its "Child" class in the process (because of the structure() call in the generic asJSON("ANY") method). Adding an S4 method explicitly for the Child solves this problem (because then there's an immediate match on class(x)[1] in jsonlite's asJSON("ANY") handler), but it feels redundant/hacky and it also required a reading of jsonlite's internals to come up with.

I think any S3 generic's .default(...) method might suffer the same fate, though and so maybe the moral of the story is when using S3 or S4 generics with one's R6 classes we have to either (i) not rely on the S3 .default method or S4 "ANY" method signature or (ii) inspect those code paths awfully carefully. Though even with those caveats some debug-like mode where unexpected R6 instance mutations can be detected might be helpful, perhaps I'll try to come up with a lightweight implementation for that.

mmuurr avatar Sep 17 '20 22:09 mmuurr