S4Vectors icon indicating copy to clipboard operation
S4Vectors copied to clipboard

bug in `recycleSingleBracketReplacementValue`?

Open Liubuntu opened this issue 6 years ago • 7 comments

Hi Michael,

I was testing on the DataFrame after refactoring of S4Vectors with new internal functions, and I got this warning / error for replacement value:

packageVersion("S4Vectors")
## [1] ‘0.21.24’
aa <- DataFrame(letters, numbers = 26:1)
aa[1,] <- c("aa", 1)
## Warning message:
## In recycleSingleBracketReplacementValue(value, x, i) :
##   number of values supplied is not a sub-multiple of the number of values to be replaced

aa
## DataFrame with 26 rows and 2 columns
##         letters     numbers
##     <character> <character>
## 1            aa          aa
## 2             b          25
## 3             c          24
## 4             d          23
## 5             e          22
## ...         ...         ...
## 22            v           5
## 23            w           4
## 24            x           3
## 25            y           2
## 26            z           1

row replacement with single value works!

aa[2, ] <- "bb"
aa
## DataFrame with 26 rows and 2 columns
##         letters     numbers
##     <character> <character>
## 1            aa          aa
## 2            bb          bb
## 3             c          24
## 4             d          23
## 5             e          22
## ...         ...         ...
## 22            v           5
## 23            w           4
## 24            x           3
## 25            y           2
## 26            z           1

Liubuntu avatar Apr 26 '19 16:04 Liubuntu

I'm inclined to declare that this is unsupported. It's true that data.frame supports this by wrapping the vector up into a matrix and essentially converting that to a data.frame. However, that is complex behavior and generally has undesirable properties (in your case, the "numbers" column becomes character). It would be much better to create a list with list("aa", 1L), because it is more obvious that those values correspond to columns.

lawremi avatar Apr 26 '19 17:04 lawremi

Hi Michael, Given that DataFrame (like data.frame) already supports a matrix on the right of the subassignment, it would seem natural that it also supports an atomic vector. H.

hpages avatar Apr 26 '19 19:04 hpages

Seems like a different case to me. The way I think about it (and how it's implemented) is that the RHS is coerced to a DataFrame. A matrix has an obvious coercion to a DataFrame, but the behavior for an atomic vector would be different.

lawremi avatar Apr 26 '19 20:04 lawremi

Actually, it's also broken when the right value is a matrix:

> ab <- DataFrame(a=11:14, b=21:24)
> ab[2:4, ] <- matrix(letters[1:6], nrow=3)
> ab
DataFrame with 4 rows and 2 columns
          a         b
  <integer> <integer>
1        11        21
2         1         1
3         2         2
4         3         3

hpages avatar Apr 26 '19 20:04 hpages

That's expected given the simple rule of coercing to a DataFrame. It's only surprising because what happens when subassigning a factor into an integer vector is surprising.

To stick to the principle of mimicking the behavior of data.frame here, I could set stringsAsFactors=FALSE during coercion to the DataFrame. The base code gets away with breaking everything down to a list but that approach does not fit well with the strictness of S4Vectors.

lawremi avatar Apr 26 '19 20:04 lawremi

It seems to me that coercion from character matrix to DataFrame should be fixed in general, not just in the context of coercing the RHS to DataFrame. Other coercions from character-based object to DataFrame have broke free long ago from the non-sense factor thing.

Once [<- with an RHS matrix is corrected, it should be pretty straight forward to support an atomic vector RHS (would just need to be recycled to the appropriate length and turned into a matrix with the appropriate geometry).

Hope that makes sense. Thanks!

hpages avatar Apr 26 '19 21:04 hpages

I guess we can go that way, arguing for consistency with the other coercions.

There's nothing wrong with matrix RHS per se; there's just the general issues with character to factor coercion. The term "straight-forward" is a relative. This stuff is pretty tricky; every change seems to have a domino effect. Check the previous commit to see a solution I drafted up; it was not as clean as I had hoped.

lawremi avatar Apr 26 '19 22:04 lawremi