S4Vectors
S4Vectors copied to clipboard
bug in `recycleSingleBracketReplacementValue`?
Hi Michael,
I was testing on the DataFrame after refactoring of S4Vectors with new internal functions, and I got this warning / error for replacement value:
packageVersion("S4Vectors")
## [1] ‘0.21.24’
aa <- DataFrame(letters, numbers = 26:1)
aa[1,] <- c("aa", 1)
## Warning message:
## In recycleSingleBracketReplacementValue(value, x, i) :
## number of values supplied is not a sub-multiple of the number of values to be replaced
aa
## DataFrame with 26 rows and 2 columns
## letters numbers
## <character> <character>
## 1 aa aa
## 2 b 25
## 3 c 24
## 4 d 23
## 5 e 22
## ... ... ...
## 22 v 5
## 23 w 4
## 24 x 3
## 25 y 2
## 26 z 1
row replacement with single value works!
aa[2, ] <- "bb"
aa
## DataFrame with 26 rows and 2 columns
## letters numbers
## <character> <character>
## 1 aa aa
## 2 bb bb
## 3 c 24
## 4 d 23
## 5 e 22
## ... ... ...
## 22 v 5
## 23 w 4
## 24 x 3
## 25 y 2
## 26 z 1
I'm inclined to declare that this is unsupported. It's true that data.frame supports this by wrapping the vector up into a matrix and essentially converting that to a data.frame. However, that is complex behavior and generally has undesirable properties (in your case, the "numbers" column becomes character). It would be much better to create a list with list("aa", 1L), because it is more obvious that those values correspond to columns.
Hi Michael, Given that DataFrame (like data.frame) already supports a matrix on the right of the subassignment, it would seem natural that it also supports an atomic vector. H.
Seems like a different case to me. The way I think about it (and how it's implemented) is that the RHS is coerced to a DataFrame. A matrix has an obvious coercion to a DataFrame, but the behavior for an atomic vector would be different.
Actually, it's also broken when the right value is a matrix:
> ab <- DataFrame(a=11:14, b=21:24)
> ab[2:4, ] <- matrix(letters[1:6], nrow=3)
> ab
DataFrame with 4 rows and 2 columns
a b
<integer> <integer>
1 11 21
2 1 1
3 2 2
4 3 3
That's expected given the simple rule of coercing to a DataFrame. It's only surprising because what happens when subassigning a factor into an integer vector is surprising.
To stick to the principle of mimicking the behavior of data.frame here, I could set stringsAsFactors=FALSE during coercion to the DataFrame. The base code gets away with breaking everything down to a list but that approach does not fit well with the strictness of S4Vectors.
It seems to me that coercion from character matrix to DataFrame should be fixed in general, not just in the context of coercing the RHS to DataFrame. Other coercions from character-based object to DataFrame have broke free long ago from the non-sense factor thing.
Once [<- with an RHS matrix is corrected, it should be pretty straight forward to support an atomic vector RHS (would just need to be recycled to the appropriate length and turned into a matrix with the appropriate geometry).
Hope that makes sense. Thanks!
I guess we can go that way, arguing for consistency with the other coercions.
There's nothing wrong with matrix RHS per se; there's just the general issues with character to factor coercion. The term "straight-forward" is a relative. This stuff is pretty tricky; every change seems to have a domino effect. Check the previous commit to see a solution I drafted up; it was not as clean as I had hoped.