vctrs
`anyDuplicated.vctrs_vctr` difference undocumented
This could be related to https://github.com/r-lib/vctrs/issues/180 and how `anyDuplicated.vctrs_vctr()` was originally conceived. `vctrs:::anyDuplicated.vctrs_vctr()` uses `vec_duplicate_any()`, which is documented as similar to `anyDuplicated()` but returns a single logical value rather than an integer index. For the exported S3 method, that difference isn't explicitly documented and may not be obvious.
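To make the mismatch concrete, here is a minimal sketch comparing the two calls on a plain double vector. The variable names are illustrative only.

```r
library(vctrs)

y <- c(3, 1, 2, 3)

# vctrs reports only whether any duplicate exists (a single logical)
flag <- vec_duplicate_any(y)

# base R reports the index of the first duplicated element (an integer),
# or 0L when there are no duplicates
pos <- anyDuplicated(y)

is.logical(flag)
is.integer(pos)
```

Code that treats the result as an index, for example `y[anyDuplicated(y)]`, would therefore behave differently depending on which of the two is dispatched.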
I ran into this error using `haven::labelled()` but traced it to the use of the `vctrs_vctr` class:
x <- c(3, 1, 2)
y <- c(x, 3)
anyDuplicated(x)
#> [1] 0
anyDuplicated(y)
#> [1] 4
x <- haven::labelled(c(3, 2, 1))
y <- c(x, 3)
anyDuplicated(x)
#> [1] FALSE
anyDuplicated(y)
#> [1] TRUE
# quick fix
anyDuplicated(unclass(x))
#> [1] 0
anyDuplicated(unclass(y))
#> [1] 4
Created on 2021-09-20 by the reprex package (v2.0.1)
We should probably be more compatible with what `anyDuplicated()` is supposed to return. So something like this:
library(vctrs)
library(rlang)
any_duplicated_vctr <- function(x,
                                incomparables = FALSE,
                                fromLast = FALSE,
                                ...) {
  if (!is_false(incomparables)) {
    warn("The <vctrs_vctr> method for `anyDuplicated()` does not respect `incomparables`.")
  }
  if (!is_bool(fromLast)) {
    abort("`fromLast` must be a single `TRUE` or `FALSE`.")
  }

  duplicates <- vec_duplicate_detect(x)
  duplicates <- which(duplicates)

  if (length(duplicates) == 0L) {
    return(0L)
  }

  # `vec_duplicate_detect()` returns first and all subsequent repeats,
  # but `anyDuplicated()` only returns the 2nd repeat
  if (fromLast) {
    i <- length(duplicates) - 1L
  } else {
    i <- 2L
  }

  duplicates[[i]]
}
x <- c(1, 1, 2, 2)
anyDuplicated(x)
#> [1] 2
any_duplicated_vctr(x)
#> [1] 2
anyDuplicated(x, fromLast = TRUE)
#> [1] 3
any_duplicated_vctr(x, fromLast = TRUE)
#> [1] 3
Created on 2021-09-20 by the reprex package (v2.0.0.9000)
Update: But this isn't quite right:
> anyDuplicated(c(1, 2, 1, 2))
[1] 3
> anyDuplicated.vctrs_vctr(c(1, 2, 1, 2))
[1] 2
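The mismatch seems to come from `vec_duplicate_detect()` flagging every member of a duplicated group, including the first occurrence, so the second `TRUE` is not necessarily the first repeat. One possible direction, sketched here under assumptions, is to use `vec_duplicate_id()`, which gives the location of each value's first occurrence. The name `any_duplicated_sketch()` is hypothetical and not part of vctrs, and the sketch ignores `incomparables` entirely.

```r
library(vctrs)

# Hypothetical helper, not part of vctrs: returns the index of the first
# element that repeats an earlier one, or 0L if there is none.
any_duplicated_sketch <- function(x, fromLast = FALSE) {
  n <- vec_size(x)

  if (fromLast) {
    # Scan from the end by reversing; the hit is mapped back below
    x <- vec_slice(x, rev(seq_len(n)))
  }

  # An element is a repeat exactly when the location of its value's
  # first occurrence is not its own location
  repeats <- which(vec_duplicate_id(x) != seq_len(n))

  if (length(repeats) == 0L) {
    return(0L)
  }

  first <- repeats[[1L]]
  if (fromLast) n - first + 1L else first
}

any_duplicated_sketch(c(1, 2, 1, 2))
#> [1] 3
any_duplicated_sketch(c(1, 1, 2, 2), fromLast = TRUE)
#> [1] 3
```

Reversing the input for `fromLast = TRUE` costs an extra copy, but it keeps a single code path for both directions.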