vctrs
vctrs copied to clipboard
vec_slice<-() and unspecified
Currently a vector full of NA
(or even a "vctrs_unspecified"
) never changes the type when populated with vec_slice<-()
.
library(vctrs)
a <- NA
vec_slice(a, 1) <- 2
#> Error: Can't convert from <double> to <logical> due to loss of precision.
#> * Locations: 1
a
#> [1] NA
a <- unspecified(1)
vec_slice(a, 1) <- 2
#> Error: Can't convert <double> to <vctrs_unspecified>.
a
#> <unspecified> [1]
Created on 2020-07-05 by the reprex package (v0.3.0)
Should this work? What's the purpose of the "unspecified" class?
If we keep the current behavior: should we be able to update tibble columns that are initialized with NA
? My gut feeling is that vctrs and tibble should be consistent here, the current approach is safer but requires a typed initialization.
My feeling is that at R level maybe it would make sense that this works, but not at the C level where it's important that the type doesn't change. Does this strictness commonly causes issues? If there's just the linked issue then I'd prefer to wait until we get more feedback, it seems like it's not that common.
What's the purpose of the "unspecified" class?
It's an internal type, see https://vctrs.r-lib.org/reference/internal-faq-ptype2-identity.html
This comes up occasionally, maybe we can solve with documentation?
I don't mind working around in tibble for now.
As discussed before, I think tibble and vctrs should have consistent behaviour.
Adding a workaround to tibble raised the following questions:
- Are
NA_integer_
and"x"
compatible? - Is
tibble(a = integer(), b = logical())
compatible totibble(a = 3, b = "x")
? - Is
tibble(a = 1, b = NA)
compatible totibble(a = 3, b = "x")
?
It seems that we need to bite the bullet and really iterate over a bare logical to decide if it's "unspecified". OTOH, we can decide much faster if the value is of class "unspecified"
.
These questions have long been settled in vctrs.
- Are NA_integer_ and "x" compatible?
- Is tibble(a = integer(), b = logical()) compatible to tibble(a = 3, b = "x") ?
No because only NA
is unspecified.
- Is tibble(a = 1, b = NA) compatible to tibble(a = 3, b = "x") ?
Yes.
It seems that we need to bite the bullet and really iterate over a bare logical to decide if it's "unspecified"
A linear search through an array is the worst case (the unspecified case). And that worst case is pretty fast even for large N.
OTOH, we can decide much faster if the value is of class "unspecified" .
This is an internal class that should only be created by vec_ptype()
(through a linear search).
See also: https://github.com/r-lib/vctrs/issues/1514
And: https://github.com/tidyverse/tidyr/issues/1317
In particular, this is another case we haven't considered here:
It also affects the case of `vec_assign(NA, 1L, 0)`, where `0` is cast to `FALSE` in the current behavior.
It we changed this so the LHS is cast to the type of the RHS in this case, we'd get `0` as the result.