vctrs icon indicating copy to clipboard operation
vctrs copied to clipboard

vec_slice<-() and unspecified

Open krlmlr opened this issue 3 years ago • 8 comments

Currently a vector full of NA (or even a "vctrs_unspecified") never changes the type when populated with vec_slice<-().

library(vctrs)

a <- NA
vec_slice(a, 1) <- 2
#> Error: Can't convert from <double> to <logical> due to loss of precision.
#> * Locations: 1
a
#> [1] NA

a <- unspecified(1)
vec_slice(a, 1) <- 2
#> Error: Can't convert <double> to <vctrs_unspecified>.
a
#> <unspecified> [1]

Created on 2020-07-05 by the reprex package (v0.3.0)

Should this work? What's the purpose of the "unspecified" class?

If we keep the current behavior: should we be able to update tibble columns that are initialized with NA? My gut feeling is that vctrs and tibble should be consistent here, the current approach is safer but requires a typed initialization.

krlmlr avatar Jul 05 '20 16:07 krlmlr

My feeling is that at R level maybe it would make sense that this works, but not at the C level where it's important that the type doesn't change. Does this strictness commonly causes issues? If there's just the linked issue then I'd prefer to wait until we get more feedback, it seems like it's not that common.

lionel- avatar Jul 06 '20 08:07 lionel-

What's the purpose of the "unspecified" class?

It's an internal type, see https://vctrs.r-lib.org/reference/internal-faq-ptype2-identity.html

lionel- avatar Jul 06 '20 08:07 lionel-

This comes up occasionally, maybe we can solve with documentation?

krlmlr avatar Jul 06 '20 13:07 krlmlr

I don't mind working around in tibble for now.

krlmlr avatar Mar 18 '21 06:03 krlmlr

As discussed before, I think tibble and vctrs should have consistent behaviour.

lionel- avatar Mar 18 '21 08:03 lionel-

Adding a workaround to tibble raised the following questions:

  • Are NA_integer_ and "x" compatible?
  • Is tibble(a = integer(), b = logical()) compatible to tibble(a = 3, b = "x") ?
  • Is tibble(a = 1, b = NA) compatible to tibble(a = 3, b = "x") ?

It seems that we need to bite the bullet and really iterate over a bare logical to decide if it's "unspecified". OTOH, we can decide much faster if the value is of class "unspecified" .

krlmlr avatar Jul 21 '21 03:07 krlmlr

These questions have long been settled in vctrs.

  • Are NA_integer_ and "x" compatible?
  • Is tibble(a = integer(), b = logical()) compatible to tibble(a = 3, b = "x") ?

No because only NA is unspecified.

  • Is tibble(a = 1, b = NA) compatible to tibble(a = 3, b = "x") ?

Yes.

It seems that we need to bite the bullet and really iterate over a bare logical to decide if it's "unspecified"

A linear search through an array is the worst case (the unspecified case). And that worst case is pretty fast even for large N.

OTOH, we can decide much faster if the value is of class "unspecified" .

This is an internal class that should only be created by vec_ptype() (through a linear search).

lionel- avatar Jul 21 '21 05:07 lionel-

See also: https://github.com/r-lib/vctrs/issues/1514

And: https://github.com/tidyverse/tidyr/issues/1317

In particular, this is another case we haven't considered here:

It also affects the case of `vec_assign(NA, 1L, 0)`, where `0` is cast to `FALSE` in the current behavior.

It we changed this so the LHS is cast to the type of the RHS in this case, we'd get `0` as the result.

DavisVaughan avatar Feb 08 '22 16:02 DavisVaughan