Wishlist-for-R icon indicating copy to clipboard operation
Wishlist-for-R copied to clipboard

ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Open HenrikBengtsson opened this issue 6 years ago • 8 comments

Idea

In the spirit of Issue #38 (if/while (c(TRUE, TRUE)) ...) of giving a warning (soon error), @hadley proposed in a Tweet:

@henrikbengtsson as part of new if() warning, I wonder if && and || should give warning when collapsing vector to scalar

Issue

Today we have that x || y performs x[1] || y for length(x) > 1. For instance,

> c(TRUE, TRUE) || FALSE
[1] TRUE
> c(TRUE, FALSE) || FALSE
[1] TRUE
> c(TRUE, NA) || FALSE
[1] TRUE
> c(FALSE, TRUE) || FALSE
[1] FALSE

This property is symmetric in LHS and RHS (i.e. y || x behaves the same) and it also applies to x && y.

The issue is that the above truncation of x is completely silent -there's neither an error nor a warning being produced.

Discussion/Suggestion

Using x || y and x && y with a non-scalar x or y is likely a mistake. Either the code is written assuming x and y are scalars, or there is a coding error and vectorized versions x | y and x & y were intended. Should x || y always be considered an mistake if length(x) != 1 or length(y) != 1? If so, should it be a warning or an error? For instance,

> x <- c(TRUE, TRUE)
> y <- FALSE
> x || y

Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where length(x) == 0 or length(y) == 0? Today x || y returns NA in such cases, e.g.

> logical(0) || c(FALSE, NA)
[1] NA
> logical(0) || logical(0)
[1] NA
> logical(0) && logical(0)
[1] NA

I don't know the background for this behavior, but I'm sure there is an argument behind that one. Maybe it's simply that || and && should always return a scalar logical and neither TRUE nor FALSE can be returned.

HenrikBengtsson avatar Nov 28 '17 18:11 HenrikBengtsson

I would suggest a warning for symmetry with:

> 1:2 + 1:3
[1] 2 4 4
Warning message:
In 1:2 + 1:3 :
  longer object length is not a multiple of shorter object length

Seems like a similar type of error. Changing the NA behavior seems like a much higher hurdle because if anyone ran into this in the past they would have noticed (e.g. if(NA)) so existing code may actually expet the NA in some cases.

I'm happy to help with this to the extent my abilities and other demands allow me to.

brodieG avatar Nov 28 '17 23:11 brodieG

UPDATE 2018-09-12: R-devel r75289 just gained support for this:

NEWS: Experimentally, setting environment variable _R_CHECK_LENGTH_1_LOGIC2_ will lead to warnings (or errors if the variable is set to a 'true' value) when && or || encounter and use arguments of length more than one.

HenrikBengtsson avatar Sep 12 '18 17:09 HenrikBengtsson

Hmm...

The _R_CHECK_LENGTH_1_LOGIC2_=true check on LHS || RHS applies only to the LHS and not the RHS:

> Sys.setenv("_R_CHECK_LENGTH_1_LOGIC2_"="true")
> c(TRUE, TRUE) || TRUE
Error in c(TRUE, TRUE) || TRUE : 
  'length(x) = 2 > 1' in coercion to 'logical(1)'

> TRUE || c(TRUE, TRUE)
[1] TRUE

For LHS && RHS it works as expected, e.g.

> c(TRUE, TRUE) && TRUE
Error in c(TRUE, TRUE) && TRUE : 
  'length(x) = 2 > 1' in coercion to 'logical(1)'

> TRUE && c(TRUE, TRUE)
Error in TRUE && c(TRUE, TRUE) : 
  'length(x) = 2 > 1' in coercion to 'logical(1)'

Verified on R version 3.6.1 (2019-07-05), R version 3.6.1 Patched (2019-09-12 r77183), and R Under development (unstable) (2019-10-12 r77279).

UPDATE 2019-10-12: I've filed PR17630 about this problem.

UPDATE 2019-10-12: This comment was invalid because:

Doh... brain fried. To further clarify Peter's point/explanation for future eyes. The following is indeed expected:

Sys.setenv("R_CHECK_LENGTH_1_LOGIC2"="true") TRUE || c(TRUE, TRUE) [1] TRUE FALSE || c(TRUE, TRUE) Error in FALSE || c(TRUE, TRUE) : 'length(x) = 2 > 1' in coercion to 'logical(1)'

and

FALSE && c(TRUE, TRUE) [1] FALSE TRUE && c(TRUE, TRUE) Error in TRUE && c(TRUE, TRUE) : 'length(x) = 2 > 1' in coercion to 'logical(1)'

It's only when R needs evaluates the RHS, to resolve 'LHS || RHS' or 'LHS && RHS', that it will be evaluated(*). Without evaluating RHS, it is not possible to know length(RHS). (I'm pretty sure this was also discussed in length before on R-devel when 'R_CHECK_LENGTH_1_LOGIC2' was first proposed.)

(*) This is document in help("Logic", package="base") as:

The longer form [&& and ||] evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined.

HenrikBengtsson avatar Oct 12 '19 21:10 HenrikBengtsson

UPDATE 2022-02-23: _R_CHECK_LENGTH_1_LOGIC2_=warn is the new default in R-devel (to become R 4.2.0), cf. https://github.com/wch/r-source/commit/b3b9f66345aea1259b50cd4a9fd29c8d5ce5f3a1

HenrikBengtsson avatar Feb 23 '22 22:02 HenrikBengtsson

UPDATE 2022-04-27: This will now be an error in R-devel (to become R 4.3.0), cf. Make calling && or || with either argument of length greater than one into an error.

HenrikBengtsson avatar Apr 29 '22 16:04 HenrikBengtsson

UPDATE 2022-06-02: This is now an error in R-devel (to become R 4.3.0) and there's no longer an _R_CHECK_LENGTH_1_LOGIC2_ environment variable, cf. remove _R_CHECK_LENGTH_1_LOGIC2_ .

HenrikBengtsson avatar Jun 02 '22 18:06 HenrikBengtsson

Related:

HenrikBengtsson avatar Sep 16 '22 11:09 HenrikBengtsson

UPDATE: R 4.3.0 (2023-04-21) now produces an error whenever length(x) != 1 in x || y and x && y, or when length(y) != 1 in x || y.

HenrikBengtsson avatar May 02 '23 03:05 HenrikBengtsson