assertr
assert has error for single-row data frames and multiple columns
assert errors when making an assertion on multiple columns for a single-row data frame.
My guess is that the issue involves the internals somewhere incorrectly treating the row as a vector, rather than as multiple length-one vectors. assert works fine with one-row-one-column, or multiple-rows-multiple-columns.
Reproducible example, asserting on two columns with a single-row data frame:
library(assertr)
library(magrittr)
head(mtcars, 1) %>% assert(not_na, am, vs)
#> Error in dimnames(x) <- dn: length of 'dimnames' [2] not equal to array extent
packageVersion("assertr")
#> [1] '2.7'
Note that this error does not occur when asserting about only a single column for a single-row data frame:
library(assertr)
library(magrittr)
head(mtcars, 1) %>% assert(not_na, am)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
or asserting about multiple columns with multiple-row data frames:
library(assertr)
library(magrittr)
head(mtcars, 2) %>% assert(not_na, am, vs)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
Created on 2020-03-05 by the reprex package (v0.3.0)
I have this same issue. Is there any resolution or workarounds for this at present?
Sorry for the delay. I can't replicate this bug on version 2.8. Can you try that? It should be on CRAN.
Hi, I am experiencing an issue similar to the one @jayqi described at the beginning of this thread. I also cannot replicate the original behaviour in version 2.8, but I have run into a different problem. In my case, the assert statement does not report properly for a single-row data frame when that row actually violates the condition being checked.
packageVersion("assertr")
#> [1] ‘2.8’
Here is a reproducible example for 4 different data frames:
library(assertr)
library(magrittr)
data("mtcars")
Defining 4 different data frames:
data_multirow <- mtcars
data_multirow_with_NA <- data_multirow
data_multirow_with_NA[1,1] <- NA
data_single_row <- head(data_multirow, n = 1)
data_single_row_with_NA <- head(data_multirow_with_NA, n = 1)
Checking the mpg column across 4 data frames:
- multi-row data frame without any NAs
data_multirow %>% assert(not_na, mpg)
# prints the data frame
- multi-row data frame with NA
data_multirow_with_NA %>% assert(not_na, mpg)
#> Column 'mpg' violates assertion 'not_na' 1 time
#> verb redux_fn predicate column index value
#> 1 assert NA not_na mpg 1 NA
#>
#> Error: assertr stopped execution
- single-row data frame without any NAs
data_single_row %>% assert(not_na, mpg)
# prints the data frame
- single-row data frame with NA
data_single_row_with_NA %>% assert(not_na, mpg)
#> Error: assertr stopped execution
As you can see, the last data frame example lacks the proper output.
After some investigation, it looks like the issue is in using the sapply function when creating log.mat inside the assert function definition:
log.mat <- sapply(colnames(sub.frame), function(column) {
  this.vector <- sub.frame[[column]]
  return(apply.predicate.to.vector(this.vector, predicate))
})
When we have a multi-row data frame, this call returns a matrix with one column per selected column; when the input has only one row, sapply simplifies the result to a plain vector instead. Therefore, later, in the part where errors are assigned (errors <- lapply(...), colnames(log.mat) is NULL, as there are no columns. I believe that is why, for single-row data frames that should fail the assert statement, no errors are reported even though they should be.
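The shape change described above can be reproduced in base R without assertr. This is only a sketch: check_cols is a hypothetical helper that mirrors the sapply pattern quoted above, and !is.na() stands in for the not_na predicate. It is not assertr's actual code.

```r
# Hypothetical helper mirroring the sapply pattern quoted above.
# !is.na() is a stand-in for assertr's not_na predicate.
check_cols <- function(df) {
  sapply(colnames(df), function(column) !is.na(df[[column]]))
}

two_rows <- head(mtcars, 2)[, c("mpg", "cyl")]
one_row  <- head(mtcars, 1)[, c("mpg", "cyl")]

res_two <- check_cols(two_rows)
res_one <- check_cols(one_row)

is.matrix(res_two)  # TRUE: one column per checked column
colnames(res_two)   # "mpg" "cyl"

is.matrix(res_one)  # FALSE: sapply simplified to a named vector
colnames(res_one)   # NULL, so code that loops over colnames() sees nothing
names(res_one)      # "mpg" "cyl"
```

With one row, each per-column result has length one, so sapply collapses the results into a named vector and the column names are only reachable through names().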
@tonyfischetti could you please have a look at this? I will be grateful for a comment regarding this issue :)
@mariadrywien I'm working on this now. Should be done by the end of the day
@mariadrywien Just fixed it. It's in master now. I'm submitting to CRAN now
@tonyfischetti Thanks a lot! Works fine now 🙂
Hi @tonyfischetti
Unfortunately, the issue is still present in the case when you use assert with multiple columns. Please check this example (based on the previous one):
library(assertr)
library(magrittr)
data("mtcars")
data_multirow_with_NA <- mtcars
data_multirow_with_NA[1,1] <- NA
data_single_row_with_NA <- head(data_multirow_with_NA, n = 1)
# single-row data frame with NA
data_single_row_with_NA %>% assert(not_na, mpg, cyl)
#> Error: assertr stopped execution
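For reference, one way to keep per-column results two-dimensional regardless of the number of rows is to build a list with lapply and bind it column-wise. This is a base-R sketch only: check_cols_stable is a hypothetical name, !is.na() stands in for not_na, and nothing here is claimed to match how assertr is or will be implemented.

```r
# Hypothetical, shape-stable variant: always returns a matrix,
# even for a single-row data frame with several columns.
check_cols_stable <- function(df) {
  per_column <- lapply(df, function(v) !is.na(v))  # named list, one entry per column
  do.call(cbind, per_column)                       # nrow(df) x ncol(df) logical matrix
}

one_row <- head(mtcars, 1)[, c("mpg", "cyl")]
one_row[1, "mpg"] <- NA

res <- check_cols_stable(one_row)
dim(res)       # 1 2
colnames(res)  # "mpg" "cyl"

# Columns with at least one violation can still be found by name.
failing <- colnames(res)[colSums(!res) > 0]
failing        # "mpg"
```

Because cbind keeps the list names as column names, the one-row, two-column case from the example above still yields a matrix whose colnames() can be iterated.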