validate
`check_that` throws cryptic errors when run after a pipeline step fails
Sorry in advance if this has already been asked -- I haven't seen anything about it. Pasting a reprex
that does a better job explaining what's going on than I can:
suppressPackageStartupMessages({
library(dplyr)
library(validate)
})
iris %>%
filter(
foobar > 2
)
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foobar > 2`.
#> x object 'foobar' not found
iris %>%
filter(
foo > 3
) %>%
mutate(
bar = Sepal.Length + 1
)
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
## problem:
iris %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
)
#> Error in (function (cond) : error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
## big problem:
iris %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
Created on 2021-08-13 by the reprex package (v2.0.0)
Basically, `filter` fails and then `check_that` seems to not know what to do, so it spits out a bunch of junk before printing an actual error. This is a bigger issue when you chain 100 steps together, though, because then it spits out too much junk to actually print / parse, so it's hard to figure out what's actually going wrong.
Is this a known issue / conscious choice? And if it is, what's the best way to handle this behavior?
Thanks!
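One possible stopgap for long chains, sketched under assumptions: catch the error and strip the repeated dispatch prefix so that only the innermost message is re-thrown. The helpers `strip_dispatch_prefix()` and `run_unwrapped()` are hypothetical names made up for this sketch and are not part of validate. The pattern also assumes R's messages are in English.

```r
# Hypothetical helper (not part of validate): remove every repetition of the
# dispatch prefix so that only the innermost, original message remains.
strip_dispatch_prefix <- function(msg) {
  prefix <- "error in evaluating the argument '[^']+' in selecting a method for function '[^']+': "
  gsub(prefix, "", msg)
}

# Hypothetical wrapper: evaluate a pipeline and re-throw a cleaned-up error.
# `expr` is evaluated lazily inside tryCatch(), so its errors are caught here.
run_unwrapped <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      stop(strip_dispatch_prefix(conditionMessage(e)), call. = FALSE)
    }
  )
}

# Message shape taken from the reprex above, shortened to two nested levels
nested <- paste0(
  "error in evaluating the argument 'dat' in selecting a method for function 'confront': ",
  "error in evaluating the argument 'dat' in selecting a method for function 'confront': ",
  "object 'foo' not found"
)

cleaned <- strip_dispatch_prefix(nested)
print(cleaned)
```

In use, the whole chain would go inside the wrapper, e.g. `run_unwrapped(iris %>% filter(foo > 3) %>% check_that(Sepal.Length < 1000))`. This only tidies the message; it does not change where the error is raised.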
The first error message I see comes from `filter`. This is not a validate function, so I'd go after that first. Maybe use `dplyr::filter`, and similar for the other functions? (I'm not near a computer now, so I can't test.)
@markvanderloo Oh yeah, I know the error is coming from `dplyr::filter()`. The point here was that the checks run fine when `filter()` works (and it does, I just told it to filter by `foo`, which doesn't exist), but when `filter()` bombs, it seems like I just get this long, unhelpful error message. Here's a reprex to show `filter` working:
suppressPackageStartupMessages({
library(dplyr)
library(validate)
})
iris <- tibble::as_tibble(iris)
## Filtering works fine
iris %>%
filter(
Sepal.Length > 5
)
#> # A tibble: 118 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 5.4 3.9 1.7 0.4 setosa
#> 3 5.4 3.7 1.5 0.2 setosa
#> 4 5.8 4 1.2 0.2 setosa
#> 5 5.7 4.4 1.5 0.4 setosa
#> 6 5.4 3.9 1.3 0.4 setosa
#> 7 5.1 3.5 1.4 0.3 setosa
#> 8 5.7 3.8 1.7 0.3 setosa
#> 9 5.1 3.8 1.5 0.3 setosa
#> 10 5.4 3.4 1.7 0.2 setosa
#> # … with 108 more rows
## Filtering and then piping into check_that also works fine
iris %>%
filter(
Sepal.Length > 5
) %>%
check_that(
Sepal.Length > 5
)
#> Object of class 'validation'
#> Call:
#> check_that(., Sepal.Length > 5)
#>
#> Rules confronted: 1
#> With fails : 0
#> With missings: 0
#> Threw warning: 0
#> Threw error : 0
## problem: When `filter` fails, `check_that` throws a gibberish error
iris %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
)
#> Error in (function (cond) : error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
## big problem: Multiple `check_that`s means multiple repetitions of this error
## arbitrarily many of them, as the chain gets bigger
iris %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
) %>%
filter(
foo > 3
) %>%
check_that(
Sepal.Length < 1000
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
## When filter fails and pipes into mutate, however, we still
## get the same informative error we'd expect
iris %>%
filter(
foo > 3
) %>%
mutate(
bar = Sepal.Length + 1
)
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
Created on 2021-08-13 by the reprex package (v2.0.0)
I'm not an expert on the codebase, but my suspicion from this error is that there's an S3 method for `confront` that's getting `NULL` or something similar, and is trying to call `confront.null` and is getting confused. Again, not an expert, but that's what seems like might be happening.
Also, FWIW, my mental model for what should happen here is that `check_that` shouldn't run if `filter` fails (just like `mutate` doesn't, or at least doesn't seem to). It seems to me like that's the real issue.
Ok, so I now understand your question better. The error:
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
is not thrown by `check_that()`. One clue is that it uses colorized output, which we use nowhere in `validate`. So it must ultimately come from `dplyr` or `magrittr`. If I use the R pipe, I get the same message, so it must be `dplyr` or one of its dependencies.
> iris |> filter(foo>3) |> check_that(Sepal.Length>0)
Error in (function (cond) :
error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
ℹ Input `..1` is `foo > 3`.
✖ object 'foo' not found
Thanks @markvanderloo. I think we're still getting our wires crossed. I know the error
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found
is coming from `dplyr`, but that isn't the issue I'm talking about. What I'm talking about is this piece of the error:
Error in (function (cond) :
error in evaluating the argument 'dat' in selecting a method for function 'confront'
which is clearly coming from `validate`, not `dplyr` or `magrittr`. I think the fact that you get the same behavior with the base R pipe is good evidence of that, but here's a reprex using no `dplyr` that shows the same issue:
library(validate)
library(magrittr)
## Broken, but no dplyr
iris %>%
subset(
foo > 2
) %>%
check_that(
Sepal.Length < 100
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': object 'foo' not found
## Also broken, but no dplyr and no pipe
check_that(
subset(
iris,
foo > 2
),
Sepal.Length < 100
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': object 'foo' not found
## Works fine
check_that(
subset(
iris,
Sepal.Length < 6
),
Sepal.Length < 100
)
#> Object of class 'validation'
#> Call:
#> check_that(subset(iris, Sepal.Length < 6), Sepal.Length < 100)
#>
#> Rules confronted: 1
#> With fails : 0
#> With missings: 0
#> Threw warning: 0
#> Threw error : 0
Created on 2021-08-15 by the reprex package (v2.0.1)
What I'm saying is that I think that `validate::check_that` doesn't know what to do when something that gets passed into its `dat` argument throws an error, which seems to me to be a bug. Let me know if I can be more helpful than this or if it's still not clear what I'm getting at. I think this has nothing to do with `dplyr` or `magrittr`, though. It really seems to me like it might be an S3-related error in `validate` or `confront`, but I'm not 100% sure.
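For what it's worth, the wording "in selecting a method for function" looks like what R's S4 dispatch (the `methods` package) emits when an argument fails to evaluate while a method is being chosen, which would point at S4 rather than S3. Here is a minimal sketch that reproduces the same message shape without validate. The generic `confront_demo` is a hypothetical stand-in made up for this example, and whether validate's `confront` is defined the same way is an assumption.

```r
library(methods)

# Hypothetical S4 generic standing in for validate's confront().
# The name confront_demo exists only in this sketch.
setGeneric("confront_demo", function(dat, x, ...) standardGeneric("confront_demo"))

setMethod(
  "confront_demo",
  signature(dat = "data.frame", x = "character"),
  function(dat, x, ...) nrow(dat)
)

# `dat` is evaluated during method selection; the error raised by subset()
# is caught by the dispatch machinery and re-thrown with a prefix.
msg <- tryCatch(
  confront_demo(subset(iris, foo > 2), "some rule"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should contain both the dispatch prefix naming `confront_demo` and the original `object 'foo' not found` text, which matches the pattern in the reprex.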
To reiterate from before: my mental model for what should happen in `check_that` when `dat` throws an error (like here) is that `check_that` should throw the same error without adding this other `simpleError` stuff:
Error in (function (cond) :
error in evaluating the argument 'dat' in selecting a method for function 'confront'
I would expect `check_that` to error out without throwing that error message, and just do what `dplyr` does and throw the first error. Here's a base R reprex showing what nesting like this does in `subset` when there's an error in the inner function:
subset(
subset(
iris,
foo > 2
),
Sepal.Length > 3
)
#> Error in eval(e, x, parent.frame()): object 'foo' not found
Created on 2021-08-15 by the reprex package (v2.0.1)
That's the kind of behavior I'd expect from `check_that`, too.
Let me know if this doesn't clear up what I'm thinking
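A rough sketch of how that expected behavior could be approximated from the caller's side: force the data argument in an ordinary function before handing it to the dispatching function, so that an upstream error surfaces before method selection starts. Everything here is hypothetical: `dispatch_demo` is a stand-in S4 generic and `strict_demo` is an invented wrapper. The analogous wrapper around `validate::check_that` (calling `force(dat)` first and then passing `dat` and `...` through) is an untested assumption.

```r
library(methods)

# Hypothetical stand-in for a function that dispatches on `dat` via S4
setGeneric("dispatch_demo", function(dat, ...) standardGeneric("dispatch_demo"))
setMethod("dispatch_demo", "data.frame", function(dat, ...) nrow(dat))

# Hypothetical wrapper: evaluate `dat` first, outside of method selection,
# so that any error from upstream steps is raised as-is.
strict_demo <- function(dat, ...) {
  force(dat)
  dispatch_demo(dat, ...)
}

# Direct call: the error is wrapped by the dispatch machinery
wrapped <- tryCatch(
  dispatch_demo(subset(iris, foo > 2)),
  error = function(e) conditionMessage(e)
)

# Via the wrapper: the original error comes through without the prefix
plain <- tryCatch(
  strict_demo(subset(iris, foo > 2)),
  error = function(e) conditionMessage(e)
)

print(wrapped)
print(plain)
```

Because the error is raised inside `force(dat)`, nesting several such wrappers in a chain should not stack prefixes either: the first failure propagates straight up.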