performance
Odd plots for categorical variable only models
When I look at lm() objects with only categorical predictors and evaluate fitted vs. residuals (for linearity) or sqrt(abs(std residuals)) (for homogeneity), I get some very odd plots, particularly relative to good old plot(). There are also some very interesting warnings. This seems like non-ideal behavior.
library(palmerpenguins)
library(performance)
mod <- lm(body_mass_g ~ sex, data = penguins)
check_model(mod, check = "linearity") |> plot()
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : pseudoinverse used at 3858.9
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : neighborhood radius 686.83
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : reciprocal condition number 1.3398e-15
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : There are other near singularities as well. 4.7173e+05
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at
#> 3858.9
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
#> 686.83
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
#> number 1.3398e-15
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : There are other near
#> singularities as well. 4.7173e+05
check_model(mod, check = "homogeneity") |> plot()
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : pseudoinverse used at 3858.9
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : neighborhood radius 686.83
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : reciprocal condition number 1.3398e-15
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : There are other near singularities as well. 4.7173e+05
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at
#> 3858.9
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
#> 686.83
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
#> number 1.3398e-15
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : There are other near
#> singularities as well. 4.7173e+05
Created on 2022-10-13 with reprex v2.0.2
Compare this to plot():
library(palmerpenguins)
library(performance)
mod <- lm(body_mass_g ~ sex, data = penguins)
plot(mod, which = 1)
Created on 2022-10-13 with reprex v2.0.2
The issue is trying to fit a smoother line to only two fitted values.
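A minimal sketch of why that happens, using the built-in ToothGrowth data as a stand-in for penguins (an assumption, chosen so the example runs without extra packages). With a single two-level factor as the predictor, the fitted values are just the two group means, so a loess smoother has only two distinct x positions to work with.

```r
# Stand-in for the penguins model: one two-level factor predictor (supp).
mod <- lm(len ~ supp, data = ToothGrowth)

# Round before de-duplicating: fitted values within a group can differ
# by floating-point noise even though they are conceptually identical.
fitted_vals <- unique(round(fitted(mod), 6))
fitted_vals

# The distinct fitted values are the per-group means of the response.
group_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
group_means
```

The same check on the penguins model should also give two distinct values, one per sex.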
We should detect when all predictors in a model are categorical and then draw something like raincloud/violin/boxplots instead.
Agreed. But I'm curious why the points didn't show up properly as well.
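A rough sketch of what that detection could look like in base R. This is not how performance does (or will do) it. The names `all_categorical` and `groups` are made up for illustration, ToothGrowth again stands in for penguins, and taking every model-frame column after the first as a predictor assumes no weights or offsets.

```r
mod <- lm(len ~ supp, data = ToothGrowth)

# Model frame: first column is the response, the rest are predictors
# (assumes a simple model without weights or offsets).
mf <- model.frame(mod)
predictors <- mf[-1]

# TRUE only if every predictor is a factor, character, or logical.
all_categorical <- all(vapply(
  predictors,
  function(x) is.factor(x) || is.character(x) || is.logical(x),
  logical(1)
))

if (all_categorical) {
  # One group per observed combination of predictor levels.
  groups <- interaction(predictors, drop = TRUE)
  # Boxplots of residuals per group instead of a scatter plus smoother.
  boxplot(residuals(mod) ~ groups, xlab = "Group", ylab = "Residuals")
}
```

Boxplots are used here only because they are in base R. A raincloud or violin plot would need the same grouping variable.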
They do, but the scale on the y axis is HUGE....
OH! It's due to the smoother. Ah. That makes sense. Is there a way to remove it?
Not currently.
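In the meantime, one possible workaround is to draw the residual plot by hand and leave the smoother out. This is a base R sketch only, not a check_model() option, and it again uses ToothGrowth as a stand-in for penguins.

```r
mod <- lm(len ~ supp, data = ToothGrowth)

res <- residuals(mod)
fit <- fitted(mod)

# Residuals vs. fitted values without a smoother, so the y axis is
# limited to the range of the residuals themselves.
plot(
  fit, res,
  ylim = range(res),
  xlab = "Fitted values",
  ylab = "Residuals",
  main = "Residuals vs. fitted (no smoother)"
)
abline(h = 0, lty = 2)  # reference line at zero
```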