check_model() linearity & variance for categorical predictors

Open DominiqueMakowski opened this issue 1 month ago • 3 comments

For models with a categorical predictor (3 levels here):

star <- read.csv("https://drmankin.github.io/disc_stats/star.csv") |>
  dplyr::mutate(dplyr::across(dplyr::starts_with("star"), forcats::as_factor))

star_maths_lm <- lm(math2 ~ star2, data = star, na.action = na.exclude)

performance::check_model(star_maths_lm)
Image
plot(star_maths_lm, which = 1)
Image

The variance and linearity plots could arguably be more informative, in particular the one used to compare the variance across the different groups.

In a way, the very large CI masks the variability in the data points. But I'm not sure what the best way to address it is:

  • Add an option to simply drop the CI ribbon?
  • Have bespoke plots for categorical predictors?
  • Alternatives?
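
As a rough illustration of the "bespoke plot" option, here is a minimal base-R sketch of what a per-group residual view might look like. It is not what check_model() does and not a proposed implementation. It uses the built-in PlantGrowth data (one 3-level factor) as a stand-in for the star data, and the names fit, diag_df and group_sd are made up for the example.

```r
# Stand-in data (assumption): built-in PlantGrowth, one 3-level factor,
# used here instead of the star data so the example is self-contained.
fit <- lm(weight ~ group, data = PlantGrowth)

# Hypothetical helper data frame: residuals alongside the predictor levels.
diag_df <- data.frame(
  group = PlantGrowth$group,
  resid = residuals(fit)
)

# With a single categorical predictor the fitted values are the group means,
# so the residual spread per group is what the variance panel needs to show.
group_sd <- tapply(diag_df$resid, diag_df$group, sd)
print(group_sd)

# Boxes plus jittered raw points, with no smoothed line and no CI ribbon.
boxplot(resid ~ group, data = diag_df, outline = FALSE,
        ylim = range(diag_df$resid),
        xlab = "group", ylab = "Residuals")
stripchart(resid ~ group, data = diag_df, vertical = TRUE,
           method = "jitter", pch = 16, add = TRUE)
abline(h = 0, lty = 2)
```

Drawing the raw points over the boxes keeps the within-group variability visible, which is the part the wide ribbon currently hides.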

DominiqueMakowski avatar Nov 04 '25 10:11 DominiqueMakowski

Duplicate of https://github.com/easystats/performance/issues/642

DominiqueMakowski avatar Nov 04 '25 11:11 DominiqueMakowski

Sir @DominiqueMakowski, I have raised a PR regarding this issue. Could you please review it and give your feedback?

https://github.com/easystats/performance/pull/874

ANAMASGARD avatar Nov 14 '25 07:11 ANAMASGARD

As mentioned here, maybe we should first try to disable the confidence bands. I think this already makes the plot more readable, especially since other plotting functions also show the data points (but no CI). I think the CI are problematic here, not necessarily the data points (at least not for all plots).

strengejacke avatar Nov 14 '25 08:11 strengejacke