
`ppc_error_scatter_avg_vs_x` has unstable residuals when the noise distribution has heavy tails; needs median not mean?

Open kruschke opened this issue 7 months ago • 2 comments

`brms::pp_check(type = "error_scatter_avg_vs_x")`, which calls `bayesplot::ppc_error_scatter_avg_vs_x`, produces residuals that vary wildly from one run to the next when the noise distribution (family) is a Student $t$ distribution with small $\nu$ (aka df).

I suspect the problem is caused by the "average" being computed as the mean, which gets wildly distorted by outliers generated by the kurtotic noise distribution. I suspect the problem would be greatly ameliorated if the average could instead be computed as the median. But there seems to be no option for this, even though some brms functions do have a `robust = TRUE` argument.
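A minimal base-R sketch of the suspected mechanism, assuming simulated draws can stand in for a fitted model's posterior predictive distribution. It does not call brms or bayesplot, and `n_obs`, `n_draws`, `nu`, and `avg_errors` are illustrative names rather than anything from either package. It averages the predictive errors over two independent sets of draws and compares how much the per-observation averages move between the two runs.

```r
# Sketch only: simulated t-distributed draws stand in for posterior
# predictive draws; no brms or bayesplot functions are used.
set.seed(1)
n_obs   <- 50    # number of observations
n_draws <- 200   # number of predictive draws per run
nu      <- 1.5   # small df, so the noise has heavy tails

x  <- seq(-1, 1, length.out = n_obs)
mu <- 2 * x                        # assumed true regression line
y  <- mu + rt(n_obs, df = nu)      # observed data with heavy-tailed noise

# Average predictive error per observation for one fresh set of draws.
avg_errors <- function(avg) {
  # yrep is a draws x observations matrix; column j is centered on mu[j]
  yrep <- matrix(rep(mu, each = n_draws) + rt(n_draws * n_obs, df = nu),
                 nrow = n_draws, ncol = n_obs)
  errors <- t(y - t(yrep))         # y minus each draw, draws x observations
  apply(errors, 2, avg)            # collapse over draws
}

# Largest run-to-run change in the averaged error, over all observations.
spread_mean   <- max(abs(avg_errors(mean)   - avg_errors(mean)))
spread_median <- max(abs(avg_errors(median) - avg_errors(median)))

c(mean = spread_mean, median = spread_median)
```

With a small `nu`, a single extreme draw can shift the mean for an observation by a large amount, while the median barely moves, so the first number should come out much larger than the second.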

The linked HTML (and .Rmd) files contain two examples, identical to each other except that the first uses data from a highly kurtotic $t$ distribution, while the second uses data from an essentially normal $t$ distribution. HTML: https://drive.google.com/file/d/1aGwO9i7RuXQoVVzkHkV4wC02uqKs43ys/view?usp=drive_link .Rmd: https://drive.google.com/file/d/1ksRr_2VOgMornK3s_Dk5sA0xoKPOywRA/view?usp=drive_link

kruschke avatar May 13 '25 18:05 kruschke

Thanks for the issue. (Fan of your book with the dogs!)

I submitted a patch (#349) to let users override the mean function via a `fun_avg` argument.
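As a rough illustration of what a function-valued averaging argument does, here is a base-R sketch. `average_errors` is a hypothetical helper written for this comment, not bayesplot's implementation, and the actual signature and defaults in the patch may differ.

```r
# Hypothetical helper, not bayesplot code: average the predictive errors
# (y - yrep) over draws using a caller-supplied function.
average_errors <- function(y, yrep, fun_avg = mean) {
  fun_avg <- match.fun(fun_avg)    # accept a function or its name
  stopifnot(is.matrix(yrep), ncol(yrep) == length(y))
  errors <- t(y - t(yrep))         # draws x observations
  apply(errors, 2, fun_avg)        # one averaged error per observation
}

y    <- c(1, 2, 3)
yrep <- rbind(c(1, 2, 3),
              c(0, 2, 4),
              c(2, 2, 102))        # one wild draw for the third observation

average_errors(y, yrep)                      # default: mean over draws
average_errors(y, yrep, fun_avg = "median")  # robust alternative
```

In this toy input the single extreme draw drags the mean error for the third observation to about -33, while the median error stays at -1.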

tjmahr avatar May 13 '25 20:05 tjmahr

Thanks @kruschke, I agree we should provide an option for using other functions for averaging. Thanks @tjmahr for the PR. Will review soon.

jgabry avatar May 14 '25 17:05 jgabry