rstanarm
rstanarm copied to clipboard
Require 'data' argument?
I'm strongly in favor of this. Here are pros/cons:
Reason to require the data argument:
- Not requiring the
dataargument when specifying a model leads to all sorts of challenges for successfully implementing and maintaining the internals of rstanarm (or any other modeling package really).
Reasons not to require the data argument:
- Base R's
lm/glmand lme4'slmer/glmerset a very unfortunate precedent of not requiring this so we would be breaking with tradition if we were to require it. - @andrewgelman seems to like using the global environment but maybe we can convince him that the convenience is not worth the problems it introduces
+1 the global environment stuff is a nightmare. If you want to make people's life "easy" allow some environment to be used as the data argument and then people can pass .RGlobalEnv and handle their own scandal if it breaks everything.
I have recently introduced the requirement for data in brms, since, as you say, not doing so comes with a whole lot of problems. In particular, users may not realize that they named a variable incorrectly, when another variable is accidently taken from global env, instead. When requiring data, you can also improve error messages on which variables are missing and reduce uninformative error messages from stats::model.frame. Further 95%+ of users are passing something to data, anyway, so the change won't affect that many people.
@jgabry I'd prefer this too. Regardless to how things are implemented in a lot of models that rstanarm emulates, I think it's important to encourage best practices (e.g. requiring the user to declare their data in the function call).
@sakrejda @paul-buerkner @imadmali Thanks for the feedback. Glad you agree. Unless someone (@andrewgelman ?) has a strong argument for keeping the current behavior I think we should go ahead and make this change for the next release.
Enforce it even if Andrew disagrees.
On Wed, Feb 1, 2017 at 12:54 PM, Jonah Gabry [email protected] wrote:
@sakrejda https://github.com/sakrejda @paul-buerkner https://github.com/paul-buerkner @imadmali https://github.com/imadmali Thanks for the feedback. Glad you agree. Unless someone (@andrewgelman https://github.com/andrewgelman ?) has a strong argument for keeping the current behavior I think we should go ahead and make this change for the next release.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstanarm/issues/155#issuecomment-276729927, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqlrjn4Ox3sCfgeIQpU-KPMbF15usks5rYMa6gaJpZM4L0KJc .
The relevant helper function if you're a die-hard would be:
aim_at_foot <- function() as.list(.GlobalEnv)
data = aim_at_foot()
Even then at least it's explicit!
There's now a PR (#156) introducing a deprecation warning if the data argument is omitted. Eventually we can make it an error instead of a warning.
The relevant helper function if you're a die-hard would be:
aim_at_foot <- function() as.list(.GlobalEnv) data = aim_at_foot()Even then at least it's explicit!
Hah, nice!
About time to start enforcing it for real?