rstanarm icon indicating copy to clipboard operation
rstanarm copied to clipboard

allow sigma to vary by group

Open jgabry opened this issue 9 years ago • 10 comments

For models with a Gaussian outcome and some form of grouping structure, allow sigma to vary by level of a group.

jgabry avatar Nov 09 '15 19:11 jgabry

Why only a Gaussian outcome? Why not all observation models with scale parameters?

avehtari avatar Feb 03 '16 17:02 avehtari

At the time the issue was opened I think only Gaussian was enabled. But yeah, we could do this for other distributions with scale parameters too.

On Wednesday, February 3, 2016, Aki Vehtari [email protected] wrote:

Why only a Gaussian outcome? Why not all observation models with scale parameters?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstanarm/issues/34#issuecomment-179359633.

jgabry avatar Feb 03 '16 17:02 jgabry

Before we can get this in rstanarm we need a new syntax for specifying this feature of a model.

What about an optional formula argument for sigma? For instance,

stan_glm(y ~ x1 + x2, sigma = ~ group)

On Wednesday, February 3, 2016, Jonah Sol Gabry [email protected] wrote:

At the time the issue was opened I think only Gaussian was enabled. But yeah, we could do this for other distributions with scale parameters too.

On Wednesday, February 3, 2016, Aki Vehtari <[email protected] javascript:_e(%7B%7D,'cvml','[email protected]');> wrote:

Why only a Gaussian outcome? Why not all observation models with scale parameters?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstanarm/issues/34#issuecomment-179359633.

jgabry avatar Feb 05 '16 06:02 jgabry

Are you saying here that the variance parameter will vary by group? In which case maybe we want something like:

stan_glm(formula1, formula2, …) so that it would look like stan_glm(y ~ x1 + x2, sigma ~ (1 | group))

On Feb 5, 2016, at 1:18 AM, Jonah Gabry [email protected] wrote:

Before we can get this in rstanarm we need a new syntax for specifying this feature of a model.

What about an optional formula argument for sigma? For instance,

stan_glm(y ~ x1 + x2, sigma = ~ group)

On Wednesday, February 3, 2016, Jonah Sol Gabry [email protected] wrote:

At the time the issue was opened I think only Gaussian was enabled. But yeah, we could do this for other distributions with scale parameters too.

On Wednesday, February 3, 2016, Aki Vehtari <[email protected] javascript:_e(%7B%7D,'cvml','[email protected]');> wrote:

Why only a Gaussian outcome? Why not all observation models with scale parameters?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstanarm/issues/34#issuecomment-179359633.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstanarm/issues/34#issuecomment-180217805.

andrewgelman avatar Feb 05 '16 06:02 andrewgelman

Yeah exactly. Something like that.

On Fri, Feb 5, 2016 at 1:20 AM, Andrew Gelman [email protected] wrote:

Are you saying here that the variance parameter will vary by group? In which case maybe we want something like:

stan_glm(formula1, formula2, …) so that it would look like stan_glm(y ~ x1

  • x2, sigma ~ (1 | group))

On Feb 5, 2016, at 1:18 AM, Jonah Gabry [email protected] wrote:

Before we can get this in rstanarm we need a new syntax for specifying this feature of a model.

What about an optional formula argument for sigma? For instance,

stan_glm(y ~ x1 + x2, sigma = ~ group)

On Wednesday, February 3, 2016, Jonah Sol Gabry [email protected] wrote:

At the time the issue was opened I think only Gaussian was enabled. But yeah, we could do this for other distributions with scale parameters too.

On Wednesday, February 3, 2016, Aki Vehtari <[email protected] javascript:_e(%7B%7D,'cvml','[email protected]');> wrote:

Why only a Gaussian outcome? Why not all observation models with scale parameters?

— Reply to this email directly or view it on GitHub < https://github.com/stan-dev/rstanarm/issues/34#issuecomment-179359633>.

— Reply to this email directly or view it on GitHub < https://github.com/stan-dev/rstanarm/issues/34#issuecomment-180217805>.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstanarm/issues/34#issuecomment-180218064.

jgabry avatar Feb 05 '16 06:02 jgabry

To prepare for the future, it's good to also think that there might be more output distribution families with other parameters than sigma. For example, people making survival analysis would love Weibull, then the parameters would be lambda (scale parameter) and k (shape parameter). Is is then easy to replace sigma with k and have stan_glm(y ~ x1 + x2, k ~ (1 | group), family=Weibull) ?

avehtari avatar Feb 05 '16 07:02 avehtari

Because the variable on the LHS of this optional formula will depend on the family, I think we should include it as an argument to rstanarm_family() rather than the modeling function (e.g. stan_glm itself).

We will be introducing rstanarm_family() in the next release as an alternative way of specifying the family that also allows the user to tweak family-specific hyperparameters in a cleaner way than in now possible. For example,

stan_glm(y ~ x, family = "gaussian")

will still be allowed and will use the default hyperparameters, whereas

stan_glm(y ~ x, family = rstanarm_family("gaussian", prior_scale_for_dispersion = 2))

allows the user to set the scale for the half-Cauchy prior on the dispersion (scale of the regression SE).

So we could also allow

rstanarm_family("gaussian", scale ~ (1|group)) 
rstanarm_family("t_family", scale ~ (1|group)) 
rstanarm_family("gamma", shape ~ (1|group))  
rstanarm_family("weibull", shape ~ (1|group))

etcetera.

jgabry avatar Feb 05 '16 18:02 jgabry

Just checking back--no updates on this, are there? I can of course just code it myself for my model in Stan, but I'm curious whether there are papers on this type of model and I'm not sure what to google (and eventually, cite for a paper). Do models with sigma ~ (1 | group) have a name?

flaxter avatar Jul 08 '18 13:07 flaxter

They are so called distributional models or models of "location, scale, and shape". You may fit them with brms which also uses Stan at the backend (see https://cran.r-project.org/web/packages/brms/vignettes/brms_distreg.html for examples).

paul-buerkner avatar Jul 08 '18 13:07 paul-buerkner

Has this feature been implemented in rstanarm yet?

Jaapro avatar Apr 09 '20 13:04 Jaapro