scikit-stan

Use `_glm` functions when applicable.

Open WardBrian opened this issue 2 years ago • 4 comments

For example, `normal_id_glm` or `bernoulli_logit_glm`. These are much faster because they reduce the amount of autodiff work.

We may need to refactor how our models are structured to make the best use of these. Also, these functions assume a separate intercept, so we will need to avoid passing an X with a column of all ones, or disable them when fit_intercept=False is used.

WardBrian avatar Sep 27 '22 16:09 WardBrian

The column-of-1s trick shortens the notation, but it's more efficient to skip the multiplication by 1 and just use `alpha + x * beta`. A separate intercept also lets us put a broader prior on alpha than on the beta[k], which is important if the data is offset from 0 (literally 0 in linear regression, which corresponds to a probability of 0.5 in logistic regression, etc.).
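To make the equivalence concrete, here is a minimal NumPy sketch (not scikit-stan code; the names `X_aug` and `coef_aug` are made up for illustration). It shows that the separate-intercept form and the column-of-1s form give the same linear predictor, so the choice affects efficiency and priors rather than the model itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3
X = rng.normal(size=(n, k))   # predictors without an intercept column
alpha = 2.0                   # separate intercept
beta = rng.normal(size=k)     # coefficients

# Form 1: explicit intercept, no multiplication by 1.
eta_separate = alpha + X @ beta

# Form 2: prepend a column of ones and fold alpha into the coefficients.
X_aug = np.column_stack([np.ones(n), X])
coef_aug = np.concatenate([[alpha], beta])
eta_aug = X_aug @ coef_aug

# Both forms produce the same linear predictor.
same = np.allclose(eta_separate, eta_aug)
print(same)
```

In the first form alpha is its own parameter, so it can be given its own (broader) prior; in the second it is just the first entry of the coefficient vector.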

You usually want a broader prior on the intercept than on coefficients, too, because it soaks up all the excess from all the other effects.

Is the fit_intercept = False thing from scikit-learn? Unless your y variables are standardized, you usually want an intercept, so I would strongly discourage people from using this option in the docs.

bob-carpenter avatar Sep 29 '22 11:09 bob-carpenter

fit_intercept=False is a feature in some of the other scikit-learn estimators. The recommended usage is not to drop the intercept entirely, but rather to handle cases where your design matrix already includes the all-1s column. In particular, most of the Python libraries that support the Wilkinson formula syntax produce X with a column of 1s by default.
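As a rough illustration of that use case, the sketch below uses plain NumPy least squares rather than any scikit-learn or scikit-stan estimator. The design matrix `X_formula` is a stand-in for what a formula library might return, and all names and data here are invented for the example. With the all-1s column present, fitting without a separate intercept still recovers the intercept as the first coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=(n, 2))
true_alpha = 3.0
true_beta = np.array([1.5, -0.5])
y = true_alpha + x @ true_beta + 0.1 * rng.normal(size=n)

# Stand-in for a formula-generated design matrix: the first column is all ones.
X_formula = np.column_stack([np.ones(n), x])

# Fit with no separate intercept term, analogous to fit_intercept=False.
coef, *_ = np.linalg.lstsq(X_formula, y, rcond=None)
intercept_hat = coef[0]   # the ones column absorbs the intercept
slopes_hat = coef[1:]

print(intercept_hat, slopes_hat)
```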

WardBrian avatar Sep 29 '22 12:09 WardBrian

> recommended usage is not to actually not fit an intercept, but rather for instances where your design matrix includes the all-1s column.

That makes more sense. I had to look up that Wilkinson formula syntax is what lme4 and brms use. Wilkinson was also behind the grammar of graphics which is the basis for ggplot2.

If the column of 1s is going to be forced on you, do you at least know which column it is so that you can define separate priors for intercepts?

bob-carpenter avatar Sep 29 '22 12:09 bob-carpenter

I think the convention is that the column of 1s comes first. When I talked to @jgabry, he mentioned that rstanarm checks for this and removes that first column, preferring to use the true intercept parameter.
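A check along those lines could look like the sketch below. This is only an illustration of the idea, not rstanarm's or scikit-stan's actual code, and `split_intercept_column` is a hypothetical helper name. It assumes the convention that an intercept column, if present, is the first column.

```python
import numpy as np

def split_intercept_column(X):
    """Drop a leading all-ones column, if present (hypothetical helper).

    Returns (X_without_intercept, had_intercept_column).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-D design matrix")
    # Only the first column is inspected, following the stated convention.
    has_ones = (
        X.shape[0] > 0
        and X.shape[1] > 0
        and bool(np.all(X[:, 0] == 1.0))
    )
    if has_ones:
        return X[:, 1:], True
    return X, False

X_with = np.array([[1.0, 2.0, 3.0],
                   [1.0, 4.0, 5.0]])
X_trimmed, found = split_intercept_column(X_with)
print(X_trimmed.shape, found)
```

After the split, the trimmed matrix can be paired with a dedicated intercept parameter, which is the shape of input the `_glm` functions expect.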

In the long term, I'd like to make a package using a formula implementation like formulae and scikit-stan as a "backend" to make something which looks very much like rstanarm. In that package, I am planning on doing that same chop, but for the lower-level interface we wanted to allow the customization if necessary.

WardBrian avatar Sep 29 '22 13:09 WardBrian