TuringGLM.jl
use intercept traits w/ `apply_schema` to handle implicit intercept
StatsModels provides the notion of "intercept traits" https://github.com/JuliaStats/StatsModels.jl/blob/master/src/traits.jl to control the behavior of the automatic intercept adding that normally happens in GLMs. This package is a great example of a model type with an ~~"implicit intercept"~~ "dropped intercept", since everything is centered (AFAICT) so the intercept is always zero and shouldn't be included.
I think what you'd want to do is to define a model type (if there isn't one already) and add methods for the appropriate types.
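As a rough sketch of what that could look like (assuming StatsModels is loaded; `CenteredModel` is a hypothetical placeholder type, not something TuringGLM.jl defines, and `toy` is made-up data), you would subtype `StatisticalModel`, opt in to the `drop_intercept` trait, and pass the type as the context argument of `apply_schema`:

```julia
using StatsModels

# Hypothetical model type, for illustration only (not part of TuringGLM.jl).
struct CenteredModel <: StatsModels.StatisticalModel end

# Opt in to the "dropped intercept" trait: no intercept column is generated,
# but categorical terms are still coded as if an intercept were present.
StatsModels.drop_intercept(::Type{CenteredModel}) = true

# Made-up toy data: x is categorical with 3 unique levels.
toy = (y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
       x = ["a", "b", "c", "a", "b", "c"])

f = @formula(y ~ x)
applied = apply_schema(f, schema(f, toy), CenteredModel)

# Response vector and model matrix; X has two contrast columns for x
# and no intercept column.
_, X = modelcols(applied, toy)
cols = coefnames(applied.rhs)
```

With this trait set, a formula that asks for an intercept explicitly (e.g. `y ~ 1 + x`) should be rejected by `apply_schema` rather than silently adjusted.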
I am trying to have the least middleware possible between `@formula` and `Turing.jl`.
What would be the added benefits?
There's some tricky business with how the model matrix is constructed given categorical variables and interactions/intercepts. If you have something like `y ~ 1 + x` where `x` has, say, 3 unique levels, usually two columns will be generated for `x` and one for the intercept. If you do `y ~ x`, an intercept is usually considered to be present implicitly, so you get the same thing. But for `y ~ 0 + x`, the intercept is suppressed and `x` is "promoted" to full rank, so there will be three columns generated for `x` (full dummy coding).
Same thing happens with interactions and main effects.
Anyway, it's rare this comes up in user code but it does happen sometimes.
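To make the three cases concrete, here is a minimal sketch using StatsModels directly. The data and the helper `rhs_names` are made up for illustration; passing `StatisticalModel` as the context is what turns on the implicit-intercept behaviour:

```julia
using StatsModels

# Made-up toy data: x is categorical with 3 unique levels.
toy = (y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
       x = ["a", "b", "c", "a", "b", "c"])

# Hypothetical helper: apply the schema in a StatisticalModel context
# and return the names of the generated model-matrix columns.
function rhs_names(f)
    applied = apply_schema(f, schema(f, toy), StatsModels.StatisticalModel)
    return coefnames(applied.rhs)
end

explicit   = rhs_names(@formula(y ~ 1 + x))  # intercept + 2 columns for x
implicit   = rhs_names(@formula(y ~ x))      # same columns as above
suppressed = rhs_names(@formula(y ~ 0 + x))  # no intercept, 3 columns for x
```

All three matrices end up with three columns, but in the last one every level of `x` gets its own indicator column instead of being contrasted against a base level.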
In principle, you have been able to dispatch on DPPL models for a couple of months now (the purpose of which was exactly to be able to define traits):
```julia
julia> @model function m1()
           d ~ filldist(DiscreteUniform(1, 3), 3)
           return d
       end
m1 (generic function with 2 methods)

julia> trait(::Model{typeof(m1)}) = true
trait (generic function with 1 method)

julia> trait(m1())
true
```
You'd only need to move the closures out of `turing_model` to get access to the function.
We currently don't implement/extend the `StatisticalModel` type from `StatsModels.jl`.
So I think that the traits are to be applied to those types.
I am closing this; we might revisit it in the future if we need to use `StatisticalModel`.