StatsModels.jl
FormulaTerm type parameters
The FormulaTerm type is parametrized by the type of its fields. Is this really needed?
Because of this, a function such as fit(x::MyModel, f::FormulaTerm) is recompiled for every distinct combination of term types on the LHS and RHS. I think this may lead to too much compilation.
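To illustrate the concern, here is a minimal sketch using hypothetical stand-in types (ToyTerm and ToyFormula are made up for this example and are not the StatsModels.jl definitions). It assumes, as described above, that the formula type is parametrized by the types of its fields:

```julia
# Hypothetical stand-ins for illustration only; not the StatsModels.jl types.
struct ToyTerm
    sym::Symbol
end

# Parametrized by the types of its fields, like the FormulaTerm described above.
struct ToyFormula{L,R}
    lhs::L
    rhs::R
end

f1 = ToyFormula(ToyTerm(:y), (ToyTerm(:a),))
f2 = ToyFormula(ToyTerm(:y), (ToyTerm(:a), ToyTerm(:b)))

# The RHS tuple type is part of the formula type, so the two formulas have
# different concrete types and a method taking ::ToyFormula is compiled
# separately for each of them.
println(typeof(f1))
println(typeof(f2))
println(typeof(f1) == typeof(f2))
```

With thousands of terms on the RHS, the tuple type parameter grows accordingly.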
I don't know whether it is needed. I do agree that it can become problematic: for example, I know of people using StatsModels for large problems with, say, 3000 variables, which makes for huge types.
Short term, if that is a concern for you, you could change the definition of
fit(x::MyModel, f::FormulaTerm) to fit(x::MyModel, @nospecialize(f::FormulaTerm)).
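A minimal sketch of that workaround, again with hypothetical names (ToyModel, ToyFormula, and toyfit are invented for this example; only @nospecialize is a real Julia macro):

```julia
# Hypothetical model and formula types for illustration only.
struct ToyModel end

struct ToyFormula{L,R}
    lhs::L
    rhs::R
end

# @nospecialize hints to the compiler that it should not generate a separate
# specialization of this method for each concrete type of `f`.
function toyfit(m::ToyModel, @nospecialize(f::ToyFormula))
    return length(f.rhs)  # placeholder body: number of RHS terms
end

n1 = toyfit(ToyModel(), ToyFormula(:y, (:a,)))
n2 = toyfit(ToyModel(), ToyFormula(:y, (:a, :b)))
println((n1, n2))
```

Note that @nospecialize is only a hint, and the body then works with less type information about f, so this trades some runtime performance for less compilation.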
Thanks for the tip; this is what I do for now. Still, it may be better to simply remove these type parameters.
That's a good point. I'd originally put these type parameters on all the terms so that it would be possible to use @generated functions to produce very efficient methods for single-row modelcols, but I've not gotten around to that and it's probably a premature optimization. I'm also now not sure that @generated functions are really the way to go for that, and moreover even if they are I think all you'd need is the "width" of each term...