StatsModels.jl

FormulaTerm type parameters

Open matthieugomez opened this issue 6 years ago • 3 comments
The FormulaTerm type is parametrized by the types of its fields. Is this really needed? Because of this, a function such as fit(x::MyModel, f::FormulaTerm) is recompiled for each distinct combination of term types on the LHS or RHS. I think this may lead to too much compilation.
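To make the concern concrete, here is a minimal sketch. The Term and FormulaTerm structs below are simplified, hypothetical stand-ins written for this example, not the StatsModels.jl definitions. They only show how a struct parametrized by its field types produces a new concrete type whenever the right-hand side changes.

```julia
# Hypothetical stand-ins for illustration; NOT the StatsModels.jl definitions.
struct Term
    sym::Symbol
end

# Parametrized by the types of its fields, as described above.
struct FormulaTerm{L,R}
    lhs::L
    rhs::R
end

# Two formulas that differ only in the number of right-hand-side terms.
f1 = FormulaTerm(Term(:y), (Term(:a),))
f2 = FormulaTerm(Term(:y), (Term(:a), Term(:b)))

# The tuple type of rhs is part of the FormulaTerm type, so f1 and f2
# have different concrete types.
println(typeof(f1))
println(typeof(f2))
println(typeof(f1) == typeof(f2))  # false
```

Because Julia specializes methods on the concrete types of their arguments by default, a method with the signature fit(x::MyModel, f::FormulaTerm) would be compiled once for f1 and again for f2.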

matthieugomez, Nov 10 '19 22:11

I don't know whether it is needed. I do agree that it can become problematic. E.g., I know of people using StatsModels for large problems with, say, 3000 variables, which makes for huge types.

Short-term: you could change the definition of fit(x::MyModel, f::FormulaTerm) to fit(x::MyModel, @nospecialize f::FormulaTerm) if that is a concern for you.
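A minimal sketch of that workaround follows. MyModel, the simplified FormulaTerm, and the placeholder body of fit are all assumptions made up for this example. Only the @nospecialize macro is real Julia (Base).

```julia
# Hypothetical stand-ins for illustration; NOT the StatsModels.jl definitions.
struct FormulaTerm{L,R}
    lhs::L
    rhs::R
end

struct MyModel end

# @nospecialize hints to the compiler that it should not generate a separate
# specialization of this method for every concrete FormulaTerm{L,R}.
function fit(x::MyModel, @nospecialize(f::FormulaTerm))
    # Placeholder body: just count the right-hand-side terms.
    return length(f.rhs)
end

# Both calls go through the same method despite different formula types.
n1 = fit(MyModel(), FormulaTerm(:y, (:a,)))
n2 = fit(MyModel(), FormulaTerm(:y, (:a, :b, :c)))
println((n1, n2))
```

The likely trade-off is that the work done on f inside the method is less optimized, so this fits best on an outer entry point such as fit rather than on an inner loop.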

oxinabox, Nov 11 '19 11:11

Thanks for the tip — this is what I do for now. Still, it may be better to simply remove these type parameters.

matthieugomez, Nov 11 '19 11:11

That's a good point. I'd originally put these type parameters on all the terms so that it would be possible to use @generated functions to produce very efficient methods for single-row modelcols, but I haven't gotten around to that and it's probably a premature optimization. I'm also no longer sure that @generated functions are really the way to go for that, and even if they are, I think all you'd need is the "width" of each term...

kleinschmidt, Nov 11 '19 14:11