StatsModels.jl
remove mean/var/min/max
solves https://github.com/JuliaStats/StatsModels.jl/issues/222
Seems potentially quite breaking, as I imagine that some packages rely on being able to access the precomputed summary statistics for continuous terms.
Right, it should go in the 0.7 version.
That being said, do you have any example of an external package relying on these summary stats?
Nope. GLM or MixedModels might but I don't know for certain, which is why I said "potentially." Perhaps nothing is using it and it would be safe to remove, but we should just be aware of the breakage potential.
I have some internal code that uses this. It might not be strictly necessary, but ideally I'd like to make sure there's some kind of mechanism for custom term types to request that summary stats be extracted from the data table at schema creation time. That's currently only possible via the hints Dict, which is fine for cases where you've manually specified the special handling but wouldn't work for things like splines implemented as functions in the formula. (Those don't work very well with the current system either, since ideally you need more than just mean/var/min/max for them.)
It's good to know that the schema is a performance bottleneck though.
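For reference, a minimal sketch of the two mechanisms mentioned above, assuming the StatsModels 0.6 behavior that this PR proposes to change: summary stats stored on `ContinuousTerm`, and the hints Dict overriding the default term type. The toy data and its column names are made up for illustration.

```julia
using StatsModels

# Toy column table (a NamedTuple of vectors); column names are hypothetical.
data = (x = [1.0, 2.0, 3.0], g = ["a", "b", "a"])

# Without a hint, a numeric column becomes a ContinuousTerm. In 0.6 the
# summary statistics are computed here, at schema creation time.
tx = concrete_term(term(:x), data)
println((tx.mean, tx.var, tx.min, tx.max))

# A hint overrides the default for a named column: treat :x as categorical.
hints = Dict(:x => CategoricalTerm)
tx_cat = concrete_term(term(:x), data, hints)
println(tx_cat isa CategoricalTerm)
```

Note that the hint is keyed by column name, so it cannot reach a term that only appears as a function call inside a formula, which is the limitation described above.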
Another strategy, which is already used in #183, is to replace `ContinuousTerm` in some cases with plain `Term`s, which in that PR simply pass the underlying values through unchanged. In the vast majority of cases this means that `ContinuousTerm` can just be eliminated.