HypothesisTests.jl
Interface for DataFrames?
I was wondering if there is any interest in having a way to perform hypothesis tests directly on DataFrames. Recently I found myself wanting this for the Kruskal-Wallis test, and I put together this:
using DataFrames
import HypothesisTests: KruskalWallisTest  # import so a new method can be added

function KruskalWallisTest(df::DataFrame, values::Symbol, group::Symbol)
    # Distinct group labels, in sorted order
    ugroups = sort!(unique(df[!, group]))
    # One vector of observations per group
    groups = [df[df[!, group] .== g, values] for g in ugroups]
    # Delegate to the existing vector-based method
    KruskalWallisTest(groups...)
end
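As a rough usage sketch, the grouping step and the resulting call would look something like this on a recent Julia and DataFrames release. The data frame, its column names `score` and `treatment`, and the values are all made up for illustration:

```julia
using DataFrames, HypothesisTests

# Hypothetical data: column names and values are invented for this example
df = DataFrame(score = [2.1, 3.4, 1.9, 5.6, 6.1, 5.8, 9.0, 8.7, 9.4],
               treatment = ["a", "a", "a", "b", "b", "b", "c", "c", "c"])

# Split the value column by group label, as the method above does internally
labels = sort!(unique(df.treatment))
groups = [df.score[df.treatment .== g] for g in labels]

# Same call the DataFrame method ends up making: one vector per group
result = KruskalWallisTest(groups...)
println(pvalue(result))
```

With the method above defined, `KruskalWallisTest(df, :score, :treatment)` should build the same groups and give the same result.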
It seems to me this is something that would be useful to have, but perhaps it should go into its own package so that we don't have to introduce DataFrames as a dependency?
A lot of packages could benefit from integration with DataFrames, but I don't think we want to add DataFrames as a dependency everywhere. Optional dependencies, whenever they become available, will likely be the best bet.