UncertainData.jl
Adopt a common resampling interface?
If a common resampling interface for the stats ecosystem is implemented (suggested in #24), the resampling methods in UncertainData should take a point estimator as their first argument. The remaining argument order should be similar to Julia base, following the Julia style guide. Putting the function first allows the use of do blocks.
Thus, resample(estimator::Function, uval::AbstractUncertainValue, method::AbstractResampling) would be the go-to syntax for computing a resampled statistic from a point estimator over any uncertain value.
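A minimal sketch of the function-first signature and the do-block form it enables. The types UncertainPopulation and RandomDraws are hypothetical stand-ins defined here for illustration; they are not the actual UncertainData.jl or Bootstrap.jl types, and the uncertain value is assumed to be a finite set of equally likely values.

```julia
using Statistics

# Hypothetical stand-ins for illustration only.
abstract type AbstractUncertainValue end
abstract type AbstractResampling end

# Assumption: an uncertain value given by a finite set of equally likely values.
struct UncertainPopulation <: AbstractUncertainValue
    values::Vector{Float64}
end

# Assumption: a resampling method that only stores the number of draws.
struct RandomDraws <: AbstractResampling
    n::Int
end

# Estimator first, then the uncertain value, then the method.
function resample(estimator::Function, uval::UncertainPopulation, method::RandomDraws)
    draws = rand(uval.values, method.n)  # random draws with replacement
    return estimator(draws)
end

uval = UncertainPopulation([1.0, 2.0, 3.0, 4.0])

# Pass a named estimator directly.
m = resample(mean, uval, RandomDraws(1000))

# Or supply an anonymous estimator with a do block; it becomes the first argument.
r = resample(uval, RandomDraws(1000)) do draws
    maximum(draws) - minimum(draws)
end
```

The do block only works because the function argument comes first, which is the main reason for this argument order.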
Having the method argument last makes it possible to set a default resampling method. Currently, the resampling for the implemented types is just random resampling with replacement (the equivalent of BasicSampling in Bootstrap.jl). We're now using the n argument to specify the number of draws, but this could be replaced by an instance of BasicSampling(n) instead.
So we'd define a constant, const default_n = 10000, and define resample(f::Function, uval, method = BasicSampling(default_n)), so you could call resample(mean, uval) if you want the resampled mean using the default values, and resample(mean, uval, BasicSampling(n)) if you want to resample n times. Other bootstrapping approaches could be implemented for different types of uncertain values using multiple dispatch.
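The default-method idea could look roughly like the sketch below. BasicSampling here is a local stand-in struct that only mirrors the name used above; it is not Bootstrap.jl's type. ToyUncertainValue is likewise a hypothetical placeholder for a real uncertain value type.

```julia
using Statistics

# Local stand-in that mirrors the name used in the text; NOT Bootstrap.jl's type.
struct BasicSampling
    n::Int
end

const default_n = 10_000

# Hypothetical uncertain value: a finite set of equally likely values.
struct ToyUncertainValue
    values::Vector{Float64}
end

# The method argument comes last, so it can carry a default.
function resample(f::Function, uval::ToyUncertainValue,
                  method::BasicSampling = BasicSampling(default_n))
    draws = rand(uval.values, method.n)  # random draws with replacement
    return f(draws)
end

uval = ToyUncertainValue([0.9, 1.0, 1.1])

a = resample(mean, uval)                     # uses default_n draws
b = resample(mean, uval, BasicSampling(50))  # uses 50 draws
```

Other resampling schemes would then be added as further methods that dispatch on a different method type or a different uncertain value type.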
I'm not sure what the syntax would be for uncertain datasets. Let's see what happens with the common interface in #24 first.
If the interface requires that the return type is a subtype of AbstractSampling, then we should implement a simple return type holding an array of resampled values and supporting regular array indexing. This way, mathematical operations would still be supported without any changes.
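One way such a return type might look is sketched below. ResampledValues is a hypothetical name. The sketch subtypes AbstractVector to get indexing, iteration and broadcasting for free, which is an assumption: if the common interface requires a different abstract supertype, the same size and getindex methods would have to be defined on that type instead, and the array behaviour would not come automatically.

```julia
using Statistics

# Hypothetical result type wrapping the resampled values.
struct ResampledValues{T} <: AbstractVector{T}
    values::Vector{T}
end

# Minimal read-only array interface: size and linear indexing.
Base.size(r::ResampledValues) = size(r.values)
Base.getindex(r::ResampledValues, i::Int) = r.values[i]
Base.IndexStyle(::Type{<:ResampledValues}) = IndexLinear()

r = ResampledValues([1.0, 2.0, 3.0])

first_value = r[1]   # regular indexing
tail = r[2:3]        # range indexing falls back on getindex(r, i::Int)
doubled = 2 .* r     # broadcasting works through the array interface
avg = mean(r)        # so do functions written for AbstractArray
```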