UncertainData.jl

Adopt a common resampling interface?

Open kahaaga opened this issue 6 years ago • 1 comments

If a common resampling interface for the stats ecosystem is implemented (suggested in #24), the resampling methods in UncertainData should accept a point estimator as their first argument. The remaining argument ordering should be similar to Julia Base, following the Julia style guide. Putting the function argument first allows the use of do blocks.

Thus, resample(estimator::Function, uval::AbstractUncertainValue, method::AbstractResampling) would be the go-to syntax when wanting to compute a resampled statistic from a point estimator over any uncertain value.

Having the method argument last makes it possible to set a default resampling method. Currently, the resampling for the implemented types is just random resampling with replacement (the equivalent of BasicSampling in Bootstrap.jl). We're now using the n argument to specify the number of draws, but this could be replaced by a BasicSampling(n) instance instead.

So we'd define a constant const default_n = 10000, and define

  • resample(f::Function, uval, method = BasicSampling(default_n)), so you could call resample(mean, uval) if you want the resampled mean using the default values, and resample(mean, uval, BasicSampling(n)) if you want to resample n times. Other bootstrapping approaches could be implemented for different types of uncertain values using multiple dispatch.
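To make the proposed signature concrete, here is a minimal sketch of how it might look. Everything in it is a stand-in for illustration: `ToyUncertainValue` is a hypothetical type, and `AbstractUncertainValue`, `AbstractResampling`, and `BasicSampling` are redefined locally rather than taken from UncertainData.jl or Bootstrap.jl, so this is not the actual implementation in either package.

```julia
using Statistics

# Hypothetical stand-ins; the real types live in UncertainData.jl and Bootstrap.jl.
abstract type AbstractUncertainValue end
abstract type AbstractResampling end

# Toy uncertain value: a finite set of equally likely outcomes (assumption).
struct ToyUncertainValue <: AbstractUncertainValue
    values::Vector{Float64}
end

# Local type playing the role of Bootstrap.jl's BasicSampling: it only
# carries the number of draws.
struct BasicSampling <: AbstractResampling
    n::Int
end

const default_n = 10_000

# Estimator first, so do-block syntax works; method last, so it can default.
function resample(f::Function, uval::ToyUncertainValue,
                  method::BasicSampling = BasicSampling(default_n))
    draws = rand(uval.values, method.n)  # random draws with replacement
    return f(draws)
end

uval = ToyUncertainValue([1.0, 2.0, 3.0, 4.0])

m_default = resample(mean, uval)                     # default number of draws
m_small = resample(mean, uval, BasicSampling(100))   # explicit number of draws

# do-block form: the anonymous function becomes the first argument.
spread = resample(uval, BasicSampling(100)) do draws
    maximum(draws) - minimum(draws)
end
```

Other resampling schemes could then be added as further methods of `resample` that dispatch on the type of `method`, without changing any call site that relies on the default.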

I'm not sure what the syntax would be for uncertain datasets. Let's see what happens with the common interface in #24 first.

kahaaga · Jan 25 '19 05:01

If the interface requires that the return type is a subtype of AbstractSampling, then we should implement a simple return type that holds an array of resampled values and supports regular array indexing. This way, mathematical operations would still be supported without any further changes.
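One way such a return type could look is sketched below. The `AbstractSampling` supertype is redefined locally as a placeholder for whatever the common interface ends up requiring, and `ResampledValues` is a hypothetical name. Because a Julia type can have only one supertype, the wrapper cannot also subtype `AbstractVector`, so indexing and iteration are forwarded to the wrapped array by hand.

```julia
using Statistics

# Hypothetical supertype standing in for the common interface's requirement.
abstract type AbstractSampling end

# Simple container for the resampled values (name is an assumption).
struct ResampledValues{T} <: AbstractSampling
    values::Vector{T}
end

# Forward indexing and iteration to the wrapped vector.
Base.getindex(r::ResampledValues, i...) = getindex(r.values, i...)
Base.length(r::ResampledValues) = length(r.values)
Base.firstindex(r::ResampledValues) = firstindex(r.values)
Base.lastindex(r::ResampledValues) = lastindex(r.values)
Base.iterate(r::ResampledValues, state...) = iterate(r.values, state...)
Base.eltype(::Type{ResampledValues{T}}) where {T} = T

r = ResampledValues([2.0, 4.0, 6.0])

first_value = r[1]        # scalar indexing
last_two = r[end-1:end]   # range indexing, using lastindex
total = sum(r)            # reductions work through iteration
average = mean(r)
```

With only these methods, reductions such as `sum` and `mean` work directly on the wrapper. Element-wise arithmetic and broadcasting would likely need further forwarding, or could simply operate on the `values` field.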

kahaaga · Jan 25 '19 05:01