Randomisations - add subsampling option

Open GoogleCodeExporter opened this issue 9 years ago • 2 comments

A useful randomisation, which is essentially a cross-validation approach, is to 
randomly delete some subset of labels from the cloned basedata, and then assess 
how stable the analysis results are.
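A minimal sketch of the idea, not Biodiverse code: it assumes a basedata can be stood in for by a plain dict mapping group names to `{label: count}` dicts. The names `subsample_labels`, `richness` and `keep_fraction` are hypothetical and used only for illustration.

```python
import random


def subsample_labels(basedata, keep_fraction, rng):
    """Return a new basedata-like dict with a random subset of labels deleted.

    `basedata` maps group name -> {label: count}; the input is not modified.
    """
    labels = sorted({label for contents in basedata.values() for label in contents})
    n_keep = round(len(labels) * keep_fraction)
    kept = set(rng.sample(labels, n_keep))
    return {
        group: {label: n for label, n in contents.items() if label in kept}
        for group, contents in basedata.items()
    }


def richness(basedata):
    """Toy analysis: number of labels present in each group."""
    return {group: len(contents) for group, contents in basedata.items()}


basedata = {
    "g1": {"a": 2, "b": 1, "c": 4},
    "g2": {"b": 3, "d": 1},
    "g3": {"a": 1, "c": 1, "d": 2, "e": 5},
}

rng = random.Random(42)
subsampled = subsample_labels(basedata, keep_fraction=0.6, rng=rng)

# Compare the analysis on the full and subsampled data to judge stability.
full_result = richness(basedata)
sub_result = richness(subsampled)
```

Repeating the subsample with different seeds and comparing `sub_result` against `full_result` each time would give a rough picture of how sensitive the analysis is to the labels that were dropped.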


Original issue reported on code.google.com by shawnlaffan on 1 Mar 2011 at 1:38

GoogleCodeExporter, Mar 27 '15 22:03

Thanks, Shawn!

Original comment by [email protected] on 1 Mar 2011 at 3:05

GoogleCodeExporter, Mar 27 '15 22:03

This could be done ~~using a multinomial sampler approach, probably called~~ in _get_randomised_basedata to generate a new basedata to pass on to the randomisation function. That way we can subsample and then apply shuffling if needed.

An optimisation for rand_nochange is to check if we are using the subsampled copy and return it instead of cloning another copy.
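The flow described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the Biodiverse implementation: basedata is modelled as a plain dict of group name to `{label: count}`, the Python functions only borrow the names `_get_randomised_basedata` and `rand_nochange` from the comment, and `subsample_labels`, `rand_shuffle_groups`, `keep_fraction` and `is_subsampled_copy` are hypothetical.

```python
import copy
import random


def subsample_labels(basedata, keep_fraction, rng):
    """Return a new dict with a random subset of labels deleted."""
    labels = sorted({label for contents in basedata.values() for label in contents})
    kept = set(rng.sample(labels, round(len(labels) * keep_fraction)))
    return {
        group: {label: n for label, n in contents.items() if label in kept}
        for group, contents in basedata.items()
    }


def rand_nochange(basedata, is_subsampled_copy=False):
    """No-change randomisation: hand back an independent, unmodified basedata.

    If the input is already a fresh subsampled copy, return it directly
    rather than cloning a second time.
    """
    if is_subsampled_copy:
        return basedata
    return copy.deepcopy(basedata)


def rand_shuffle_groups(basedata, rng):
    """Toy shuffle: reassign each group's contents to a randomly chosen group."""
    groups = list(basedata)
    sources = groups[:]
    rng.shuffle(sources)
    return {group: dict(basedata[source]) for group, source in zip(groups, sources)}


def get_randomised_basedata(basedata, function, keep_fraction=None, seed=None):
    """Subsample first (if requested), then apply the chosen randomisation."""
    rng = random.Random(seed)
    subsampled = keep_fraction is not None
    working = subsample_labels(basedata, keep_fraction, rng) if subsampled else basedata
    if function == "nochange":
        return rand_nochange(working, is_subsampled_copy=subsampled)
    if function == "shuffle":
        return rand_shuffle_groups(working, rng)
    raise ValueError(f"unknown randomisation function: {function}")


basedata = {
    "g1": {"a": 2, "b": 1},
    "g2": {"b": 3, "c": 1},
    "g3": {"a": 1, "d": 2},
}

subsampled_only = get_randomised_basedata(basedata, "nochange", keep_fraction=0.5, seed=1)
plain_clone = get_randomised_basedata(basedata, "nochange")
shuffled = get_randomised_basedata(basedata, "shuffle", seed=1)
```

Doing the subsampling in one place before dispatching means every randomisation function can be combined with it, and the no-change case avoids a redundant clone because the subsampled dict is already independent of the original.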

shawnlaffan, May 13 '19 06:05