arkouda
Random number generation for non-uniform distributions
Users have asked for random number generation from non-uniform distributions (the normal distribution in particular). We should also consider adding other common distributions (power law, binomial, Poisson, etc.).
Right now we use `fillRandom` in Chapel, which doesn't seem to support specifying a distribution. I know there are ways to convert uniform samples into other distributions, and Bill did a power law conversion for the sort benchmarks. I'd rather not get too custom with it, though, since random number generation is very easy to mess up.
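To make "converting uniform into other distributions" concrete, here is a minimal Python sketch of two textbook conversions: the inverse-CDF transform for a power law and the Box-Muller transform for the normal distribution. The function names, the `alpha` and `x_min` parameters, and the seed are hypothetical choices for illustration. This is not the conversion used in the sort benchmarks, and Python's `random.Random` only stands in for a uniform fill.

```python
import math
import random


def power_law_from_uniform(u, alpha, x_min=1.0):
    """Map u in [0, 1) to a power law sample via the inverse CDF.

    The density is proportional to x**(-alpha) for x >= x_min, alpha > 1.
    """
    # CDF: F(x) = 1 - (x / x_min)**(1 - alpha); solve F(x) = u for x.
    return x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def normal_pair_from_uniform(u1, u2):
    """Box-Muller transform: two uniforms in, two independent N(0, 1) out."""
    # Use 1 - u1 so the argument to log lies in (0, 1] and is never zero.
    r = math.sqrt(-2.0 * math.log(1.0 - u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


# Hypothetical stand-in for a uniform fill of an array.
rng = random.Random(1234)
uniforms = [rng.random() for _ in range(100_000)]

power_samples = [power_law_from_uniform(u, alpha=2.5) for u in uniforms]

normal_samples = []
for u1, u2 in zip(uniforms[0::2], uniforms[1::2]):
    normal_samples.extend(normal_pair_from_uniform(u1, u2))

# Quick sanity checks on the first two moments of the normal samples.
mean = sum(normal_samples) / len(normal_samples)
var = sum((x - mean) ** 2 for x in normal_samples) / len(normal_samples)
```

Even these two simple transforms need care at the edges (the `1 - u` guards above), which is part of why following an established implementation looks safer than rolling our own.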
I think we need to dive into how NumPy does randomness and align ourselves with it as much as possible.
- [x] #2993
- [x] #3008
- [x] #3017
- [x] #3066
- [x] #3167
- [x] #3183
- [x] #3372
- [x] #3373
- [ ] #3374
- [x] #3245
- [ ] #3846
Here is the full list of functions supported by NumPy's generators that have yet to be implemented, categorized by how important they seem to me (i.e., do I recognize it from the single stats class I've taken?).
remaining np random methods
Prioritized:
- numpy.random.Generator.beta
- numpy.random.Generator.binomial
- numpy.random.Generator.chisquare
- numpy.random.Generator.laplace
- numpy.random.Generator.geometric
- numpy.random.Generator.logseries
- numpy.random.Generator.multinomial
- numpy.random.Generator.power
Not prioritized at the moment:
- numpy.random.Generator.permuted
- numpy.random.Generator.dirichlet
- numpy.random.Generator.f
- numpy.random.Generator.bit_generator
- numpy.random.Generator.spawn
- numpy.random.Generator.bytes
- numpy.random.Generator.gumbel
- numpy.random.Generator.hypergeometric
- numpy.random.Generator.multivariate_hypergeometric
- numpy.random.Generator.negative_binomial
- numpy.random.Generator.noncentral_chisquare
- numpy.random.Generator.noncentral_f
- numpy.random.Generator.pareto
- numpy.random.Generator.rayleigh
- numpy.random.Generator.standard_cauchy
- numpy.random.Generator.standard_t
- numpy.random.Generator.triangular
- numpy.random.Generator.vonmises
- numpy.random.Generator.wald
- numpy.random.Generator.weibull
- numpy.random.Generator.zipf
I found the C code where NumPy defines its distributions. It uses PCG as its random number generator, same as Chapel. So it feels like we could do something similar, but there's more digging to be done to tell how well this ports to Chapel.
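For reference, a small sketch of the NumPy side of this: the default `Generator` sits on top of a `PCG64` bit generator, and the distribution methods are layered on that. The seed and sizes here are arbitrary. This only illustrates NumPy's public API and says nothing about whether a Chapel port would reproduce the same streams.

```python
import numpy as np

seed = 2024  # arbitrary seed chosen for this illustration

# default_rng wraps a PCG64 bit generator in a Generator.
rng = np.random.default_rng(seed)
bit_generator_name = type(rng.bit_generator).__name__

# Building the same Generator/bit generator pair explicitly.
explicit = np.random.Generator(np.random.PCG64(seed))

# Both generators are seeded identically, so they produce the same draws.
a = rng.normal(loc=0.0, scale=1.0, size=5)
b = explicit.normal(loc=0.0, scale=1.0, size=5)
same_stream = bool(np.array_equal(a, b))

# Other distributions are methods on the same Generator object.
counts = rng.poisson(lam=3.0, size=5)
```

The split between the bit generator (uniform bits) and the `Generator` (distributions built from those bits) is the structure we would presumably mirror on top of Chapel's PCG stream.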