Implement numpy.random.choice

Open Plankton555 opened this issue 6 years ago • 9 comments

Generates a random sample from a given (possibly weighted) 1-D array.

NumPy docs: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html

Plankton555 avatar Jun 27 '19 09:06 Plankton555

I needed this and implemented a subset of the functionality (random sampling based on weighted probabilities). I can clean up that code and upload it here so that it can maybe serve as a starting point.

I've never contributed to an open-source project before though, so I might need some help when it comes to what must be implemented, and where in the architecture it should be located, and so on.

Plankton555 avatar Jun 27 '19 09:06 Plankton555

One of us (maybe @Nucs?) can provide the method stub for you to fill in. You can then look at that commit to learn which files need to be touched to add a new function.

henon avatar Jun 27 '19 10:06 henon

Please do! Let me know when that is done.

Plankton555 avatar Jun 28 '19 10:06 Plankton555

Sorry it took so long. It should be placed in NumSharp.Core/Random/np.random.choice.cs:

/// <summary>
/// Generates a random sample from a given (possibly weighted) 1-D array. //todo: implement
/// </summary>
/// <param name="arr">The ndarray whose elements the random sample is generated from.</param>
/// <param name="shape">The output shape.</param>
/// <param name="probabilities">The probabilities associated with each entry in arr. If not given, the sample assumes a uniform distribution over all entries in arr.</param>
/// <remarks>https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html</remarks>
public NDArray choice(NDArray arr, Shape shape, double[] probabilities = null) {
    throw new NotImplementedException();
}

/// <summary>
/// Generates a random sample as if it were drawn from np.arange(a). //todo: implement
/// </summary>
/// <param name="a">The random sample is generated as if a were np.arange(a).</param>
/// <param name="shape">The output shape.</param>
/// <param name="probabilities">The probabilities associated with each entry in np.arange(a). If not given, the sample assumes a uniform distribution over all entries.</param>
/// <remarks>https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html</remarks>
public NDArray choice(int a, Shape shape, double[] probabilities = null) {
    throw new NotImplementedException();
}

Feel free to contact us via Gitter if you get stuck or have a question.
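
For reference, the behaviour described in the stub's doc comments can be sketched in plain Python. This is only an illustration of the intended semantics under stated assumptions, not the NumSharp implementation: the function below uses the standard library only, samples with replacement, takes a plain integer count instead of a Shape, and its tolerance for the probability sum (1e-8) is an arbitrary choice.

```python
import bisect
import itertools
import random


def choice(a, size, p=None, rng=random):
    """Draw `size` samples (with replacement) from `a`.

    `a` may be an int n (treated like range(n)) or a 1-D sequence.
    `p` is an optional list of probabilities, one per entry.
    """
    population = list(range(a)) if isinstance(a, int) else list(a)
    if not population:
        raise ValueError("a must be non-empty")
    if p is None:
        # No probabilities given: uniform distribution over all entries.
        return [population[rng.randrange(len(population))] for _ in range(size)]
    if len(p) != len(population):
        raise ValueError("a and p must have the same length")
    if any(w < 0 for w in p) or abs(sum(p) - 1.0) > 1e-8:
        raise ValueError("p must be non-negative and sum to 1")
    # Inverse-transform sampling: pick the first entry whose cumulative
    # probability exceeds a uniform draw u in [0, 1).
    cumulative = list(itertools.accumulate(p))
    last = len(population) - 1  # guards against floating-point round-off
    return [
        population[min(bisect.bisect_right(cumulative, rng.random()), last)]
        for _ in range(size)
    ]


print(choice(5, 3))                                   # three ints from 0..4
print(choice(["pooh", "rabbit"], 4, p=[0.75, 0.25]))  # weighted draw
```

Building the cumulative distribution once and bisecting per draw keeps each sample at O(log n), which is one common way to do weighted sampling; other approaches (such as the alias method) are equally valid.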

Nucs avatar Jun 30 '19 20:06 Nucs

Working on this in https://github.com/Plankton555/NumSharp/tree/feature/np_random_choice

Plankton555 avatar Jul 02 '19 14:07 Plankton555

One of the examples in the NumPy docs looks like this:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'],
      dtype='|S11')

Since the NumSharp method signature takes either an integer or an NDArray, I tried representing this string[] as an NDArray. This throws an exception (which should maybe be reported as a bug or an unimplemented feature, since NumPy arrays can hold strings).

NDArray aa_milne_arr = new string[] { "pooh", "rabbit", "piglet", "Christopher" }; // throws System.NotImplementedException: implicit operator NDArray(Array array)

In this particular case I see two ways to handle it:

  1. Let this error happen until NDArray has support for strings.
  2. Add another method signature that takes an array or enumerable of some sort. This could be reasonable, since the NumPy docs explicitly state that np.random.choice "Generates a random sample from a given 1-D array", which should be possible to represent as an enumerable?

Any thoughts?
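
A third possibility, not proposed above and sketched here only as an idea, is to sample integer indices and map them back to the original elements, so that the sampling routine never has to hold strings. The Python below illustrates this with the standard library's random.choices; the helper name sample_labels is hypothetical.

```python
import random


def sample_labels(labels, weights, k):
    """Draw k labels by sampling integer indices, then mapping them back."""
    # Sample positions 0..len(labels)-1 with the given weights (with replacement).
    indices = random.choices(range(len(labels)), weights=weights, k=k)
    # Translate each sampled index into the original (non-numeric) element.
    return [labels[i] for i in indices]


aa_milne_arr = ["pooh", "rabbit", "piglet", "Christopher"]
print(sample_labels(aa_milne_arr, [0.5, 0.1, 0.1, 0.3], 5))
```

In NumSharp terms this would correspond to calling the integer overload with the array length and indexing into the string[] afterwards, assuming that overload is implemented.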

Plankton555 avatar Jul 04 '19 13:07 Plankton555

@Nucs Do you have any plans for string support?

Oceania2018 avatar Jul 04 '19 13:07 Oceania2018

Pull request at https://github.com/SciSharp/NumSharp/pull/310

Plankton555 avatar Jul 04 '19 14:07 Plankton555

This issue is still open because np.random.choice does not support multi-dimensions.
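
Assuming "multi-dimensions" here refers to a multi-dimensional output shape (as with the size argument of numpy.random.choice), one simple way to get there is to draw a flat sample of prod(shape) elements and then reshape it. The Python sketch below uses nested lists in place of an NDArray; choice_nd and reshape are hypothetical helper names, and math.prod requires Python 3.8 or later.

```python
import math
import random


def reshape(flat, shape):
    """Fold a flat list into nested lists with the given shape."""
    if len(shape) == 1:
        return list(flat)
    step = math.prod(shape[1:])  # number of elements per outer row
    return [
        reshape(flat[i * step:(i + 1) * step], shape[1:])
        for i in range(shape[0])
    ]


def choice_nd(population, shape, weights=None):
    """Sample prod(shape) elements with replacement, then reshape them."""
    if not shape:
        raise ValueError("shape must have at least one dimension")
    count = math.prod(shape)
    flat = random.choices(population, weights=weights, k=count)
    return reshape(flat, shape)


print(choice_nd(["pooh", "rabbit", "piglet"], (2, 3)))  # 2 rows of 3 names
```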

Nucs avatar Oct 05 '19 16:10 Nucs