
Sampling from a categorical distribution without replacement

Open Randl opened this issue 5 years ago • 2 comments

As discussed in https://github.com/tensorflow/tensorflow/issues/9260, this issue probably belongs here. Both tf.multinomial() and tf.contrib.distributions.Categorical.sample() allow sampling from a multinomial distribution, but only with replacement.

In contrast, NumPy's numpy.random.choice() has a replace parameter that allows sampling without replacement. Would it be possible to add similar functionality to TensorFlow?

In particular, the proposal is to implement it with the Gumbel-max trick: https://github.com/tensorflow/tensorflow/issues/9260#issuecomment-408950922 Corresponding implementation in PyTorch: https://github.com/pytorch/pytorch/commit/af05158c56af29e062580f458a86a32b8f4c2b85
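
For readers unfamiliar with the trick, here is a minimal sketch of how it extends to sampling without replacement: add independent Gumbel noise to each logit and keep the indices of the k largest perturbed values. This is plain standard-library Python for illustration only, not the TensorFlow or PyTorch implementation linked above; the function name `gumbel_top_k` and its parameters are hypothetical.

```python
import math
import random


def gumbel_top_k(logits, k, rng=random):
    """Draw k distinct category indices, weighted by softmax(logits).

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0 <= k <= len(logits):
        raise ValueError("k must be between 0 and len(logits)")
    perturbed = []
    for index, logit in enumerate(logits):
        # Standard Gumbel noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = rng.random()
        while u == 0.0:  # random() may return 0.0; avoid log(0)
            u = rng.random()
        perturbed.append((logit - math.log(-math.log(u)), index))
    # The k largest perturbed logits give the sample, in draw order.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


sample = gumbel_top_k([0.0, 1.0, 2.0, 3.0], k=2)
```

The sketch sorts all n perturbed values, so its cost grows with the full vector length regardless of k.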

Randl avatar Jun 09 '20 07:06 Randl

TFP now has a random package; you could add a pair of choice and stateless_choice functions there.

brianwa84 avatar Aug 25 '20 15:08 brianwa84

Hi, are there any updates on this?

Directly applying the "trick" above is very slow for large vectors; something like torch.randperm(n)[:k] is about 20x faster. (We don't have randperm in TF. Additionally, randperm may itself be slow compared with NumPy's choice algorithm, if that were implemented for GPUs.)
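
Note that a truncated random permutation draws uniformly, so it only covers the case where all categories are equally likely. As a rough sketch of that uniform case, a partial Fisher-Yates shuffle can stop after k swaps instead of permuting all n elements. This is plain standard-library Python for illustration, not the algorithm used by torch.randperm or numpy.random.choice; the function name `uniform_choice_without_replacement` is hypothetical.

```python
import random


def uniform_choice_without_replacement(n, k, rng=random):
    """Draw k distinct integers uniformly from range(n).

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    swapped = {}  # sparse record of positions whose value has changed
    picks = []
    for i in range(k):
        j = rng.randrange(i, n)  # uniform over the not-yet-fixed positions
        value_at_i = swapped.get(i, i)
        value_at_j = swapped.get(j, j)
        picks.append(value_at_j)  # value moved into position i
        swapped[j] = value_at_i   # displaced value stays available
    return picks


picks = uniform_choice_without_replacement(1_000_000, 5)
```

Only k random draws and k dictionary entries are needed here, independent of n.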

amitport avatar Dec 30 '21 11:12 amitport