probability Implementation of Mean Method in distributions.Categorical

Proposed fix for issue #685, "Error with Implementation of Mean Method in distributions.Categorical". Calling mean() on a Categorical distribution currently raises:

NotImplementedError: mean is not implemented: Categorical

Code to reproduce:

import tensorflow_probability as tfp

prob_dist = tfp.distributions.Categorical(probs=[1.0])
print(prob_dist.mean())

This PR defines a _mean method to implement the mean.

Aug 23 '21 17:08 PavanKishore21

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.

What to do if you already signed the CLA

Individual signers

It's possible we don't have your GitHub username or you're using a different email address on your commit. Check your existing CLA data and verify that your email is set on your git commits.

Corporate signers

Your company has a Point of Contact who decides which employees are authorized to participate. Ask your POC to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the Google project maintainer to go/cla#troubleshoot (Public version).
The email used to register you as an authorized contributor must be the email used for the Git commit. Check your existing CLA data and verify that your email is set on your git commits.
The email used to register you as an authorized contributor must also be attached to your GitHub account.

ℹ️ Googlers: Go here for more info.

Aug 23 '21 17:08 googlebot

@googlebot I signed it!

Aug 23 '21 17:08 PavanKishore21

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

Aug 23 '21 17:08 google-cla[bot]

This implementation is not correct. The mean of a categorical is defined as sum(i * prob(x=i) for i in range(num_categories)). You should be able to modify the implementation of _entropy to implement this.

This will also need tests to verify the implementation is correct (entropy tests, again, should provide a good set of test cases).
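
As a reference for such tests, here is a minimal pure-Python sketch of the formula above. The helper name `categorical_mean` is hypothetical and not part of TensorFlow Probability, and the sketch assumes `probs` is already normalised.

```python
import math


def categorical_mean(probs):
    """Return sum(i * probs[i]) for a normalised probability vector.

    Hypothetical reference helper for illustration; not a TFP API.
    """
    return sum(i * p for i, p in enumerate(probs))


# A point mass on category 0 has mean 0.
assert categorical_mean([1.0]) == 0.0
# A uniform distribution over {0, 1, 2} has mean 1.
assert math.isclose(categorical_mean([1 / 3, 1 / 3, 1 / 3]), 1.0)
# 0 * 0.2 + 1 * 0.3 + 2 * 0.5 = 1.3
assert math.isclose(categorical_mean([0.2, 0.3, 0.5]), 1.3)
```

Values like these could serve as expected results when testing a tensor-based implementation.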

Aug 23 '21 19:08 SiegeLordEx

Hello @SiegeLordEx, thanks for your response. I have updated the formula for the mean:

tf.reduce_sum(tf.math.multiply_no_nan(i , self._probs(x=i)) for i in range(self._num_categories), axis=-1

Check this out

Aug 25 '21 08:08 PavanKishore21

The previous implementation was incorrect. I've written an example implementation of the Categorical mean; the code below works for me and produces the desired result. The mean should also account for unnormalised probabilities across the classes, since there are cases where the probabilities might not sum to 1.

  def _mean(self):
    probs = self.probs_parameter()
    num_categories = self._num_categories(probs)
    # Initialise the mean with zeros, matching the dtype of probs
    categorical_mean = tf.zeros(tf.shape(probs[..., 0]), dtype=probs.dtype)
    # Compute the normalisation factor so that the probabilities sum to 1
    normalisation = tf.reduce_sum(probs, axis=-1)
    # Accumulate the normalised, index-weighted probabilities:
    # sum(i * prob(X=i) for i in range(num_categories))
    for i in range(num_categories):
      categorical_mean = categorical_mean + tf.cast(i, probs.dtype) * probs[..., i] / normalisation

    return categorical_mean

Shall I make a new commit/pull request with this?
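
The effect of the normalisation step can be illustrated without TensorFlow. The following is a minimal sketch in plain Python; the helper `normalised_mean` is hypothetical and only mirrors the arithmetic, assuming non-negative weights with a positive sum.

```python
import math


def normalised_mean(weights):
    """Mean of the category indices under possibly unnormalised weights.

    Hypothetical helper for illustration; assumes non-negative weights
    with a positive sum.
    """
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# Weights [1, 1, 2] normalise to [0.25, 0.25, 0.5]:
# 0 * 0.25 + 1 * 0.25 + 2 * 0.5 = 1.25
assert math.isclose(normalised_mean([1.0, 1.0, 2.0]), 1.25)
# Rescaling the weights leaves the mean unchanged.
assert math.isclose(normalised_mean([0.25, 0.25, 0.5]),
                    normalised_mean([10.0, 10.0, 20.0]))
```

Dividing by the sum makes the result invariant to the overall scale of the weights, which is the property described above.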

May 05 '23 13:05 fotisdr

I don't think there is currently an implementation.

The following might be simpler: tf.reduce_sum(tf.cast(tf.range(self._num_categories(probs)), probs.dtype) * probs, axis=-1) / tf.reduce_sum(probs, axis=-1)

You could send a PR, sure.

May 08 '23 15:05 brianwa84

Hi,

Indeed this looks much simpler, thanks! We want to ensure that the multiplication is applied across the last axis of probs, i.e. each slice of probs across the last dimension (probs[...,i]) gets multiplied by the corresponding number i in tf.range(self._num_categories(probs)). Broadcasting should make the multiplication do this by default, but I will test the code and submit a new PR.
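
This broadcasting behaviour can be checked on a small batch. The sketch below uses NumPy as a stand-in, on the assumption that TensorFlow's elementwise multiplication follows the same broadcasting rules; the function names are hypothetical. One point to verify in the TensorFlow version: `tf.range` yields an integer tensor unless a dtype is given, so the range likely needs to match the dtype of `probs` before the multiplication.

```python
import numpy as np


def mean_vectorised(probs):
    """Vectorised mean over the last axis, normalised by the sum."""
    probs = np.asarray(probs, dtype=np.float64)
    # Shape [K]; broadcasts against probs of shape [..., K], so that
    # probs[..., i] is multiplied by i.
    indices = np.arange(probs.shape[-1], dtype=probs.dtype)
    return np.sum(indices * probs, axis=-1) / np.sum(probs, axis=-1)


def mean_loop(probs):
    """Loop-based equivalent that accumulates i * probs[..., i]."""
    probs = np.asarray(probs, dtype=np.float64)
    total = np.zeros(probs.shape[:-1], dtype=probs.dtype)
    for i in range(probs.shape[-1]):
        total = total + i * probs[..., i]
    return total / np.sum(probs, axis=-1)


batch = np.array([[0.2, 0.3, 0.5],   # normalised
                  [1.0, 1.0, 2.0]])  # unnormalised
assert np.allclose(mean_vectorised(batch), mean_loop(batch))
assert np.allclose(mean_vectorised(batch), [1.3, 1.25])
```

Both versions agree on the batch, which suggests the one-line form applies the index weights along the last axis as intended.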

May 08 '23 15:05 fotisdr