
[Feature Request] Zero-Inflated Poisson and Negative Binomial distributions

Open minaskar opened this issue 4 years ago • 13 comments

Are there any plans to add Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) distributions to TFP? These distributions are common in other packages, and they shouldn't be hard to implement.

minaskar avatar Oct 20 '20 11:10 minaskar

Hi @minaskar

If at all useful, I've previously implemented a zero-inflated Poisson as a Mixture of a Deterministic and a Poisson. Something like this:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

zero_prob = 0.3  # extra probability mass placed at zero
poisson_log_rate = 2.5

# ZIP as a two-component mixture: a point mass at zero and an ordinary
# Poisson, mixed with weights [zero_prob, 1 - zero_prob].
zero_inflated_poisson = tfd.Mixture(
    cat=tfd.Categorical(probs=[zero_prob, 1.0 - zero_prob]),
    components=[tfd.Deterministic(loc=0.0), tfd.Poisson(log_rate=poisson_log_rate)],
)

samples = zero_inflated_poisson.sample(1_000)

# Tabulate the sampled counts and plot them as a bar chart.
values, counts = np.unique(samples, return_counts=True)

plt.bar(values, counts)
plt.grid()
plt.show()

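As a quick sanity check, the mixture's log_prob at zero should match the ZIP closed form:

# Sanity check: mixture log-prob at zero vs. the closed form
# log(zero_prob + (1 - zero_prob) * exp(-rate)), where rate = exp(poisson_log_rate).
rate = np.exp(poisson_log_rate)
print(zero_inflated_poisson.log_prob(0.0).numpy())
print(np.log(zero_prob + (1.0 - zero_prob) * np.exp(-rate)))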

jeffpollock9 avatar Oct 20 '20 12:10 jeffpollock9

Hi @jeffpollock9 ,

This looks very nice! How would that work as a layer? I tried the following but it doesn't work:

tfpl.DistributionLambda(
    make_distribution_fn=lambda t: tfd.Mixture(
        cat=tfd.Categorical(probs=[t[0], 1.0 - t[0]]),
        components=[tfd.Deterministic(loc=0.0), tfd.Poisson(log_rate=t[1])],
    ),
    convert_to_tensor_fn=lambda s: s.sample(),
)

minaskar avatar Oct 20 '20 12:10 minaskar

Consider instead using Categorical(logits=[0, t[0]]), assuming you have no activation function applied to the incoming tensor.

brianwa84 avatar Oct 20 '20 13:10 brianwa84

@brianwa84

I'm getting an error message saying ValueError: Shapes must be equal rank, but are 0 and 1

minaskar avatar Oct 20 '20 13:10 minaskar

I'm not 100% sure as I don't use those layers, but I think you need to capture any batch dimensions in t:

tfpl.DistributionLambda(
    make_distribution_fn=lambda t: tfd.Mixture(
        cat=tfd.Categorical(logits=[t[..., 0], 0.0]),
        components=[
            tfd.Deterministic(loc=0.0),
            tfd.Poisson(log_rate=t[..., 1]),
        ],
    ),
    convert_to_tensor_fn=lambda s: s.sample(),
)

at least that appears to be the pattern in https://www.tensorflow.org/probability/examples/Probabilistic_Layers_Regression#case_4_aleatoric_epistemic_uncertainty
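
If that still errors, another (untested) guess is that the logits list mixes a batched tensor with a scalar, and that the Deterministic loc needs the batch shape too, something like:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfpl = tfp.layers

tfpl.DistributionLambda(
    make_distribution_fn=lambda t: tfd.Mixture(
        # stack the zero-logit against t[..., 0] so both carry the batch shape
        cat=tfd.Categorical(
            logits=tf.stack([t[..., 0], tf.zeros_like(t[..., 0])], axis=-1)),
        components=[
            # give the point mass at zero the same batch shape as the Poisson
            tfd.Deterministic(loc=tf.zeros_like(t[..., 1])),
            tfd.Poisson(log_rate=t[..., 1]),
        ],
    ),
    convert_to_tensor_fn=lambda s: s.sample(),
)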

jeffpollock9 avatar Oct 20 '20 13:10 jeffpollock9

BTW if anyone wants to send a PR to add some zero-inflated discrete distributions, sampling and log_prob should not be too complicated. There might even be a case for a generic ZeroInflated(underlying, prob) meta-distribution.
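
As a rough sketch (not an existing TFP API), the log_prob of such a meta-distribution might look like this, with zero_inflated_log_prob a hypothetical helper:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions


def zero_inflated_log_prob(underlying, inflated_prob, x):
    # log p(x) = log(pi + (1 - pi) * p_u(0))  if x == 0
    #          = log(1 - pi) + log p_u(x)     otherwise
    # where pi = inflated_prob and p_u is the underlying pmf.
    zero_case = tf.math.log(
        inflated_prob
        + (1.0 - inflated_prob) * tf.math.exp(underlying.log_prob(tf.zeros_like(x))))
    nonzero_case = tf.math.log1p(-inflated_prob) + underlying.log_prob(x)
    return tf.where(tf.equal(x, 0.0), zero_case, nonzero_case)


# e.g. for a zero-inflated Poisson:
print(zero_inflated_log_prob(tfd.Poisson(rate=3.0), 0.3, tf.constant([0.0, 1.0, 2.0])))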

brianwa84 avatar Oct 20 '20 13:10 brianwa84

@jeffpollock9 Yes, this is exactly what I tried next, still get the same error message.

@brianwa84 the log_prob has a closed form for both distributions, so it shouldn't be very hard.
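
For reference, with zero-inflation weight $\pi$ and Poisson rate $\lambda$, the ZIP log-probability is

$$
\log p(x) =
\begin{cases}
\log\!\left(\pi + (1 - \pi)\, e^{-\lambda}\right) & x = 0, \\
\log(1 - \pi) + x \log \lambda - \lambda - \log x! & x > 0,
\end{cases}
$$

and the ZINB version is the same with the negative binomial pmf in place of the Poisson.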

minaskar avatar Oct 20 '20 13:10 minaskar

Hey @brianwa84, if no one else has already started working on this, I would have a look and implement them. Cheers, Simon

dirmeier avatar Jun 28 '21 19:06 dirmeier

No one has started, feel free to have a go at it.

brianwa84 avatar Jun 28 '21 21:06 brianwa84

Hey guys! I need a zero-inflated Poisson as a loss ASAP and have this code so far, which throws an error:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.random.set_seed(42)
y_true = tf.random.uniform((2, 100, 4), minval=0, maxval=2, dtype=tf.int32)
y_pred = tf.random.uniform((2, 100, 4), minval=0, maxval=1, dtype=tf.float32)
# multinomial part of loss function
rate = tf.math.exp(y_pred)  # shape (2, 100, 4)
# fraction of nonzero entries per batch element, shape (2,)
nonzero_prob = tf.math.divide(
    tf.cast(tf.math.count_nonzero(y_pred, axis=(1, 2)), tf.float32),
    tf.cast(tf.size(y_pred), tf.float32))
cat = tfd.Categorical(probs=tf.stack([1 - nonzero_prob, nonzero_prob], -1))  # batch shape (2,)
components = [tfd.Deterministic(loc=tf.zeros_like(rate)), tfd.Poisson(rate)]  # batch shape (2, 100, 4)
# Error here...
zip_dist = tfd.Mixture(cat=cat, components=components)

Any input on whether this implementation is wrong, or on what could be causing the error, would be very much appreciated! The error message is:

ValueError                                Traceback (most recent call last)
<ipython-input-72-ea5c1ffcaa5b> in <module>
    11 components = [tfd.Deterministic(loc=tf.zeros_like(rate)), tfd.Poisson(rate)]
    12 # Error here...
---> 13 zip_dist = tfd.Mixture(cat=cat, components=components)

<decorator-gen-281> in __init__(self, cat, components, validate_args, allow_nan_stats, use_static_graph, name)

~/tf_2/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py in wrapped_init(***failed resolving arguments***)
   274       # called, here is the place to do it.
   275       self_._parameters = None
--> 276       default_init(self_, *args, **kwargs)
   277       # Note: if we ever want to override things set in `self` by subclass
   278       # `__init__`, here is the place to do it.

~/tf_2/lib/python3.7/site-packages/tensorflow_probability/python/distributions/mixture.py in __init__(self, cat, components, validate_args, allow_nan_stats, use_static_graph, name)
   140         raise ValueError(
   141             "components[{}] batch shape must be compatible with cat "
--> 142             "shape and other component batch shapes".format(di))
   143       static_event_shape = tensorshape_util.merge_with(
   144           static_event_shape, d.event_shape)

ValueError: components[0] batch shape must be compatible with cat shape and other component batch shapes

shtoneyan avatar Jun 28 '21 22:06 shtoneyan

I think you have the batch dimension wrong for cat in the mixture:

zip_dist = tfd.Mixture(
    cat=tfd.Categorical(probs=tf.stack([nonzero_prob, 1 - nonzero_prob], -1)),
    components=[tfd.Deterministic(tf.zeros_like(rate)), tfd.Poisson(rate)],
)

at least has the right shape, but I might be parsing the batch and event shape of this problem wrong...
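
If the mismatch is between cat's batch shape (2,) and the components' batch shape (2, 100, 4), one untested guess (reusing rate and nonzero_prob from above) is to broadcast the mixing probabilities up to the full batch shape before building the Categorical:

# Untested sketch: broadcast the per-batch mixing probabilities so that cat's
# batch shape (2, 100, 4) matches the components' batch shape.
probs = tf.stack([1 - nonzero_prob, nonzero_prob], -1)  # (2, 2): [P(zero), P(Poisson)]
probs = tf.broadcast_to(probs[:, tf.newaxis, tf.newaxis, :], (2, 100, 4, 2))
zip_dist = tfd.Mixture(
    cat=tfd.Categorical(probs=probs),
    components=[tfd.Deterministic(loc=tf.zeros_like(rate)), tfd.Poisson(rate)],
)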

ColCarroll avatar Jun 29 '21 00:06 ColCarroll

I just edited the code but seem to be running into the same issue... thanks for the help though!

shtoneyan avatar Jun 29 '21 16:06 shtoneyan

@shtoneyan do you have any updates on using TFP for the zero-inflated Poisson?

ruiw-uber avatar Jun 27 '22 21:06 ruiw-uber