reinforcement-learning
Potential bug in tf.contrib.distributions.Normal
Hi Denny,
Recently I've been working on a continuous-control reinforcement learning task.
I followed the steps in the Continuous MountainCar Actor Critic Solution to construct PolicyEstimator().
However, self.normal_dist.log_prob() becomes positive when self.mu takes a small value (< 0.2).
I'm wondering whether this is a bug in TensorFlow itself, since it calculates the pdf as
f(x) = sqrt(1/(2*pi*sigma^2)) * exp(-(x-mu)^2/(2*sigma^2)).
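As a quick sanity check (a plain-Python sketch of the same formula, not the original TensorFlow code), the log density comes out positive whenever sigma is small enough that the density at the mean exceeds 1:

```python
import math

def normal_log_pdf(x, mu, sigma):
    # log f(x) = -0.5 * log(2*pi*sigma^2) - (x - mu)^2 / (2*sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

# With a small sigma, the density at the mean is 1/(sigma*sqrt(2*pi)) > 1,
# so its log is positive -- the same behavior observed with log_prob().
print(normal_log_pdf(0.1, 0.1, 0.05))  # positive
print(normal_log_pdf(0.0, 0.0, 1.0))   # negative, density below 1
```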
Did you face the same problem while implementing the policy?
Best, James
Hm, interesting. I do remember having some problems with the policy, but it worked most of the time, so I didn't really look into it. I recommend you file a bug with Tensorflow.
That's not a bug.
Because the normal distribution is a continuous probability distribution, self.normal_dist.prob() actually evaluates the probability density function (pdf), which can take any non-negative value, including values greater than 1. So don't be surprised if you get a positive value when you call self.normal_dist.log_prob().
@JamesChuanggg
A distribution should return a value between 0 and 1, and the log of such a value should always be negative. How could you get a positive value from the self.normal_dist.log_prob() method?
@botonchou You are talking about the sample points; those can be any real number. However, we are talking about the (log) probability density.
@huiwenzhang
I believe your assumption is false, if I correctly understand what you mean by distribution.
The values of a probability density function are not necessarily less than 1. (They are not probabilities.)
They can be greater than 1 when the mass is concentrated around a few values. For example, the probability density function of the normal distribution with standard deviation 0.25 takes values greater than 1 over an interval of roughly [-0.24, +0.24] around the mean.
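To make that concrete (a small Python sketch, not part of the thread), the half-width of the interval where the density exceeds 1 follows in closed form from pdf(x) > 1, i.e. x^2 < 2*sigma^2 * ln(1/(sigma*sqrt(2*pi))):

```python
import math

sigma = 0.25
# Peak density at the mean: 1/(sigma*sqrt(2*pi)) -- about 1.6 here, so > 1.
peak = 1.0 / (sigma * math.sqrt(2 * math.pi))

# Solve pdf(x) = 1 for x (with mu = 0):
# exp(-x^2 / (2*sigma^2)) = 1/peak  =>  x^2 = 2*sigma^2 * ln(peak)
half_width = sigma * math.sqrt(2 * math.log(peak))
print(half_width)  # ~0.24, so pdf > 1 on roughly (-0.24, +0.24)
```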
Hope this helps.