keras-mdn-layer

Check treatment of scale matrix vs covariance matrix in sampling procedure

Open cpmpercussion opened this issue 6 years ago • 1 comments

There could be an issue with sampling due to (my) confusion about standard deviation and variance.

The samples are drawn using NumPy's multivariate_normal (see its documentation), at line 238 of __init__.py:

sample = np.random.multivariate_normal(mus_vector, cov_matrix, 1)

But the outputs of the mixture density layer are treated as scale variables in tfp.distributions.MultivariateNormalDiag, whose documentation notes that:

covariance = scale @ scale.T

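Thus, it seems we should have been squaring the entries of cov_matrix before passing it to the multivariate normal sampling procedure, since NumPy expects a covariance matrix (variances on the diagonal) rather than a scale matrix (standard deviations). This could explain why we end up having to scale down the sigma variable so much in real-world applications.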
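To make the suspected mismatch concrete, here is a minimal NumPy-only sketch. The values in `scale_diag` are made up for illustration, and this is not the library's actual sampling code; it only compares what happens when the same numbers are passed to NumPy as a covariance matrix with and without squaring.

```python
import numpy as np

rng = np.random.default_rng(0)

mus_vector = np.zeros(2)
# Hypothetical per-dimension scale (standard deviation) values.
scale_diag = np.array([0.5, 2.0])

# Suspected bug: the scale values are used directly as variances.
cov_unsquared = np.diag(scale_diag)

# Intended behaviour: covariance = scale @ scale.T.
scale_matrix = np.diag(scale_diag)
cov_squared = scale_matrix @ scale_matrix.T

n = 200_000
std_unsquared = rng.multivariate_normal(mus_vector, cov_unsquared, n).std(axis=0)
std_squared = rng.multivariate_normal(mus_vector, cov_squared, n).std(axis=0)

print(std_unsquared)  # close to sqrt(scale_diag)
print(std_squared)    # close to scale_diag
```

Without squaring, the sampled standard deviation is the square root of the scale, so any scale below 1 produces samples that are more spread out than intended.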

A todo here is to get a definite answer and write some tests to check what is actually going on.

cpmpercussion avatar Oct 30 '19 04:10 cpmpercussion

It seems to me that the scale vector should have been squared before being used as a covariance matrix, so this is now the current behaviour.

It remains to write a test (going across TensorFlow Probability and NumPy) confirming that a tfd scale vector, once squared, produces the same distribution when sampled through NumPy.
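A rough sketch of the NumPy half of such a test might look like the following. The helper `scale_to_cov`, the test name, the example values, and the tolerances are all hypothetical and not part of the repository. The TensorFlow Probability half is left out here to keep the sketch dependency-free; it would presumably draw from `tfd.MultivariateNormalDiag(loc=..., scale_diag=...)` and compare the same statistics.

```python
import numpy as np


def scale_to_cov(scale_diag):
    """Hypothetical helper: diagonal scale vector -> covariance matrix."""
    scale = np.diag(scale_diag)
    return scale @ scale.T


def test_numpy_samples_match_scale(n=200_000, seed=0):
    rng = np.random.default_rng(seed)
    mus_vector = np.array([1.0, -2.0, 0.5])
    scale_diag = np.array([0.1, 1.0, 3.0])

    samples = rng.multivariate_normal(mus_vector, scale_to_cov(scale_diag), n)

    # Empirical mean and standard deviation should recover the inputs.
    np.testing.assert_allclose(samples.mean(axis=0), mus_vector, atol=0.05)
    np.testing.assert_allclose(samples.std(axis=0), scale_diag, rtol=0.02)
    return samples


test_numpy_samples_match_scale()
```

The tolerances are loose on purpose: with 200,000 samples the sampling error is well below them, so a failure would point to a real scale versus covariance mix-up rather than noise.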

cpmpercussion avatar Nov 04 '19 04:11 cpmpercussion