probability icon indicating copy to clipboard operation
probability copied to clipboard

Likely bug in `tfp.math.fill_triangular_inverse`

Open reichardtj opened this issue 2 years ago • 5 comments

I'm on tfp 0.16.0, python 3.8, windows10

Trying to extract a triangluar part of a matrix into a vector, I came across:

print(
    tfp.math.fill_triangular_inverse([
        [4, 0, 0],
        [6, 5, 0],
        [3, 2, 1]
    ])
)

print(
    tfp.math.fill_triangular_inverse([
        [4, 0, 0],
        [6, 5, 0],
        [3, 2, 1]
    ], upper=True)
)

which produces

tf.Tensor([1 2 3 4 5 6], shape=(6,), dtype=int32)
tf.Tensor([4 0 0 7 7 3], shape=(6,), dtype=int32)

and this makes absolutely no sense to me - where does the 7 come from and the 3 is clearly not part of the upper triangle....

reichardtj avatar Mar 09 '22 14:03 reichardtj

The way this routine is implemented, it assumes that the input is upper triangular if upper=True. The implementation relies on there being zeros in the lower non-diagonal portion of the input. It's written a bit obtusely, so it's not very clear. IIRC this was for performance reasons.

Can you say a bit more about what you're trying to do? There may be a better/safer way to get what you're after.

On Wed, Mar 9, 2022 at 9:31 AM reichardtj @.***> wrote:

I'm on tfp 0.16.0, python 3.8, windows10

Trying to extract a triangluar part of a matrix into a vector, I came across:

print( tfp.math.fill_triangular_inverse([ [4, 0, 0], [6, 5, 0], [3, 2, 1] ]) ) print( tfp.math.fill_triangular_inverse([ [4, 0, 0], [6, 5, 0], [3, 2, 1] ], upper=True) )

which produces

tf.Tensor([1 2 3 4 5 6], shape=(6,), dtype=int32) tf.Tensor([4 0 0 7 7 3], shape=(6,), dtype=int32)

and this makes absolutely no sense to me - where does the 7 come from and the 3 is clearly not part of the upper triangle....

— Reply to this email directly, view it on GitHub https://github.com/tensorflow/probability/issues/1530, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABG2GKGGAFQX622GAFT3ITU7CY5LANCNFSM5QJW2KEQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

csuter avatar Mar 09 '22 15:03 csuter

Hm, I see - We'll I'm just trying to get the upper triangular part of a matrix into a vector, and from the docstring fill_triangluar_inverse seemed like surefire bet

Creates a vector from a (batch of) triangular matrix.

The vector is created from the lower-triangular or upper-triangular portion
depending on the value of the parameter `upper`.

I'm certain there is both another and a better way to do that, but this behavior is puzzeling.

reichardtj avatar Mar 09 '22 19:03 reichardtj

Maybe this?

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

a = np.arange(16).reshape([4, 4]) + 1
print(a)

print(tf.linalg.band_part(a, -1, 0))
print(tfp.math.fill_triangular_inverse(tf.linalg.band_part(a, -1, 0)))

print(tf.linalg.band_part(a, 0, -1))
print(tfp.math.fill_triangular_inverse(tf.linalg.band_part(a, 0, -1), upper=True))

Result:

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]
tf.Tensor(
[[ 1  0  0  0]
 [ 5  6  0  0]
 [ 9 10 11  0]
 [13 14 15 16]], shape=(4, 4), dtype=int64)
tf.Tensor([16 15 14 13  1 11 10  9  5  6], shape=(10,), dtype=int64)
tf.Tensor(
[[ 1  2  3  4]
 [ 0  6  7  8]
 [ 0  0 11 12]
 [ 0  0  0 16]], shape=(4, 4), dtype=int64)
tf.Tensor([ 1  2  3  4 16  6  7  8 12 11], shape=(10,), dtype=int64)

csuter avatar Mar 09 '22 19:03 csuter

Sure, that would work - but it looks like a hack ;-) Another alternative is building a boolean mask as in https://stackoverflow.com/questions/41514722/convert-the-strictly-upper-triangular-part-of-a-matrix-into-an-array-in-tensorfl

But seeing the above code, bandpart and then fill_triagular_inverse, I'd always think the bandpart operation is superfluous.

I would definitely caution against the pitfall of taking the current documentation at face value.

reichardtj avatar Mar 09 '22 20:03 reichardtj

I agree the documentation can improve. The key insight, I think, is that "fill_triangular" takes a vector of length n * (n + 1) / 2 to a square matrix of shape n x n, with all zeros above (or below) the diagonal. An input to the inverse of this function must therefore also have zeros in the corresponding region or it is not in the domain of the function. (We could include assertions to validate the input, but TFP as a rule doesn't include assertions by default, since they can be costly in large vectorized computations so they'd need to be flag-controlled and off by default).

csuter avatar Mar 09 '22 20:03 csuter