Likely bug in `tfp.math.fill_triangular_inverse`
I'm on tfp 0.16.0, Python 3.8, Windows 10.
Trying to extract a triangular part of a matrix into a vector, I came across:
import tensorflow_probability as tfp

print(
    tfp.math.fill_triangular_inverse([
        [4, 0, 0],
        [6, 5, 0],
        [3, 2, 1]
    ])
)
print(
    tfp.math.fill_triangular_inverse([
        [4, 0, 0],
        [6, 5, 0],
        [3, 2, 1]
    ], upper=True)
)
which produces
tf.Tensor([1 2 3 4 5 6], shape=(6,), dtype=int32)
tf.Tensor([4 0 0 7 7 3], shape=(6,), dtype=int32)
and this makes absolutely no sense to me: where does the 7 come from, and the 3 is clearly not part of the upper triangle...
The way this routine is implemented, it assumes that the input is upper triangular if `upper=True`. The implementation relies on there being zeros in the lower, non-diagonal portion of the input. It's written a bit obtusely, so it's not very clear; IIRC this was for performance reasons.
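For example (a quick sketch against tfp 0.16), an input that really does have zeros below the diagonal, such as one produced by `fill_triangular(..., upper=True)`, round-trips as expected:

import tensorflow as tf
import tensorflow_probability as tfp

v = tf.constant([1, 2, 3, 4, 5, 6])
# fill_triangular with upper=True puts zeros below the diagonal, so the
# resulting matrix is in the domain that fill_triangular_inverse expects.
m = tfp.math.fill_triangular(v, upper=True)
print(m)
print(tfp.math.fill_triangular_inverse(m, upper=True))  # recovers [1 2 3 4 5 6]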
Can you say a bit more about what you're trying to do? There may be a better/safer way to get what you're after.
Hm, I see. Well, I'm just trying to get the upper-triangular part of a matrix into a vector, and from its docstring `fill_triangular_inverse` seemed like a surefire bet:
Creates a vector from a (batch of) triangular matrix.
The vector is created from the lower-triangular or upper-triangular portion
depending on the value of the parameter `upper`.
I'm certain there is both another and a better way to do that, but this behavior is puzzling.
Maybe this?
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

a = np.arange(16).reshape([4, 4]) + 1
print(a)
# Zero out everything above the diagonal, then pack the lower triangle.
print(tf.linalg.band_part(a, -1, 0))
print(tfp.math.fill_triangular_inverse(tf.linalg.band_part(a, -1, 0)))
# Zero out everything below the diagonal, then pack the upper triangle.
print(tf.linalg.band_part(a, 0, -1))
print(tfp.math.fill_triangular_inverse(tf.linalg.band_part(a, 0, -1), upper=True))
Result:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
tf.Tensor(
[[ 1 0 0 0]
[ 5 6 0 0]
[ 9 10 11 0]
[13 14 15 16]], shape=(4, 4), dtype=int64)
tf.Tensor([16 15 14 13 1 11 10 9 5 6], shape=(10,), dtype=int64)
tf.Tensor(
[[ 1 2 3 4]
[ 0 6 7 8]
[ 0 0 11 12]
[ 0 0 0 16]], shape=(4, 4), dtype=int64)
tf.Tensor([ 1 2 3 4 16 6 7 8 12 11], shape=(10,), dtype=int64)
Sure, that would work - but it looks like a hack ;-) Another alternative is building a boolean mask as in https://stackoverflow.com/questions/41514722/convert-the-strictly-upper-triangular-part-of-a-matrix-into-an-array-in-tensorfl
But seeing the above code, band_part followed by fill_triangular_inverse, I'd always think the band_part operation is superfluous.
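For reference, a boolean-mask version would look roughly like this (just a sketch; note it extracts the entries in row-major order rather than in fill_triangular's packing order):

import tensorflow as tf

a = tf.reshape(tf.range(1, 17), [4, 4])
# Mask that is nonzero on and above the diagonal.
mask = tf.linalg.band_part(tf.ones_like(a), 0, -1) > 0
# boolean_mask flattens row-major, so this yields the upper triangle
# read row by row: [1 2 3 4 6 7 8 11 12 16].
print(tf.boolean_mask(a, mask))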
I would definitely caution against taking the current documentation at face value.
I agree the documentation can improve. The key insight, I think, is that `fill_triangular` takes a vector of length `n * (n + 1) / 2` to a square matrix of shape `n x n`, with all zeros above (or below) the diagonal. An input to the inverse of this function must therefore also have zeros in the corresponding region, or it is not in the domain of the function. (We could include assertions to validate the input, but TFP as a rule doesn't include assertions by default, since they can be costly in large vectorized computations, so they'd need to be flag-controlled and off by default.)
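If you want that safety in user code, a small wrapper along these lines would do it (a hypothetical helper, not part of TFP; it just checks the input against its own band_part before calling through):

import tensorflow as tf
import tensorflow_probability as tfp

def checked_fill_triangular_inverse(m, upper=False):
    # Keep only the triangle we claim to have; any nonzero entries outside
    # it mean the input is not in the domain of fill_triangular_inverse.
    band = tf.linalg.band_part(m, 0, -1) if upper else tf.linalg.band_part(m, -1, 0)
    tf.debugging.assert_equal(
        m, band, message='Input must be triangular (zeros outside the triangle).')
    return tfp.math.fill_triangular_inverse(m, upper=upper)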