tensorflow_probability: bfgs_minimize requires the elements in initial_position to have the same shape

Open cschill2020 opened this issue 10 months ago • 1 comments

This line: https://github.com/tensorflow/probability/blob/main/tensorflow_probability/python/optimizer/bfgs.py#L210 fails for input of the form: initial_position = [0., tf.zeros(10)]

I am working through some simple toy examples and have a JointDistribution of the form:

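import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfl = tf.linalg  # assumed alias; the original snippet does not show its imports

def make_joint_dist(matrix):
    def joint_dist():
        # Scalar intercept and a length-10 coefficient vector, both with N(0, 1) priors.
        intercept = yield tfd.Normal(loc=0.0, scale=1.0, name="intercept")
        coefficients = yield tfd.Normal(loc=tf.zeros(10), scale=1.0, name="coefficients")
        yield tfd.Normal(
            loc=intercept + tfl.matvec(matrix, coefficients),
            scale=1.0,
            name="observations",
        )

    return tfd.JointDistributionCoroutineAutoBatched(joint_dist)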

I am trying to run bfgs_minimize to find the MAP estimate (that is, to minimize -joint_dist.log_prob(params)). However, in this case I cannot figure out how to pass an initial_position for the coefficients. The code fails at L210 above: Shapes of all inputs must match: values[0].shape = [] != values[1].shape = [10]

I can put together a running example Colab, but I wanted to kick this off while I clean up my notebook.

Mar 31 '25 07:03 cschill2020

I was able to resolve this by flattening the initial position and re-packing it into a joint-distribution input inside a custom neg_logp function. It required some quite finicky operations, so I am not sure whether it is worth keeping this open. I'll take a deeper dive to try to understand whether a generalization could be made to allow a tuple of tensors as input.

Apr 01 '25 05:04 cschill2020
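
For readers landing here, the flatten and re-pack workaround described above might look roughly like the sketch below. It is not the author's actual code. To stay self-contained it uses NumPy rather than TensorFlow, and the helper names (pack, unpack, make_flat_objective) and the hand-written objective are hypothetical. The objective is the negative log joint density of the model above, up to an additive constant. In TensorFlow the same steps would presumably use tf.concat, tf.reshape and slicing, with the flat function being what is differentiated and handed to bfgs_minimize together with a single rank-1 initial_position.

```python
import numpy as np

# Hypothetical helpers for illustration; not part of TensorFlow Probability.

def pack(parts):
    """Flatten a list of arrays of any shape into one 1-D vector."""
    return np.concatenate(
        [np.ravel(np.asarray(p, dtype=np.float64)) for p in parts]
    )

def unpack(flat, shapes):
    """Split a 1-D vector back into arrays with the given shapes."""
    parts, offset = [], 0
    for shape in shapes:
        size = int(np.prod(shape, dtype=np.int64))  # the product of () is 1
        parts.append(flat[offset:offset + size].reshape(shape))
        offset += size
    return parts

# Shapes of the latent variables: scalar intercept, length-10 coefficients.
shapes = [(), (10,)]

def neg_logp_structured(intercept, coefficients, matrix, observations):
    # Negative log joint density up to an additive constant:
    # N(0, 1) priors on the latents and a unit-variance Gaussian likelihood.
    resid = observations - (intercept + matrix @ coefficients)
    return 0.5 * (intercept**2 + np.sum(coefficients**2) + np.sum(resid**2))

def make_flat_objective(matrix, observations):
    """Wrap the structured objective so it accepts a single flat vector."""
    def neg_logp_flat(flat):
        intercept, coefficients = unpack(flat, shapes)
        return neg_logp_structured(intercept, coefficients, matrix, observations)
    return neg_logp_flat

# Synthetic data, assumed purely for the sake of the example.
rng = np.random.default_rng(0)
matrix = rng.normal(size=(50, 10))
observations = rng.normal(size=50)

# One rank-1 vector of length 11 replaces [0., zeros(10)].
initial_position = pack([0.0, np.zeros(10)])
objective = make_flat_objective(matrix, observations)
```

The optimizer only ever sees the length-11 vector. Once it has converged, unpack(position, shapes) recovers the intercept and coefficients in their original shapes.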