pytensor icon indicating copy to clipboard operation
pytensor copied to clipboard

ENH: allow val argument in fill_diagonal to be array-valued

Open fonnesbeck opened this issue 2 years ago • 1 comments

Before

Currently the val argument in fill_diagonal only accepts scalar values:

fill_diagonal(a, val)

For example,

fill_diagonal(pt.ones(5), 3)

will fill the diagonal element with 3.

After

This function should be allowed to set different values for each diagonal element, for example:

fill_diagonal(pt.ones(5), np.arange(5))

Context for the issue:

In the context of covariance matrices, one may want to set diagonal elements to different values for each component of a multivariate normal, for example.

fonnesbeck avatar Sep 01 '23 02:09 fonnesbeck

The trickiest part of this is the gradient with respect to the fill value array. fill_diagonal has a strange raveling and repeating behavior so that the following graphs are all valid:

import numpy as np

a = np.zeros((3, 3))
for val_shape in [(), (1,), (2,), (3,), (4,), (2, 3)]:
  val = np.random.normal(size=val_shape)
  np.fill_diagonal(a, val)

That means a correct gradient wrt to val would need to know how many times each entry in val was actually used in fill_diagonal.

One alternative is to restrict the behavior of fill_diagonal to allow broadcasting or exact shape matching, by instead writing it as

if a.ndim == 2:
  idx = pt.arange(0, pt.min(a.shape))
else:
  idx = pt.arange(0, pt.max(a.shape))
idxs = (idx,) * a.ndim
a = pt.set_subtensor(a[idxs], val)

But then fill_diagonal will probably not work in Numba?

ricardoV94 avatar Jan 02 '24 12:01 ricardoV94