Bug in Adam for non-real parameters
Hi,
There appears to be an edge case in the current Adam implementation. If the optimizer contains non-real variables, e.g., quaternions or complex numbers, the second-moment (variance) estimate is not computed as intended.
The Adam implementation computes

```python
v_t = self.beta_2 * v_tp + (1 - self.beta_2) * dr.sqr(g_p)
```

where `g_p` is the parameter gradient. If a parameter is a quaternion or complex number, its gradient has the same type, and `dr.sqr` then computes a quaternion or complex *product*. This is different from the element-wise square that the second-moment update expects.
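For concreteness, here is a dependency-free sketch of the discrepancy: squaring a quaternion under the Hamilton product is not the same as squaring its coefficients element-wise, and can even produce negative components, which would corrupt the (necessarily non-negative) second-moment state.

```python
def quat_mul(p, q):
    # Hamilton product of two quaternions stored as (w, x, y, z)
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    )

g = (1.0, 2.0, 0.0, 0.0)  # a gradient stored as a quaternion

quat_square = quat_mul(g, g)          # what dr.sqr effectively computes here
elementwise = tuple(c * c for c in g) # what Adam's v_t update actually needs

print(quat_square)  # (-3.0, 4.0, 0.0, 0.0) -- note the negative component
print(elementwise)  # (1.0, 4.0, 0.0, 0.0)  -- always non-negative
```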
While there is some literature on optimizing complex-valued parameters (a PyTorch GitHub issue, the PyTorch docs, https://arxiv.org/pdf/0906.4835.pdf), the easiest practical solution is to simply optimize the coefficients as a `Vector2f` or `Vector4f` instead.
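A minimal sketch of that workaround (using NumPy as a stand-in for Dr.Jit arrays): store the four quaternion coefficients in a plain 4-vector so that every optimizer operation, including the squared-gradient accumulation, is element-wise.

```python
# Hypothetical sketch: keep the optimized parameter as a flat 4-vector
# (Vector4f-style) and only rebuild the quaternion where its algebra is needed.
import numpy as np

param = np.array([1.0, 0.5, -0.2, 0.1])  # quaternion coefficients, flat storage
grad  = np.array([0.1, -0.3, 0.2, 0.0])  # gradient w.r.t. the coefficients

beta_2 = 0.999
v = np.zeros(4)                           # Adam second-moment state
v = beta_2 * v + (1 - beta_2) * grad**2   # element-wise square, as intended

print(v)  # every entry is non-negative
```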
Maybe Mitsuba should raise an error if a quaternion or complex value is assigned to the optimizer?
Hi @dvicini
I agree with you. My suggestion would be to check whether the type is special (`dr.is_special_v()`) and, if so, raise an error as you said.
I'll leave this open for a few more days before pushing a fix, in case anyone else has some ideas to contribute :)
The same issue would also appear with camera matrices and, e.g., complex IOR values in the next version.
The optimizer will need an extra check like

```python
tp = type(g_p)
if dr.is_special_v(tp):
    tp_a = dr.array_t(tp)
    g_p = tp_a(g_p)  # reinterpret as a plain array of coefficients
...
v_t = self.beta_2 * v_tp + (1 - self.beta_2) * dr.sqr(g_p)
```
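Assuming `dr.is_special_v` and `dr.array_t` behave as described (reporting whether a type carries a special algebra, and yielding its underlying plain array type), the fix amounts to reinterpreting the gradient before squaring. A dependency-free mock of that logic:

```python
# Mock of the proposed check. The real code would use dr.is_special_v and
# dr.array_t; Quat and Vec4 here are hypothetical stand-ins for illustration.
class Vec4(tuple):
    pass

class Quat(tuple):
    special = True  # stand-in flag for "has a special product"

def is_special_v(tp):
    return getattr(tp, "special", False)

def array_t(tp):
    return Vec4  # stand-in: the plain array type underlying Quat

def elementwise_square(g_p):
    tp = type(g_p)
    if is_special_v(tp):
        g_p = array_t(tp)(g_p)  # reinterpret coefficients as a plain array
    return type(g_p)(c * c for c in g_p)

print(elementwise_square(Quat((1.0, 2.0, 0.0, 0.0))))  # (1.0, 4.0, 0.0, 0.0)
```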
We could either wait until the next version or backport the dr.array_t type trait.
This has now been fixed in #1423