
Vectorization can give incorrect results if similar costs depend on values not declared as optim/aux vars

Open · luisenp opened this issue on May 03, 2023 · 4 comments

🐛 Bug

Steps to Reproduce

Here is a simple repro

import theseus as th
import torch
from theseus.core.vectorizer import Vectorize


class BadCost(th.CostFunction):
    def __init__(self, x, flag=False, name=None):
        super().__init__(th.ScaleCostWeight(1.0), name=name)
        self.x = x
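        # NOTE: flag is a plain Python attribute; it is not registered as an
        # optim or aux var, so the vectorizer cannot tell the two costs apart.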
        self.flag = flag
        self.register_optim_vars(["x"])

    def error(self) -> torch.Tensor:
        return (
            torch.ones_like(self.x.tensor)
            if self.flag
            else torch.zeros_like(self.x.tensor)
        )

    def jacobians(self):
        raise NotImplementedError

    def dim(self) -> int:
        return self.x.dof()

    def _copy_impl(self, new_name=None):
        return BadCost(self.x.copy(), flag=self.flag, name=new_name)  # type: ignore


o = th.Objective()
z = th.Vector(1, name="z")
o.add(BadCost(z, flag=True))
o.add(BadCost(z, flag=False))
o.update()  # to resolve batch size for vectorization
Vectorize(o)
print(o.error())

Expected behavior

The above prints tensor([[1., 1.]]), but it should print tensor([[1., 0.]]). Because flag is not registered as an optimization or auxiliary variable, the vectorized evaluation cannot distinguish the two costs and ends up applying a single flag value to both.
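
For comparison, here is a minimal sketch that builds the same two costs but skips the Vectorize wrapper; each cost's error() is then evaluated individually and the result is the expected one.

o2 = th.Objective()
z2 = th.Vector(1, name="z")
o2.add(BadCost(z2, flag=True))
o2.add(BadCost(z2, flag=False))
o2.update()
print(o2.error())  # prints tensor([[1., 0.]])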

luisenp · May 03 '23 21:05

Oh, I see. In the meantime, is there any workaround that makes this kind of flag work correctly? I use a similar flag in many cases, but I hadn't noticed this issue.

xphsean12 · May 25 '23 03:05

For now, the two easiest workarounds, although neither is ideal, are:

  • If you are not time constrained, turn vectorization off when you create the TheseusLayer, or
  • Wrap the flag as an auxiliary variable, so that it's also vectorized. For example,
class NoLongerBadCost(th.CostFunction):
    def __init__(self, x: th.Vector, flag: th.Variable, name=None):
        super().__init__(th.ScaleCostWeight(1.0), name=name)
        self.x = x
        self.flag = flag
        self.register_optim_vars(["x"])
        self.register_aux_vars(["flag"])

    def error(self) -> torch.Tensor:
        return torch.where(
            self.flag.tensor,
            torch.ones_like(self.x.tensor),
            torch.zeros_like(self.x.tensor),
        )
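
As a rough usage sketch, assuming NoLongerBadCost also implements jacobians, dim, and _copy_impl analogously to BadCost above, and that the flag Variable wraps a boolean tensor:

flag_true = th.Variable(torch.ones(1, 1, dtype=torch.bool), name="flag_true")
flag_false = th.Variable(torch.zeros(1, 1, dtype=torch.bool), name="flag_false")

o = th.Objective()
z = th.Vector(1, name="z")
o.add(NoLongerBadCost(z, flag_true, name="cost_true"))
o.add(NoLongerBadCost(z, flag_false, name="cost_false"))
o.update()  # resolve batch size before vectorization
Vectorize(o)
print(o.error())  # should now print tensor([[1., 0.]])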

I'm hoping to add a fix before the end of the half, although, looking at the code above, I'm now realizing there might not be a one-size-fits-all solution, since the correct logic probably depends on each particular cost function. I'll need to think about this.

luisenp · May 25 '23 17:05


Thanks for your reply. Regarding the first point, how do I turn vectorization off?

xphsean12 · Jun 01 '23 03:06

You can do

layer = th.TheseusLayer(optimizer, vectorize=False)

If you are using an optimizer w/o a TheseusLayer (not the recommended way), then it's off by default.
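
For concreteness, a minimal sketch, assuming objective is a th.Objective whose cost functions implement jacobians, and using Gauss-Newton as the example optimizer:

optimizer = th.GaussNewton(objective, max_iterations=10)
layer = th.TheseusLayer(optimizer, vectorize=False)  # each cost's error() runs individually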

luisenp · Jun 01 '23 15:06