
Some fix for invalid broadcasting

Open albertz opened this issue 1 year ago • 5 comments

Also see #1749.

albertz avatar Oct 20 '24 08:10 albertz

It turns out that in the RF native code, the broadcasting check was buggy and always allowed broadcasting: it checked whether `a.dims` is a subset of all dims and `b.dims` is a subset of all dims, which is trivially always true, since the set of all dims is just the union of both.
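A minimal sketch of the flawed logic, with illustrative names (not the actual RETURNN native code):

```python
# Hypothetical sketch of the flawed broadcasting check.
# Names and dim representation are made up for illustration.

def buggy_broadcast_check(a_dims, b_dims):
    all_dims = set(a_dims) | set(b_dims)
    # Vacuously true: each operand's dims are in the union by construction.
    return set(a_dims) <= all_dims and set(b_dims) <= all_dims

def strict_broadcast_check(a_dims, b_dims):
    # A stricter variant: without explicit broadcasting,
    # both operands must cover the same dims.
    return set(a_dims) == set(b_dims)

# The buggy check passes even though the dims differ:
assert buggy_broadcast_check(("batch", "time"), ("batch",))
assert not strict_broadcast_check(("batch", "time"), ("batch",))
```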

This is problematic: there is code out there that would break when we fix this. So if we want to do the broadcasting check properly, we should introduce a new behavior version.

albertz avatar Oct 21 '24 08:10 albertz

And there is another place where the RF native code behaves differently. This time I think it's not really a bug, just less strict. Without the native code, this RF code is not allowed:

```python
a = ...  # some int tensor
b = ...  # some int tensor
c = a / b
```

This gives:

```
ValueError: Dividing a Tensor of type int by an integer is disallowed. Please convert the Tensor to float.
```

But when the RF native code is used, this is allowed.

The question is what to follow here. We could just remove the check in the pure Python RF code; then both are consistent. Otherwise, this also needs a new behavior version.

albertz avatar Oct 21 '24 08:10 albertz

@NeoLegends What do you think?

  • For the broadcast check, I think it's clear (right?): introduce a new behavior version, keep the check disabled for the old behavior version, and enable it only for the new one.
  • For the true-div int-check case, just remove the check?
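The first option could be gated on the behavior version roughly like this (a sketch; the version number and function names are made up for illustration):

```python
# Hypothetical sketch of gating the strict broadcast check on the
# behavior version; the actual RETURNN API and version number differ.
NEW_BEHAVIOR_VERSION = 99  # placeholder, not the real version number

def broadcast_allowed(a_dims, b_dims, behavior_version):
    # Old behavior versions keep the permissive (buggy) behavior,
    # so that existing configs do not break.
    if behavior_version < NEW_BEHAVIOR_VERSION:
        return True
    return set(a_dims) == set(b_dims)

assert broadcast_allowed(("batch", "time"), ("batch",), behavior_version=98)
assert not broadcast_allowed(("batch", "time"), ("batch",), behavior_version=99)
```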

albertz avatar Oct 21 '24 09:10 albertz

> For the broadcast check, I think it's clear (right?): introduce a new behavior version, keep the check disabled for the old behavior version, and enable it only for the new one.

Yes.

> For the true-div int-check case, just remove the check?

Does the truediv result in a floating point tensor? Then I think we can remove the check and just keep "normal" Python semantics on the tensors (i.e. `isinstance(3 / 4, float) == True`).

NeoLegends avatar Oct 21 '24 10:10 NeoLegends

> Does the truediv result in a floating point tensor? Then I think we can remove the check and just keep "normal" Python semantics on the tensors (i.e. `isinstance(3 / 4, float) == True`).

Yes. Or maybe it depends on the backend. For PyTorch it does: e.g. `torch.tensor(1) / torch.tensor(2)` results in `tensor(0.5000)`. Maybe I was just afraid that other backends would behave differently, or not allow this at all, so to be safe, and to keep RF independent of backend-specific behavior, I disallowed this.

In the native RF code, I think it just ignores the RF Tensor dtype. It does the raw combine op (truediv) and then simply adopts the dtype from the resulting raw tensor.
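That dtype-adoption behavior can be sketched like this, using NumPy as the raw backend and a minimal stand-in for the RF Tensor (illustrative only, not the actual RETURNN internals; NumPy, like PyTorch, promotes integer true division to float):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Tensor:
    # Minimal stand-in for the RF Tensor, for illustration only.
    raw: np.ndarray
    dtype: str

def native_truediv(a: Tensor, b: Tensor) -> Tensor:
    # The declared RF dtype is ignored: the raw backend op decides,
    # and the result dtype is adopted from the resulting raw tensor.
    raw = a.raw / b.raw
    return Tensor(raw=raw, dtype=str(raw.dtype))

a = Tensor(np.asarray(1), "int64")
b = Tensor(np.asarray(2), "int64")
c = native_truediv(a, b)
assert c.dtype == "float64"
assert c.raw == 0.5
```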

albertz avatar Oct 21 '24 10:10 albertz