xxxyyyzzz12345

4 comments by xxxyyyzzz12345

In addition, oneflow.MinMaxObserver and oneflow.ones_like have similar problems. By the way, is there any progress on implementing or fixing the gradient functions of these APIs?

Thanks for your reply! But it seems that the derivative for floor_divide does exist when the divisor is a tensor:
```
x = oneflow.tensor([1.,2.,3.]).requires_grad_()
y = oneflow.tensor([1.,1.,1.])
output = oneflow.floor_divide(x,y)
...
```
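For context on what a floor_divide derivative would look like, here is a minimal sketch in plain Python. It does not call oneflow and makes no claim about what oneflow's autograd returns; the helper names `floor_divide` and `central_difference` and the sample points are assumptions chosen for illustration. It only shows that floor(x / y) is piecewise constant in x, so a finite-difference estimate away from the jumps comes out as zero.

```python
import math


def floor_divide(x, y):
    """Scalar floor division, floor(x / y). Hypothetical helper, not the oneflow op."""
    return math.floor(x / y)


def central_difference(f, x, eps=1e-3):
    """Approximate df/dx at x with a symmetric finite difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


divisor = 1.0
# Sample points are deliberately placed between the jumps, which sit at
# integer multiples of the divisor.
points = [1.5, 2.5, 3.5]
grads = [central_difference(lambda v: floor_divide(v, divisor), p) for p in points]
print(grads)
```

Note that the inputs in the comment above (1., 2., 3. divided by 1.) sit exactly on the jumps, where the function is not differentiable in the classical sense, so a framework has to pick a convention there.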

Thanks for your reply! However, the results are still inconsistent if oneflow.sum() is not used.
```
count = 0
input = oneflow.rand(2, 72, 16, dtype=oneflow.float64).cuda()
input_grad = input.clone().requires_grad_(True)
for i in...
```

The same problem exists for oneflow.nn.init.ones_ and oneflow.nn.init.zeros_.