oneflow
[inplace-related] += and clamp_ give results inconsistent with torch when applied to a sliced matrix
As the title says.
Minimal reproduction code:
>>> oneflow.__version__
'0.8.0.dev20220705+cu112'
import torch
import oneflow as flow
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
# BUG: inplace
x_torch[:,:2] += x_torch[:,4:]
x_flow[:,:2] += x_flow[:,4:]
# False
print((x_torch.numpy()==x_flow.numpy()).all())
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
x_torch += x_torch
x_flow += x_flow
# True
print((x_torch.numpy()==x_flow.numpy()).all())
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
x_torch[:,:2] = x_torch[:,:2] + x_torch[:,4:]
x_flow[:,:2] = x_flow[:,:2] + x_flow[:,4:]
# True
print((x_torch.numpy()==x_flow.numpy()).all())
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
x_torch.clamp_(min=0, max=1)
x_flow.clamp_(min=0, max=1)
# True
print((x_torch.numpy()==x_flow.numpy()).all())
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
# BUG: inplace
x_torch[:, :2].clamp_(min=0, max=1)
x_flow[:, :2].clamp_(min=0, max=1)
# False
print((x_torch.numpy()==x_flow.numpy()).all())
x_torch = torch.randn(5,5)
x_flow = flow.tensor(x_torch.numpy())
x_torch[:, :2].clamp(min=0, max=1)
x_flow[:, :2].clamp(min=0, max=1)
# True
print((x_torch.numpy()==x_flow.numpy()).all())
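My initial diagnosis is that the view implementation computes the stride incorrectly: with view disabled, running ONEFLOW_DISABLE_VIEW=1 python test.py gives the correct result.
@small1945 You can read Oneflow-Inc/OneTeam#472 to learn what the view mechanism is; it will help you investigate this problem.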
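For readers unfamiliar with views, here is a minimal pure-Python sketch of why x[:, 4:] and x[:1, :] behave differently. The View2D class and its methods are hypothetical names made up for illustration; this is not OneFlow's or PyTorch's implementation, and the contiguity check is simplified to the 2-D row-major case.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class View2D:
    """Hypothetical 2-D strided view over a flat buffer (illustration only)."""
    buffer: List[int]            # storage shared with the base tensor
    shape: Tuple[int, int]
    strides: Tuple[int, int]     # steps measured in elements, not bytes
    offset: int = 0

    def index(self, i: int, j: int) -> int:
        # Flat buffer position of logical element (i, j)
        return self.offset + i * self.strides[0] + j * self.strides[1]

    def slice_rows(self, start: int, stop: int) -> "View2D":
        # Row slice: same buffer and strides, offset shifted by whole rows
        return View2D(self.buffer, (stop - start, self.shape[1]),
                      self.strides, self.offset + start * self.strides[0])

    def slice_cols(self, start: int, stop: int) -> "View2D":
        # Column slice: same buffer and strides, offset shifted within a row
        return View2D(self.buffer, (self.shape[0], stop - start),
                      self.strides, self.offset + start * self.strides[1])

    def is_contiguous(self) -> bool:
        # Simplified row-major check: strides must equal (ncols, 1)
        return self.strides == (self.shape[1], 1)


base = View2D(list(range(25)), (5, 5), (5, 1))
first_row = base.slice_rows(0, 1)   # analogous to x[:1, :]
last_col = base.slice_cols(4, 5)    # analogous to x[:, 4:]

print(first_row.is_contiguous())                 # row slice keeps a dense layout
print(last_col.is_contiguous())                  # column slice does not
print([last_col.index(i, 0) for i in range(5)])  # buffer positions of the column
```

The column view shares the base buffer but touches every fifth element, so any kernel that writes to it must honor the strides.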
Got it.
x_torch = torch.arange(25).reshape(5, 5)
y_torch = torch.ones(5).reshape(1, 5).long()
print(y_torch)
x_flow = flow.tensor(x_torch.numpy())
y_flow = flow.tensor(y_torch.numpy())
x_torch[:1, :] -= y_torch
x_flow[:1, :] -= y_flow
# True
print((x_torch.numpy() == x_flow.numpy()).all())
Slicing the tensor along the first dimension works fine.
x_torch = torch.arange(25).reshape(5, 5)
# Shape (5, 1) matches the column slice x[:, :1]; an in-place op cannot broadcast its output
y_torch = torch.ones(5).reshape(5, 1).long()
print(y_torch)
x_flow = flow.tensor(x_torch.numpy())
y_flow = flow.tensor(y_torch.numpy())
x_torch[:, :1] -= y_torch
x_flow[:, :1] -= y_flow.contiguous()
# False
print((x_torch.numpy() == x_flow.numpy()).all())
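After slicing along the second dimension, an in-place operation with a tensor that has gone through contiguous() still gives the wrong result.
Based on the tests above, I infer that the slice and slice_update operators are not the problem for now.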
x_torch = torch.arange(25).reshape(5, 5)
y_torch = torch.ones(5).reshape(5, 1).long()
x_flow = flow.tensor(x_torch.numpy())
y_flow = flow.tensor(y_torch.numpy())
x_torch[:, 4:] += y_torch
# x_flow[:, 4:] += y_flow.contiguous()  # replaced by the direct flow.add call below, so the add is applied only once
flow.add(x_flow[:, 4:], y_flow.contiguous(), inplace=True)
print(x_torch)
print(x_flow)
print((x_torch.numpy() == x_flow.numpy()).all())
# False
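- Calling flow.add(x_flow[:, 4:], y_flow.contiguous(), inplace=True) directly still gives an inconsistent result, which rules out the slice_update operator.
- After investigation, the result is already wrong right after the in-place add, so the cause is that the add operator does not support non-contiguous tensors.
- Besides add, other basic operators such as mul and sub also do not support non-contiguous tensors.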
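To make the diagnosis concrete, here is a minimal pure-Python sketch of one way an in-place add can go wrong on a non-contiguous view. Both functions are hypothetical and written only for illustration; the thread does not describe how OneFlow's kernel actually fails, so the "assume contiguous" behavior below is an assumption, not the real implementation.

```python
from typing import List, Tuple


def add_inplace_assuming_contiguous(buf: List[int], offset: int,
                                    shape: Tuple[int, int],
                                    other: List[int]) -> None:
    """Hypothetical buggy kernel: ignores strides and walks memory linearly."""
    rows, cols = shape
    for k in range(rows * cols):
        buf[offset + k] += other[k]


def add_inplace_strided(buf: List[int], offset: int,
                        shape: Tuple[int, int], strides: Tuple[int, int],
                        other: List[int]) -> None:
    """Stride-aware kernel: maps each logical (i, j) to its buffer position."""
    rows, cols = shape
    for i in range(rows):
        for j in range(cols):
            buf[offset + i * strides[0] + j * strides[1]] += other[i * cols + j]


# View for x[:, 4:] on a 5x5 row-major tensor: shape (5, 1), strides (5, 1), offset 4
shape, strides, offset = (5, 1), (5, 1), 4
ones = [1] * 5   # contiguous right-hand side, like y_flow.contiguous()

wrong = list(range(25))
add_inplace_assuming_contiguous(wrong, offset, shape, ones)

right = list(range(25))
add_inplace_strided(right, offset, shape, strides, ones)

print(wrong == right)   # the two kernels disagree on the column view
print(right[4::5])      # last column after the stride-aware add
```

For a row slice such as x[:1, :] the view is contiguous, so the linear walk and the strided walk visit the same elements. That would be consistent with the first-dimension test passing while the column tests fail.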
This has been fixed in #8867.