[bug] MaxUnpool2d with invalid indices (-1) crashes process instead of raising Python error
Summary
When nn.MaxUnpool2d is called with invalid indices (e.g., all values set to -1), OneFlow does not raise a catchable Python exception. Instead, the process aborts on a C++ CHECK failure (CHECK_OR_THROW in oneflow/user/kernels/max_unpool_kernel.cpp).
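For reference, the failing condition reported in the output is `idx >= 0 && idx < out_elem_num`. A minimal sketch of the same bounds check done on the Python side is shown here, so that bad indices produce an ordinary `ValueError` rather than an abort. The helper names `max_unpool_output_elems` and `validate_unpool_indices` are hypothetical and not part of OneFlow. The output-size formula is an assumption (the usual MaxUnpool2d shape rule with stride defaulting to kernel_size), though it does give 16 for the 2x2 input used in this report, matching the logged output volume size.

```python
def max_unpool_output_elems(in_h, in_w, kernel_size, stride=None, padding=0):
    """Elements per (N, C) output plane.

    Assumption: out = (in - 1) * stride - 2 * padding + kernel_size,
    with stride defaulting to kernel_size.
    """
    stride = kernel_size if stride is None else stride
    out_h = (in_h - 1) * stride - 2 * padding + kernel_size
    out_w = (in_w - 1) * stride - 2 * padding + kernel_size
    return out_h * out_w


def validate_unpool_indices(flat_indices, out_elem_num):
    """Raise ValueError if any index falls outside [0, out_elem_num)."""
    for pos, idx in enumerate(flat_indices):
        if not (0 <= idx < out_elem_num):
            raise ValueError(
                f"invalid max index {idx} at flat position {pos}; "
                f"expected 0 <= idx < {out_elem_num}"
            )


# Hypothetical helper applied to the shapes from this report: 2x2 input, kernel_size=2.
out_elem_num = max_unpool_output_elems(2, 2, kernel_size=2)

validate_unpool_indices([0, 3, 12, 15], out_elem_num)  # in range, no error

try:
    validate_unpool_indices([-1, -1, -1, -1], out_elem_num)
except ValueError as err:
    print(err)
```

In the reproducer, one possible use would be to call `validate_unpool_indices(bad_idx.numpy().ravel().tolist(), out_elem_num)` before `m(x, bad_idx)`. This is only a user-side guard; the kernel would still need to surface the error as a Python exception.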
Code to reproduce bug
import oneflow as flow
import oneflow.nn as nn
import numpy as np

device = "cpu"
flow.manual_seed(0)
np.random.seed(0)

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.unpool = nn.MaxUnpool2d(kernel_size=2)

    def forward(self, x, indices):
        return self.unpool(x, indices)

def main():
    m = M().to(device)
    x = flow.tensor(np.random.rand(1, 1, 2, 2), dtype=flow.float32, device=device)
    # Invalid indices: all set to -1
    bad_idx = flow.full_like(x.to(flow.int64), -1)
    print("about to call unpool with invalid indices = -1")
    y = m(x, bad_idx)
    # Force sync to trigger backend error
    print("forcing sync via .numpy() ...")
    _ = y.numpy()

if __name__ == "__main__":
    main()
Output
forcing sync via .numpy() ...
terminate called after throwing an instance of 'oneflow::Exception'
what(): Check failed: (idx >= 0 && idx < out_elem_num) Found an invalid max index: -1, output volumes are of size 16
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in vm::ThreadCtx::TryReceiveAndRun()
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in vm::Instruction::Compute()
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in vm::OpCallInstructionUtil::Compute(vm::OpCallInstructionPolicy*, vm::Stream*, bool, bool)
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in StatefulOpKernel::Compute(eager::CallContext*, ep::Stream*, user_op::OpKernel const*, user_op::OpKernelState*, user_op::OpKernelCache const*) const
File "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", line <unknown>, in MaxUnpoolNdKernel<(DeviceType)1, float>::Compute(user_op::KernelComputeContext*) const
File "oneflow/user/kernels/max_unpool_kernel.cpp", line 32, in MaxUnpoolNdForwardOrBackward
CHECK_OR_THROW(idx >= 0 && idx < out_elem_num)
Error Type: oneflow.ErrorProto.check_failed_error
Stack trace (most recent call last) in thread 1921972:
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803359c17, in
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a80335942c, in
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803354ca8, in vm::ThreadCtx::TryReceiveAndRun()
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a8032f7394, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a8032fa777, in vm::Instruction::Compute()
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a8032ffb38, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a8032ff2ec, in
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a80330419f, in vm::OpCallInstructionUtil::Compute(vm::OpCallInstructionPolicy*, vm::Stream*, bool, bool)
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803fa2ca9, in StatefulOpKernel::Compute(eager::CallContext*, ep::Stream*, user_op::OpKernel const*, user_op::OpKernelState*, user_op::OpKernelCache const*) const
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803aa1adb, in MaxUnpoolNdKernel<(DeviceType)1, float>::Compute(user_op::KernelComputeContext*) const
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803a98637, in
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a803a9541b, in
Object "<pytorch_source>/oneflow/lib/python3.10/site-packages/oneflow/../oneflow.libs/liboneflow-b64d744b.so", at 0x75a7ffaa277d, in
Aborted (Signal sent by tkill() 1921507 1002)
Aborted (core dumped)
System Information
- OS: Ubuntu 22.04.4 LTS (x86_64)
- OneFlow version: 1.0.0.dev20250921+cpu
- Python version: 3.10.16