
Aborted (core dumped) in `flow.nn.functional.affine_grid` / `flow.nn.functional.dropout`

Open x0w3n opened this issue 1 year ago • 0 comments

Summary

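`flow.nn.functional.affine_grid` and `flow.nn.functional.dropout` abort the whole process (core dump) instead of raising a Python exception when the shapes of their input tensors do not match what the op expects.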

Code to reproduce bug

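import oneflow as flow

# theta is 2-D with shape (2, 3): it has no leading batch dimension.
theta = flow.tensor([[2.0, 0.0, 4.0], [0.0, 2.0, 5.0]], dtype=flow.float)
# size has only 3 elements.
size = [3, 3, 3]
align_corners = True
# Aborts the process instead of raising a Python exception.
grid = flow.nn.functional.affine_grid(theta=theta, size=size, align_corners=align_corners)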

output:

F20241205 09:03:12.730742 2436861 shape.cpp:30] Check failed: index < tp()->NumAxes() (2 vs. 2)  Shape: (2,3) visit index: 2 > num_axes: 2
*** Check failure stack trace: ***
    @     0x7f57ec3d09ca  google::LogMessage::Fail()
    @     0x7f57ec3d0cb2  google::LogMessage::SendToLog()
    @     0x7f57ec3d0537  google::LogMessage::Flush()
    @     0x7f57ec3d30a9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f57e1ccba57  oneflow::ConstShapeMixIn<>::At()
    @     0x7f57e8440835  oneflow::AffineGridOp::InferPhysicalTensorDesc()
    @     0x7f58cb7058cc  std::_Function_handler<>::_M_invoke()
    @     0x7f57e525e799  oneflow::one::UserOpExpr::InferPhysicalTensorDesc()
    @     0x7f57e520d3a1  oneflow::one::LocalTensorInferCache::Infer()
    @     0x7f57e520f5bb  oneflow::one::LocalTensorInferCache::GetOrInfer()
    @     0x7f57e527ecaf  oneflow::one::NaiveInterpret()
    @     0x7f57e52833f5  oneflow::one::EagerLocalInterpreter::ApplyImpl()
    @     0x7f57e52b79c7  oneflow::one::EagerInterpreter::Apply()
    @     0x7f57e52b857b  oneflow::one::AutogradInterpreter::Apply()
    @     0x7f57e52bb708  oneflow::one::OpInterpUtil::Dispatch()
    @     0x7f57e52bda48  oneflow::one::OpInterpUtil::Dispatch<>()
    @     0x7f57e52bdd1e  oneflow::one::OpInterpUtil::Dispatch<>()
    @     0x7f58cb9ea85a  oneflow::one::OpInterpUtil::Dispatch<>()
    @     0x7f57e57a65a5  _ZNSt17_Function_handlerIFN7oneflow5MaybeINS0_3one6TensorEvEERKSt10shared_ptrIS3_ERKNS0_5ShapeERKbEZNS2_10functional18PackedFunctorMakerISE_E4makeINSF_4impl17AffineGridFunctorELi0EEENSF_13PackedFunctorISE_EERKSsRKT_EUlS8_SB_SD_E_E9_M_invokeERKSt9_Any_dataS8_SB_SD_
    @     0x7f57e887d0dd  oneflow::one::functional::AffineGrid()
    @     0x7f58cb894e99  oneflow::one::functional::affine_grid()
    @           0x507397  cfunction_call
    @           0x4f065c  _PyObject_MakeTpCall
    @           0x4eccff  _PyEval_EvalFrameDefault
    @           0x4e69da  _PyEval_EvalCode
    @           0x4f7de4  _PyFunction_Vectorcall
    @           0x4e8b15  _PyEval_EvalFrameDefault
    @           0x4e69da  _PyEval_EvalCode
    @           0x4e6667  _PyEval_EvalCodeWithName
    @           0x4e6619  PyEval_EvalCodeEx
    @           0x5938eb  PyEval_EvalCode
    @           0x5c1157  run_eval_code_obj
Aborted (core dumped)
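
The failed check in the trace above is an out-of-range axis access on `theta` (shape `(2, 3)`, index 2). Until the op validates its inputs itself, a caller could guard the call on the Python side. The following is a minimal sketch, assuming `affine_grid` follows the usual convention of a `theta` of shape `(N, 2, 3)` with a 4-element `size`, or `(N, 3, 4)` with a 5-element `size`. `check_affine_grid_args` is a hypothetical helper, not part of OneFlow.

```python
def check_affine_grid_args(theta_shape, size):
    """Raise ValueError for theta/size combinations affine_grid cannot use.

    Hypothetical helper, not a OneFlow API. Assumed convention:
      size (N, C, H, W)    -> theta (N, 2, 3)
      size (N, C, D, H, W) -> theta (N, 3, 4)
    """
    theta_shape = tuple(theta_shape)
    size = tuple(size)

    # Pick the expected theta shape from the number of size elements.
    if len(size) == 4:
        expected = (size[0], 2, 3)
    elif len(size) == 5:
        expected = (size[0], 3, 4)
    else:
        raise ValueError(
            f"size must have 4 or 5 elements, got {len(size)}: {size}"
        )

    if theta_shape != expected:
        raise ValueError(
            f"theta must have shape {expected}, got {theta_shape}"
        )
```

Called with the shapes from the reproduction above, `check_affine_grid_args((2, 3), [3, 3, 3])` raises a `ValueError` because `size` has only 3 elements, so the error surfaces as a normal Python exception rather than a process abort.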
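
import oneflow as flow

# x has shape (1, 2, 1, 3).
x = flow.tensor([[[[1, 2, 3]], [[4, 5, 6]]]], dtype=flow.float32)
# addend has shape (1, 1, 2), which does not match x.
addend = flow.tensor([[[1, 2]]], dtype=flow.float32)
p = 10
# Aborts the process instead of raising a Python exception.
y = flow.nn.functional.dropout(x, p=p, training=True, addend=addend)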

output:

F20241205 09:05:44.117789 2438107 dropout_kernel.cpp:83] Check failed: add_to_output->shape_view() == out->shape_view() ((1,1,2) vs. (1,2,1,3)) 
*** Check failure stack trace: ***
    @     0x7faf9ddd09ca  google::LogMessage::Fail()
    @     0x7faf9ddd0cb2  google::LogMessage::SendToLog()
    @     0x7faf9ddd0537  google::LogMessage::Flush()
    @     0x7faf9ddd30a9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7faf985088e8  oneflow::(anonymous namespace)::DropoutKernelCPU<>::Compute()
    @     0x7faf99b4e536  oneflow::one::StatefulOpKernel::Compute()
    @     0x7faf97de8cab  oneflow::vm::OpCallInstructionUtil::Compute()
    @     0x7faf97de6787  oneflow::vm::OpCallInstructionPolicy::Compute()
    @     0x7faf97de25bc  oneflow::vm::Instruction::Compute()
    @     0x7faf97de0a6f  oneflow::vm::EpStreamPolicyBase::Run()
    @     0x7faf97dec086  oneflow::vm::StreamPolicy::RunIf()
    @     0x7faf97df36de  oneflow::vm::ThreadCtx::TryReceiveAndRun()
    @     0x7faf97df5d2d  oneflow::(anonymous namespace)::WorkerLoop()
    @     0x7faf97df611f  _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJPFvPN7oneflow2vm9ThreadCtxERKSt8functionIFvS6_EEES6_ZNS3_14VirtualMachine15CreateThreadCtxENS3_6SymbolINS3_6DeviceEEENS3_10StreamTypeEmEUlS6_E3_EEEEE6_M_runEv
    @     0x7faf9dde540f  execute_native_thread_routine
    @     0x7fb08587cb43  (unknown)
    @     0x7fb08590ea00  (unknown)
Aborted (core dumped)
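
The kernel check that fails here requires `addend` to have exactly the shape of the output, `(1, 2, 1, 3)` in this reproduction, which is also the shape of `x`. A caller-side guard could look like the sketch below. `check_dropout_addend` is a hypothetical helper, not part of OneFlow, and it assumes the output of `dropout` has the same shape as its input.

```python
def check_dropout_addend(x_shape, addend_shape):
    """Raise ValueError unless addend has exactly the shape of x.

    Hypothetical helper, not a OneFlow API. Assumes dropout's output
    shape equals the shape of its input x.
    """
    x_shape = tuple(x_shape)
    addend_shape = tuple(addend_shape)
    if addend_shape != x_shape:
        raise ValueError(
            f"addend shape {addend_shape} must equal input shape {x_shape}"
        )
```

With the shapes from the reproduction, `check_dropout_addend((1, 2, 1, 3), (1, 1, 2))` raises a `ValueError` before the kernel is ever launched.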

System Information

  • What is your OneFlow installation (pip, source, dockerhub): pip
  • OS: Ubuntu 22.04.3 LTS
  • OneFlow version (run python3 -m oneflow --doctor):
path: ['/home/miniconda3/envs/oneflow/lib/python3.9/site-packages/oneflow']
version: 0.9.0
git_commit: 381b12c
cmake_build_type: Release
rdma: True
mlir: True
  • Python version: 3.9.13
  • CUDA driver version: 12.2
  • GPU models: NVIDIA GeForce RTX 4090
  • Other info: None

x0w3n, Dec 05 '24 01:12