TOOD
Error during training (Assertion input_val >= zero && input_val <= one failed.)
Problem
Thank you for your contribution. I encountered exploding gradients while training the tood_r50_fpn_1x_coco model.
- I tried training this model with mixed-precision training, with the loss scale set to 'dynamic'. Training stopped soon after starting and raised RuntimeError: CUDA error: device-side assert triggered.
- I also retrained the model in FP32 precision, but that did not help.
- A lower learning rate did not fix the exploding gradients.
- Gradient clipping (with mixed-precision training, loss scale=512.) avoids the training failure, but the model does not converge.
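For readers unfamiliar with these options, the settings mentioned in the list above are typically written roughly as follows in an MMDetection 2.x config file. This is only a sketch, not the exact config used in this report: the `max_norm` and `norm_type` values are assumed placeholders.

```python
# Mixed-precision training with a dynamic loss scale (the first attempt above).
fp16 = dict(loss_scale='dynamic')

# Variant with a fixed loss scale, as in the gradient clipping attempt:
# fp16 = dict(loss_scale=512.)

# Gradient clipping by global norm.
# max_norm and norm_type are assumed values, not taken from this report.
optimizer_config = dict(
    grad_clip=dict(max_norm=35, norm_type=2))
```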
I searched for this issue online. I do not think it is an OOM problem. It seems to be related to NaN values in the prediction head, which then cause the error when the loss is calculated. I do not know whether the environment (MMDet-2.15.0) affects training.
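To illustrate why a NaN in the predictions could produce this error, here is a small pure-Python sketch that mirrors only the comparison in the assertion message (`input_val >= zero && input_val <= one`). It is not the PyTorch kernel itself, and the helper names and sample values are hypothetical. Any comparison against NaN is false, so a NaN fails the check just as an out-of-range value does.

```python
import math


def passes_range_check(p: float) -> bool:
    """Mirror the asserted condition: 0 <= p <= 1."""
    return p >= 0.0 and p <= 1.0


def find_invalid(probs):
    """Return (index, reason) for every value that would fail the check."""
    invalid = []
    for i, p in enumerate(probs):
        if not passes_range_check(p):
            reason = "nan" if math.isnan(p) else "out of range"
            invalid.append((i, reason))
    return invalid


# Hypothetical predicted probabilities, for illustration only.
probs = [0.2, float("nan"), 1.0, 1.5, -0.1, float("inf")]
bad = find_invalid(probs)
print(bad)
```

Running a check like this on the tensor passed to the loss, just before the loss call, would show whether the failing values are NaN or merely outside [0, 1].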
My modification
- I ported the TOOD code to my working environment (MMDet-2.15.0) without modification.
- I edited the training config to train on my own dataset.
Environment
2021-12-09 16:50:01,643 - mmdet - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 2070
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.4.r11.4/compiler.30033411_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2021.3-Product Build 20210617 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.0.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0
OpenCV: 4.5.3
MMCV: 1.3.10
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.15.0+87eda06
------------------------------------------------------------
Error Report
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [32,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [33,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [34,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [35,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [36,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [37,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [38,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [39,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [40,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [41,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [42,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [43,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [44,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [45,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [46,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [47,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [48,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [49,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [50,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [51,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [52,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [53,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [54,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [55,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [56,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [57,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [58,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [59,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [60,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [61,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [62,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [63,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [32,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [33,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [34,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [35,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [36,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [37,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [38,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [39,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [40,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [41,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [42,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [43,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [44,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [45,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [46,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [47,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [48,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [49,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [50,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [51,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [52,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [53,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [54,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [55,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [56,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [57,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [58,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [59,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [60,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [61,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [62,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [63,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
Traceback (most recent call last):
File "tools/train.py", line 188, in <module>
main()
File "tools/train.py", line 184, in main
meta=meta)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/apis/train.py", line 170, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 97, in new_func
return old_func(*args, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/base.py", line 171, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/single_stage.py", line 83, in forward_train
gt_labels, gt_bboxes_ignore)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/base_dense_head.py", line 54, in forward_train
losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 185, in new_func
return old_func(*args, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/tood_head.py", line 426, in loss
num_total_samples=num_total_samples)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/core/utils/misc.py", line 29, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/tood_head.py", line 333, in loss_single
& (labels < bg_class_ind)).nonzero().squeeze(1)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1623448265233/work/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f12c21efa22 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x10ac3 (0x7f12c2451ac3 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f12c2453167 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f12c21d95a4 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0xa2bb12 (0x7f133bad0b12 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xa2bbb1 (0x7f133bad0bb1 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #24: __libc_start_main + 0xe7 (0x7f1376d75bf7 in /lib/x86_64-linux-gnu/libc.so.6)
Aborted
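A note on reading the log above: the assertion `input_val >= zero && input_val <= one` requires every value passed to the loss kernel to lie in [0, 1]. A NaN also fails it, because any comparison with NaN is false, so NaN predictions (as suspected in the problem description) would trigger exactly this message. The snippet below is a minimal plain-Python sketch of the same check; `find_invalid_probs` is a hypothetical helper name, not part of mmdet or PyTorch, and in a real run the check would be applied to the prediction tensor just before the loss call.

```python
def find_invalid_probs(values):
    """Return the indices of entries that are NaN or lie outside [0, 1].

    Hypothetical debugging helper: it mirrors the condition in the
    device-side assertion using plain Python floats.
    """
    # `0.0 <= v <= 1.0` evaluates to False for NaN, so NaN entries are
    # reported together with out-of-range values.
    return [i for i, v in enumerate(values) if not (0.0 <= v <= 1.0)]


if __name__ == "__main__":
    scores = [0.0, 0.5, 1.0, float("nan"), 1.5]
    print(find_invalid_probs(scores))
```

Because CUDA errors are reported asynchronously, the Python frame shown in the traceback (`tood_head.py`, line 333) is not necessarily where the bad values were produced. Rerunning with `CUDA_LAUNCH_BLOCKING=1`, as the error message itself suggests, should make the traceback point at the failing call.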
Same issue
Same issue. My mmdet version is 2.19.0, and the error is raised during the 3rd epoch of training.
You can try to clamp the value of the box area when computing GIoU loss, e.g., https://github.com/fcjian/TOOD/blob/93b3a87556e361f7d56507bd56943cf121c3caa2/mmdet/core/bbox/iou_calculators/iou2d_calculator.py#L212-L215
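To illustrate the idea of clamping the box areas, here is a minimal plain-Python sketch of GIoU for a single pair of boxes. It is not the code at the linked lines: the function name `giou`, the `(x1, y1, x2, y2)` box layout, and the `eps` value are assumptions made for this example. The point is only that clamping widths, heights, and the two denominators keeps the result finite for degenerate or malformed boxes.

```python
def giou(box_a, box_b, eps=1e-6):
    """GIoU of two (x1, y1, x2, y2) boxes with clamped areas (illustrative sketch)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Clamp widths and heights at 0 so a malformed box (x2 < x1 or y2 < y1)
    # cannot contribute a spurious area.
    area_a = max(ax2 - ax1, 0.0) * max(ay2 - ay1, 0.0)
    area_b = max(bx2 - bx1, 0.0) * max(by2 - by1, 0.0)

    # Intersection, clamped at 0 for non-overlapping boxes.
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h

    # Clamp both denominators at eps so zero-area boxes do not divide by zero.
    union = max(area_a + area_b - inter, eps)
    enclose_w = max(max(ax2, bx2) - min(ax1, bx1), 0.0)
    enclose_h = max(max(ay2, by2) - min(ay1, by1), 0.0)
    enclose = max(enclose_w * enclose_h, eps)

    iou = inter / union
    return iou - (enclose - union) / enclose


if __name__ == "__main__":
    print(giou((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))  # identical boxes
    print(giou((1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)))  # zero-area boxes
```

The GIoU loss is then `1 - giou(...)`. Without the `eps` clamps, the zero-area case would divide 0 by 0 and produce a NaN that can propagate into the rest of the loss.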
Hello, I clamped the value of the box area as you showed, but training still crashes at the 5th epoch. My mmdet version is 2.14.0+d3e713d.
Error Report:
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [26,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [27,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [28,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [29,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [30,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [508,0,0], thread: [31,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [32,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [33,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [34,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [35,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [36,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [37,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [38,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [39,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [40,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [41,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [42,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [43,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [44,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [45,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [46,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [47,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [48,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [49,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [50,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [51,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [52,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [53,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [54,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [55,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [56,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [57,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [58,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [59,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [60,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [61,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [62,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [238,0,0], thread: [63,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [0,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [1,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [2,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [3,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [4,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [5,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [6,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [7,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [8,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [9,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [10,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [11,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [12,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [13,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [14,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [15,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [16,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [17,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [18,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [19,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [20,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [21,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [22,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [23,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [24,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [25,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [26,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [27,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [28,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [29,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [30,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [328,0,0], thread: [31,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [32,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [33,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [34,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [35,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [36,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [37,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [38,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [39,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [40,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [41,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [42,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [43,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [44,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [45,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [46,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [47,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [48,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [49,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [50,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [51,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [52,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [53,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [54,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [55,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [56,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [57,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [58,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [59,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [60,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [61,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [62,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [418,0,0], thread: [63,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [0,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [1,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [2,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [3,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [4,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [5,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [6,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [7,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [8,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [9,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [10,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [11,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [12,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [13,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [14,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [15,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [16,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [17,0,0] Assertion input_val >= zero && input_val <= one
failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [18,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [19,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [20,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [21,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [22,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [23,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [24,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [25,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [26,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [27,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [28,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [29,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [30,0,0] Assertion input_val >= zero && input_val <= one failed.
/opt/conda/conda-bld/pytorch_1614378098133/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [568,0,0], thread: [31,0,0] Assertion input_val >= zero && input_val <= one failed.
Traceback (most recent call last):
File "./tools/train.py", line 188, in
Killing subprocess 19911
Traceback (most recent call last):
File "/root/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/root/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in
Thank you for your reply.
@fcjian Thanks for the reply! It solves the CUDA error, but the model still does not converge. During training I saw a problem similar to the one I had with gradient clipping: the log shows a sudden increase in the loss, and after that the loss only fluctuates within a narrow range. I'll try again with the original TOOD code, without porting it to a higher mmdet version.
2021-12-29 09:32:52,217 - mmdet - INFO - Epoch [1][600/1162] lr: 2.000e-03, eta: 8:39:18, time: 0.544, data_time: 0.013, memory: 5142, loss_cls: 0.6940, loss_bbox: 1.2061, loss: 1.9001
2021-12-29 09:33:18,832 - mmdet - INFO - Epoch [1][650/1162] lr: 2.000e-03, eta: 8:38:09, time: 0.532, data_time: 0.013, memory: 5142, loss_cls: 0.6794, loss_bbox: 1.1886, loss: 1.8680
2021-12-29 09:33:45,535 - mmdet - INFO - Epoch [1][700/1162] lr: 2.000e-03, eta: 8:37:13, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 0.6674, loss_bbox: 1.0485, loss: 1.7159
2021-12-29 09:34:12,217 - mmdet - INFO - Epoch [1][750/1162] lr: 2.000e-03, eta: 8:36:19, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 0.6646, loss_bbox: 1.0119, loss: 1.6765
2021-12-29 09:34:38,781 - mmdet - INFO - Epoch [1][800/1162] lr: 2.000e-03, eta: 8:35:20, time: 0.531, data_time: 0.013, memory: 5142, loss_cls: 0.6487, loss_bbox: 0.9564, loss: 1.6051
2021-12-29 09:35:05,190 - mmdet - INFO - Epoch [1][850/1162] lr: 2.000e-03, eta: 8:34:14, time: 0.528, data_time: 0.013, memory: 5142, loss_cls: 0.6176, loss_bbox: 0.8406, loss: 1.4582
2021-12-29 09:35:31,799 - mmdet - INFO - Epoch [1][900/1162] lr: 2.000e-03, eta: 8:33:26, time: 0.532, data_time: 0.013, memory: 5142, loss_cls: 0.6210, loss_bbox: 0.9229, loss: 1.5439
2021-12-29 09:35:58,144 - mmdet - INFO - Epoch [1][950/1162] lr: 2.000e-03, eta: 8:32:24, time: 0.527, data_time: 0.013, memory: 5142, loss_cls: 1.1693, loss_bbox: 1.1850, loss: 2.3543
2021-12-29 09:36:25,339 - mmdet - INFO - Exp name: tood_r50_fpn_on_input_1x_coco_cloth.py
2021-12-29 09:36:25,340 - mmdet - INFO - Epoch [1][1000/1162] lr: 2.000e-03, eta: 8:32:14, time: 0.544, data_time: 0.013, memory: 5142, loss_cls: 1.2817, loss_bbox: 1.3174, loss: 2.5991
2021-12-29 09:36:52,114 - mmdet - INFO - Epoch [1][1050/1162] lr: 2.000e-03, eta: 8:31:39, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2358, loss_bbox: 1.2847, loss: 2.5205
2021-12-29 09:37:18,908 - mmdet - INFO - Epoch [1][1100/1162] lr: 2.000e-03, eta: 8:31:07, time: 0.536, data_time: 0.013, memory: 5142, loss_cls: 1.2365, loss_bbox: 1.3173, loss: 2.5538
2021-12-29 09:37:45,867 - mmdet - INFO - Epoch [1][1150/1162] lr: 2.000e-03, eta: 8:30:43, time: 0.539, data_time: 0.013, memory: 5142, loss_cls: 1.2022, loss_bbox: 1.2296, loss: 2.4319
2021-12-29 09:37:52,329 - mmdet - INFO - Saving checkpoint at 1 epochs
2021-12-29 09:38:47,804 - mmdet - INFO - Evaluating bbox...
2021-12-29 09:38:51,494 - mmdet - INFO - Exp name: tood_r50_fpn_on_input_1x_coco_cloth.py
2021-12-29 09:38:51,495 - mmdet - INFO - Epoch(val) [1][793] bbox_mAP: 0.0170, bbox_mAP_50: 0.0560, bbox_mAP_75: 0.0090, bbox_mAP_s: -1.0000, bbox_mAP_m: 0.0240, bbox_mAP_l: 0.0190, bbox_mAP_copypaste: 0.017 0.056 0.009 -1.000 0.024 0.019
2021-12-29 09:39:21,128 - mmdet - INFO - Epoch [2][50/1162] lr: 2.000e-03, eta: 8:27:14, time: 0.592, data_time: 0.062, memory: 5142, loss_cls: 1.2236, loss_bbox: 1.2423, loss: 2.4659
2021-12-29 09:39:47,839 - mmdet - INFO - Epoch [2][100/1162] lr: 2.000e-03, eta: 8:26:45, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2410, loss_bbox: 1.2517, loss: 2.4927
2021-12-29 09:40:14,530 - mmdet - INFO - Epoch [2][150/1162] lr: 2.000e-03, eta: 8:26:16, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2827, loss_bbox: 1.2900, loss: 2.5726
2021-12-29 09:40:41,392 - mmdet - INFO - Epoch [2][200/1162] lr: 2.000e-03, eta: 8:25:54, time: 0.537, data_time: 0.013, memory: 5142, loss_cls: 1.2351, loss_bbox: 1.2374, loss: 2.4725
2021-12-29 09:41:08,168 - mmdet - INFO - Epoch [2][250/1162] lr: 2.000e-03, eta: 8:25:28, time: 0.536, data_time: 0.013, memory: 5142, loss_cls: 1.1736, loss_bbox: 1.1955, loss: 2.3691
2021-12-29 09:41:34,806 - mmdet - INFO - Epoch [2][300/1162] lr: 2.000e-03, eta: 8:24:57, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2357, loss_bbox: 1.2372, loss: 2.4729
2021-12-29 09:42:01,528 - mmdet - INFO - Epoch [2][350/1162] lr: 2.000e-03, eta: 8:24:29, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2839, loss_bbox: 1.2587, loss: 2.5425
2021-12-29 09:42:28,154 - mmdet - INFO - Epoch [2][400/1162] lr: 2.000e-03, eta: 8:23:58, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2595, loss_bbox: 1.2359, loss: 2.4954
2021-12-29 09:42:54,986 - mmdet - INFO - Epoch [2][450/1162] lr: 2.000e-03, eta: 8:23:35, time: 0.537, data_time: 0.013, memory: 5142, loss_cls: 1.2725, loss_bbox: 1.3049, loss: 2.5773
2021-12-29 09:43:21,637 - mmdet - INFO - Epoch [2][500/1162] lr: 2.000e-03, eta: 8:23:05, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2867, loss_bbox: 1.2862, loss: 2.5730
2021-12-29 09:43:48,377 - mmdet - INFO - Epoch [2][550/1162] lr: 2.000e-03, eta: 8:22:38, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2554, loss_bbox: 1.2227, loss: 2.4781
2021-12-29 09:44:15,013 - mmdet - INFO - Epoch [2][600/1162] lr: 2.000e-03, eta: 8:22:08, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2519, loss_bbox: 1.2955, loss: 2.5474
2021-12-29 09:44:42,014 - mmdet - INFO - Epoch [2][650/1162] lr: 2.000e-03, eta: 8:21:49, time: 0.540, data_time: 0.013, memory: 5142, loss_cls: 1.2472, loss_bbox: 1.2727, loss: 2.5199
2021-12-29 09:45:08,675 - mmdet - INFO - Epoch [2][700/1162] lr: 2.000e-03, eta: 8:21:20, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.1740, loss_bbox: 1.2461, loss: 2.4200
2021-12-29 09:45:35,666 - mmdet - INFO - Epoch [2][750/1162] lr: 2.000e-03, eta: 8:21:00, time: 0.540, data_time: 0.013, memory: 5142, loss_cls: 1.2391, loss_bbox: 1.2960, loss: 2.5351
2021-12-29 09:46:02,395 - mmdet - INFO - Epoch [2][800/1162] lr: 2.000e-03, eta: 8:20:33, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2462, loss_bbox: 1.2470, loss: 2.4933
2021-12-29 09:46:29,543 - mmdet - INFO - Epoch [2][850/1162] lr: 2.000e-03, eta: 8:20:17, time: 0.543, data_time: 0.013, memory: 5142, loss_cls: 1.2525, loss_bbox: 1.3128, loss: 2.5653
2021-12-29 09:46:56,271 - mmdet - INFO - Epoch [2][900/1162] lr: 2.000e-03, eta: 8:19:50, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2501, loss_bbox: 1.2733, loss: 2.5234
2021-12-29 09:47:22,898 - mmdet - INFO - Epoch [2][950/1162] lr: 2.000e-03, eta: 8:19:19, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.3215, loss_bbox: 1.2575, loss: 2.5790
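The sudden increase described above (between iterations 900 and 950 of epoch 1) can also be located automatically in the text log. Below is a minimal sketch in plain Python. The helper name `find_loss_jumps` and the `ratio` threshold are my own assumptions, not part of MMDetection, and the sample lines are abbreviated copies of the log lines above.

```python
import re

# Matches training lines such as
# "... Epoch [1][950/1162] lr: 2.000e-03, ..., loss: 2.3543"
# The pattern requires the literal "loss: " so that loss_cls / loss_bbox are skipped.
LINE_RE = re.compile(r"Epoch \[(\d+)\]\[(\d+)/\d+\].*\bloss: ([0-9.]+)\s*$")


def find_loss_jumps(lines, ratio=1.3):
    """Return (epoch, iteration, previous_loss, loss) for each sudden increase.

    A "jump" here is simply a total loss larger than `ratio` times the
    previous logged total loss; the threshold is an arbitrary assumption.
    """
    jumps = []
    prev = None
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip checkpoint, evaluation and "Exp name" lines
        epoch, it, loss = int(m.group(1)), int(m.group(2)), float(m.group(3))
        if prev is not None and loss > ratio * prev:
            jumps.append((epoch, it, prev, loss))
        prev = loss
    return jumps


# Abbreviated lines taken from the log above.
sample = [
    "2021-12-29 09:35:31,799 - mmdet - INFO - Epoch [1][900/1162] lr: 2.000e-03, loss_cls: 0.6210, loss_bbox: 0.9229, loss: 1.5439",
    "2021-12-29 09:35:58,144 - mmdet - INFO - Epoch [1][950/1162] lr: 2.000e-03, loss_cls: 1.1693, loss_bbox: 1.1850, loss: 2.3543",
    "2021-12-29 09:36:25,339 - mmdet - INFO - Exp name: tood_r50_fpn_on_input_1x_coco_cloth.py",
    "2021-12-29 09:36:25,340 - mmdet - INFO - Epoch [1][1000/1162] lr: 2.000e-03, loss_cls: 1.2817, loss_bbox: 1.3174, loss: 2.5991",
]
jumps = find_loss_jumps(sample)
print(jumps)
```

On these sample lines it reports the step at iteration 950 of epoch 1, where loss_cls roughly doubles. Knowing the exact iteration makes it easier to dump the batch or the head outputs around that point.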
I hit the same issue. My code already has "area1 = fp16_clamp((bboxes1[..., 2] - bboxes1[..., 0]), min=0) * fp16_clamp(( bboxes1[..., 3] - bboxes1[..., 1]), min=0) ". Since I cloned the code, I did not have to modify it, but the bug still happens, and it occurs randomly each time I train.
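Because the failure is random, it may help to catch the bad values before they reach the loss. As far as I know, the assertion `input_val >= zero && input_val <= one` in Loss.cu comes from the binary cross entropy CUDA kernel, and a NaN input fails that comparison just like an out-of-range value does. Below is a minimal sketch assuming PyTorch. `check_bce_input` is a hypothetical helper, not something from TOOD or MMDetection, and the tensors are made-up examples.

```python
import torch
import torch.nn.functional as F


def check_bce_input(pred, name="pred"):
    """Raise a readable error if `pred` is not a valid probability tensor.

    Hypothetical debugging helper: call it on the tensor that is about to be
    passed to F.binary_cross_entropy.
    """
    finite = torch.isfinite(pred)
    if not finite.all():
        n_bad = int((~finite).sum())
        raise ValueError(f"{name} contains {n_bad} NaN/Inf values")
    lo, hi = float(pred.min()), float(pred.max())
    if lo < 0.0 or hi > 1.0:
        raise ValueError(f"{name} outside [0, 1]: min={lo}, max={hi}")
    return pred


# Made-up example: one NaN score, as would happen after the head diverges.
pred = torch.tensor([0.2, 0.9, float("nan")])
target = torch.tensor([0.0, 1.0, 0.0])

message = None
try:
    loss = F.binary_cross_entropy(check_bce_input(pred, "cls_score"), target)
except ValueError as err:
    message = str(err)
    print(message)
```

This turns the device-side assert into a Python exception that names the tensor, so it is possible to tell whether the scores became NaN or merely left the [0, 1] range. The check forces a GPU sync on every call, so I would only keep it enabled while debugging.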