
[CI Problem] TVMError: Check failed: ret == 0 (-1 vs. 0)

Open chlorane opened this issue 2 years ago • 1 comments

I'm running a script from ESPnet (https://github.com/espnet/espnet) that depends on TVM, in a conda virtual environment with Python 3.7 and CUDA 10.0. While running the script, I received the following error:

[compute04] 2023-02-10 13:31:53,896 (abs_task:1527) INFO: [train] Batch sampler: LengthBatchSampler(N-batch=1, batch_bins=60000000, sort_in_batch=descending, sort_batch=descending)
[compute04] 2023-02-10 13:31:53,897 (abs_task:1529) INFO: [train] mini-batch sizes summary: N-batch=1, mean=14.0, min=14, max=14
Traceback (most recent call last):
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/nas02/homes/espnet/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/nas02/homes/espnet/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/nas02/homes/espnet/espnet2/tasks/abs_task.py", line 1019, in main
    cls.main_worker(args)
  File "/nas02/homes/espnet/espnet2/tasks/abs_task.py", line 1323, in main_worker
    distributed_option=distributed_option,
  File "/nas02/homes/espnet/espnet2/train/trainer.py", line 291, in run
    distributed_option=distributed_option,
  File "/nas02/homes/espnet/espnet2/train/trainer.py", line 556, in train_one_epoch
    retval = model(**batch)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/nas02/homes/espnet/espnet2/asr/espnet_model.py", line 202, in forward
    encoder_out, encoder_out_lens = self.encode(speech, speech_lengths)
  File "/nas02/homes/espnet/espnet2/asr/espnet_model.py", line 352, in encode
    encoder_out, encoder_out_lens, _ = self.encoder(feats, feats_lengths)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/nas02/homes/espnet/espnet2/asr/encoder/longformer_encoder.py", line 336, in forward
    xs_pad, masks = self.encoders(xs_pad, masks)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/nas02/homes/espnet/espnet/nets/pytorch_backend/transformer/repeat.py", line 18, in forward
    args = m(*args)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/nas02/homes/espnet/espnet/nets/pytorch_backend/conformer/encoder_layer.py", line 141, in forward
    x_att = self.self_attn(x_q, x, x, mask)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/nas02/homes/espnet/espnet/nets/pytorch_backend/transformer/longformer_attention.py", line 53, in forward
    output_attentions=True,
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/longformer/longformer.py", line 168, in forward
    d_mask = diagonaled_mm_tvm(ones, float_mask, self.attention_window, self.attention_dilation, False, 0, False)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/longformer/diagonaled_mm_tvm.py", line 261, in forward
    output = DiagonaledMM._diagonaled_mm(t1, t2, w, d, is_t1_diagonaled=is_t1_diagonaled, padding=padding, autoregressive=autoregressive)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/longformer/diagonaled_mm_tvm.py", line 202, in _diagonaled_mm
    _diagonaled_mm_function(t1, t2, r, d, w, w_upper, padding, transpose_t1, m if is_t1_diagonaled else c)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/tvm/contrib/dlpack.py", line 40, in _wrapper
    return tvm_func(args)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/tvm/_ffi/function.py", line 143, in __call__
    return self._entry(args)
  File "/home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/tvm/_ffi/_ctypes/function.py", line 210, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (1) /nas02/homes/conda/envs/espnet/lib/python3.7/site-packages/tvm/libtvm_runtime.so(TVMFuncCall+0x61) [0x7f057b610681]
  [bt] (0) /nas02/homes/conda/envs/espnet/lib/python3.7/site-packages/tvm/libtvm_runtime.so(+0x5cfdc) [0x7f057b632fdc]
  File "/code/tvm/src/runtime/module_util.cc", line 73
TVMError: Check failed: ret == 0 (-1 vs. 0) : Assert fail: ((((1 == int32(arg2.strides[3])) && (t3d3 == int32(arg2.strides[2]))) && ((t3d3*h) == int32(arg2.strides[1]))) && (((t3d3*h)*n) == int32(arg2.strides[0]))), arg2.strides: expected to be compact array
Loading tvm binary from: /home/nas02home/conda/envs/espnet/lib/python3.7/site-packages/longformer/../longformer/lib/lib_diagonaled_mm_float32_cuda.so

chlorane, Feb 10 '23 04:02

Hi, any solutions? I have the same problem.

wangweiwei1188, Jan 24 '24 09:01
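
For readers landing here with the same traceback: the assertion at the end says `arg2.strides: expected to be compact array`, meaning the compiled TVM kernel requires that argument to be laid out contiguously in row-major order, where each stride equals the product of the sizes of all later dimensions. The thread does not confirm a cause or a fix, so the following is only a sketch of what that check means. The dimension sizes and the "transposed view" scenario are hypothetical values chosen for illustration, not taken from the report.

```python
from typing import Sequence, Tuple


def compact_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Return row-major (C-contiguous) strides, in elements, for `shape`."""
    strides = []
    step = 1
    # Walk from the last dimension to the first; the last stride is 1 and
    # each earlier stride is the product of all later dimension sizes.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_compact(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """True when `strides` matches the compact layout for `shape`."""
    return tuple(strides) == compact_strides(shape)


# Hypothetical sizes, named after the symbols in the assertion message.
b, n, h, t3d3 = 2, 14, 4, 64
shape = (b, n, h, t3d3)

print(compact_strides(shape))  # (3584, 256, 64, 1)

# Hypothetical non-compact case: a (b, h, n, t3d3) buffer viewed with its
# middle two dimensions swapped keeps the old memory layout.
transposed_strides = (h * n * t3d3, t3d3, n * t3d3, 1)
print(is_compact(shape, transposed_strides))  # False
```

If a non-contiguous tensor is indeed what reaches the kernel, PyTorch's `Tensor.is_contiguous()` can confirm it and `Tensor.contiguous()` returns a compact copy. Whether applying that to the inputs of `diagonaled_mm_tvm` resolves this particular failure is an assumption that would need testing in the affected environment.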