ccsmeth
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
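The NumPy problem seems to be resolved; this time the run got to 21%, haha. But now a cuDNN error shows up. It looks like an error coming from PyTorch, and I really don't know how to deal with it. ↓ My code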
#!/bin/bash
#SBATCH --mail-type=END,FAIL
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
#SBATCH --time=02:00:00
#SBATCH --mem=48G
#SBATCH --gres=gpu:a100:1
#SBATCH -o %A_%a_output.txt
#SBATCH -e %A_%a_error.txt
CUDA_VISIBLE_DEVICES=0 ccsmeth call_mods \
--input 121A/mapped.bam \
--ref 121A/assembly.rotated.polished.renamed.fsa \
--model_file /ccsmeth/models/model_ccsmeth_5mCpG_call_mods_attbigru2s_b21.v2.ckpt \
--output output.hifi.pbmm2.call_mods \
--threads 10 --threads_call 2 --model_type attbigru2s \
--rm_per_readsite --mode align
↓ error.txt
batch_reader: 21%|██ | 1941/9340 [03:23<24:18, 5.07it/s]
batch_reader: 21%|██ | 1944/9340 [03:24<28:32, 4.32it/s]
batch_reader: 21%|██ | 1949/9340 [03:25<27:37, 4.46it/s]
batch_reader: 21%|██ | 1953/9340 [03:26<28:56, 4.25it/s]
batch_reader: 21%|██ | 1957/9340 [03:27<29:52, 4.12it/s]
batch_reader: 21%|██ | 1962/9340 [03:28<28:35, 4.30it/s]
batch_reader: 21%|██ | 1968/9340 [03:29<26:08, 4.70it/s]
batch_reader: 21%|██ | 1973/9340 [03:30<26:06, 4.70it/s]
batch_reader: 21%|██ | 1979/9340 [03:31<24:40, 4.97it/s]
batch_reader: 21%|██▏ | 1985/9340 [03:33<23:47, 5.15it/s]
batch_reader: 21%|██▏ | 1990/9340 [03:34<24:27, 5.01it/s]
batch_reader: 21%|██▏ | 1996/9340 [03:35<23:36, 5.18it/s]
batch_reader: 21%|██▏ | 2001/9340 [03:36<24:16, 5.04it/s]Process Process-6:
Process Process-4:
Traceback (most recent call last):
Traceback (most recent call last):
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 340, in _call_mods_q
pred_str, accuracy, batch_num = _call_mods2s(features_batch, model, args.batch_size, device)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 246, in _call_mods2s
voutputs, vlogits = model(FloatTensor(b_fkmers, device), FloatTensor(b_fpasss, device),
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 340, in _call_mods_q
pred_str, accuracy, batch_num = _call_mods2s(features_batch, model, args.batch_size, device)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/models.py", line 118, in forward
out1, n_states1 = self.rnn(out1, self.init_hidden(out1.size(0),
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 246, in _call_mods2s
voutputs, vlogits = model(FloatTensor(b_fkmers, device), FloatTensor(b_fpasss, device),
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 942, in forward
result = _VF.gru(input, hx, self._flat_weights, self.bias, self.num_layers,
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/models.py", line 118, in forward
out1, n_states1 = self.rnn(out1, self.init_hidden(out1.size(0),
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 942, in forward
result = _VF.gru(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
o(╥﹏╥)o
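One way to narrow down the cuDNN failure, offered as a sketch rather than a confirmed fix: make CUDA kernel launches synchronous and record which PyTorch, CUDA, and cuDNN versions the conda environment actually loads. `CUDA_LAUNCH_BLOCKING` is a standard CUDA environment variable and the `torch` attributes below are standard; the assumption is that these lines run inside the same job and environment as the failing command.

```shell
#!/bin/bash
# Make CUDA kernel launches synchronous so the Python traceback points at
# the call that actually failed rather than at a later one.
export CUDA_LAUNCH_BLOCKING=1

# Print the PyTorch, CUDA, and cuDNN versions seen inside the active env.
if command -v python >/dev/null 2>&1; then
    python -c 'import torch; print(torch.__version__, torch.version.cuda, torch.backends.cudnn.version())' \
        || echo "PyTorch could not be imported in this environment"
fi

# Show GPU memory use, if the NVIDIA tools are on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv || true
fi

# Then rerun the original command unchanged, for example:
# CUDA_VISIBLE_DEVICES=0 ccsmeth call_mods ...   (same arguments as in the job script)
```

If the reported CUDA version is older than 11, that alone could explain failures on an A100, since that GPU needs a CUDA 11 or newer build. This is something to check, not a diagnosis.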
How long should this CUDA_VISIBLE_DEVICES=0 ccsmeth call_mods run take? I set 10 threads, so why is my CPU completely maxed out?
I think there is an optimization problem; the machine-learning side probably wasn't optimized well.
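On the thread-count question: the job script above requests `--cpus-per-task=2` but passes `--threads 10`, so ten workers would be competing for far fewer cores, which is one plausible reason the allocated CPUs look saturated. A minimal sketch of deriving the thread counts from the Slurm allocation follows. `pick_threads` is a hypothetical helper, and the split between `--threads` and `--threads_call` is an assumed heuristic, not a ccsmeth recommendation.

```shell
#!/bin/bash
# Hypothetical helper: choose values for ccsmeth's --threads and
# --threads_call from the CPUs Slurm granted to this task.
pick_threads() {
    # SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
    # fall back to 2 when running outside a Slurm job.
    local cpus="${SLURM_CPUS_PER_TASK:-2}"
    # Assumed heuristic: use 2 GPU-calling workers only when there are
    # at least 4 cores, otherwise 1.
    local call=2
    if [ "$cpus" -lt 4 ]; then
        call=1
    fi
    echo "$cpus $call"
}

read -r THREADS THREADS_CALL <<< "$(pick_threads)"
echo "would run: ccsmeth call_mods ... --threads $THREADS --threads_call $THREADS_CALL"
```

With this approach the `#SBATCH --cpus-per-task` line becomes the single place where the core count is set, and the ccsmeth options follow it.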