Ampere GPU CUDA RuntimeError and slow basecalling speed
Hi,
I'm trying to run a custom modified-base model with bonito using the following command:
bonito basecaller --modified-base-model train_results/model_best.pt [email protected] BC3/ --device cuda:0 --reference /home/v313/ref-seqs/lambdagenomeref.fasta > ahyC_bonito_basecalls.bam
I get the following error, after which bonito falls back to basecalling on the CPU:
> reading pod5
> outputting aligned bam
> loading model [email protected]
> loading modified base model
> loaded modified base model to call (alt to C): a=5ahyC
> loading reference
> calling: 0%| | 0/40000 [00:00<?, ? reads/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/multiprocessing.py", line 110, in run
    for item in self.iterator:
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 71, in <genexpr>
    (read, compute_scores(model, batch, reverse=reverse)) for read, batch in batches
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 34, in compute_scores
    scores = model(batch.to(dtype).to(device))
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/model.py", line 178, in forward
    return self.encoder(x)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 117, in forward
    layer(buff1, buff2, self.chunks)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 85, in forward
    void_ptr(input_buffer.data @ self.w_ih),
RuntimeError: CUDA out of memory. Tried to allocate 2.20 GiB (GPU 0; 7.79 GiB total capacity; 1.68 GiB already allocated; 1.81 GiB free; 4.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
> calling: 0%| | 1/40000 [00:04<53:1
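The error message itself points at max_split_size_mb. For reference, this is how I understand that hint would be applied, together with a smaller batch; the 128 MiB value is an arbitrary starting point, and I'm assuming --batchsize is the right bonito flag to shrink the per-batch allocation:

```bash
# Untested sketch: apply the allocator hint from the error message and
# lower the batch size so each forward pass needs less GPU memory.
# 128 MiB and --batchsize 16 are arbitrary example values.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
bonito basecaller --modified-base-model train_results/model_best.pt \
    --batchsize 16 --device cuda:0 \
    --reference /home/v313/ref-seqs/lambdagenomeref.fasta \
    [email protected] BC3/ > ahyC_bonito_basecalls.bam
```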
I'm running this with torch 1.12.1+cu113, the build that supports Ampere GPUs, on a mobile RTX 3080:
Name: torch
Version: 1.12.1+cu113
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /home/v313/.local/lib/python3.8/site-packages
Requires: typing-extensions
Required-by: ont-bonito, ont-remora, thop, torchaudio, torchvision
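In case it helps rule out a driver or toolkit mismatch, a quick check like the one below (plain torch and nvidia-smi, nothing bonito-specific) shows what this install actually sees:

```bash
# Confirm the driver and this torch build both see the mobile 3080.
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```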
Switching from the @v4.0.0 model to @v3.5.2 makes the error go away. My question is: is the error related to the slow basecalling speed, and should I expect better speed on an Ampere GPU? With the 3.5.2 model I'm getting about 3 reads/s, which seems extremely slow (and that's with the fast model as well).
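To check whether the slow run is actually touching the GPU at all, utilisation can be polled while bonito is basecalling; sustained ~0% would confirm the fallback to CPU:

```bash
# Poll GPU utilisation and memory once per second during basecalling;
# a GPU stuck near 0% means the work is happening on the CPU instead.
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```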
Thanks in advance!