Ampere GPU CUDA RuntimeError and slow basecalling speed
Hi,
I'm trying to run a custom modified-base model with bonito using the following command:
bonito basecaller --modified-base-model train_results/model_best.pt [email protected] BC3/ --device cuda:0 --reference /home/v313/ref-seqs/lambdagenomeref.fasta > ahyC_bonito_basecalls.bam
I get the following error, after which bonito falls back to basecalling on the CPU:
> reading pod5
> outputting aligned bam
> loading model [email protected]
> loading modified base model
> loaded modified base model to call (alt to C): a=5ahyC
> loading reference
> calling: 0%| | 0/40000 [00:00<?, ? reads/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/multiprocessing.py", line 110, in run
    for item in self.iterator:
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 71, in <genexpr>
    (read, compute_scores(model, batch, reverse=reverse)) for read, batch in batches
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/basecall.py", line 34, in compute_scores
    scores = model(batch.to(dtype).to(device))
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/bonito/crf/model.py", line 178, in forward
    return self.encoder(x)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 117, in forward
    layer(buff1, buff2, self.chunks)
  File "/home/v313/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/v313/.local/lib/python3.8/site-packages/koi/lstm.py", line 85, in forward
    void_ptr(input_buffer.data @ self.w_ih),
RuntimeError: CUDA out of memory. Tried to allocate 2.20 GiB (GPU 0; 7.79 GiB total capacity; 1.68 GiB already allocated; 1.81 GiB free; 4.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
> calling: 0%| | 1/40000 [00:04<53:1
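The error message itself points at max_split_size_mb. For reference, this is how I understand that hint would be applied, together with a smaller batch; the 128 MiB value is an arbitrary starting point, and I'm assuming --batchsize is the right bonito flag to shrink the per-batch allocation:

```bash
# Untested sketch: apply the allocator hint from the error message and
# lower the batch size so each forward pass needs less GPU memory.
# 128 MiB and --batchsize 16 are arbitrary example values.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
bonito basecaller --modified-base-model train_results/model_best.pt \
    --batchsize 16 --device cuda:0 \
    --reference /home/v313/ref-seqs/lambdagenomeref.fasta \
    [email protected] BC3/ > ahyC_bonito_basecalls.bam
```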
I'm running this with torch 1.12.1+cu113, the build that supports Ampere GPUs, on a mobile RTX 3080:
Name: torch
Version: 1.12.1+cu113
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /home/v313/.local/lib/python3.8/site-packages
Requires: typing-extensions
Required-by: ont-bonito, ont-remora, thop, torchaudio, torchvision
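In case it helps rule out a driver or toolkit mismatch, a quick check like the one below (plain torch and nvidia-smi, nothing bonito-specific) shows what this install actually sees:

```bash
# Confirm the driver and this torch build both see the mobile 3080.
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```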
Switching from the @v4.0.0 model to @v3.5.2 makes the error go away. My question is: is the error related to the slow basecalling speed, and should I expect better speed on an Ampere GPU? With the 3.5.2 model I'm getting about 3 reads/s, which seems extremely slow (and that's with the fast model as well).
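To check whether the slow run is actually touching the GPU at all, utilisation can be polled while bonito is basecalling; sustained ~0% would confirm the fallback to CPU:

```bash
# Poll GPU utilisation and memory once per second during basecalling;
# a GPU stuck near 0% means the work is happening on the CPU instead.
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```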
Thanks in advance!