torch.OutOfMemoryError: CUDA out of memory.
Hello, thanks for maintaining this useful project!
I didn't experience this issue when I downloaded and used the project previously, but now that I'm trying to download and use it again, it isn't working as expected.
Here is the command I ran:
medaka_consensus -i ./cluster_001/4_reads.fastq -d ./cluster_001/7_final_consensus.fasta -o ./medaka_25559_0423 -t 12 && cp medaka_25559_0423/consensus.fasta ./cluster_001/8_medaka.fasta && rm -r medaka_25559_0423
Logging
WARNING: Failed to detect a model version, will use default: 'r1041_e82_400bps_sup_v5.0.0'
Checking program versions
This is medaka 2.0.1
Program    Version    Required   Pass
bcftools   1.13       1.11       True
bgzip      1.13+ds    1.11       True
minimap2   2.24       2.11       True
samtools   1.12       1.11       True
tabix      1.13+ds    1.11       True
[19:46:50 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmpwgn7f1xk.
[19:46:52 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmp1ah_wcqq.
Aligning basecalls to draft
Creating fai index file /home/star/bioinfo/25559/trycycler_25559/cluster_001/7_final_consensus.fasta.fai
Creating mmi index file /home/star/bioinfo/25559/trycycler_25559/cluster_001/7_final_consensus.fasta.map-ont.mmi
[M::mm_idx_gen::0.165*1.01] collected minimizers
[M::mm_idx_gen::0.208*1.41] sorted minimizers
[M::main::0.259*1.33] loaded/built the index for 1 target sequence(s)
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.271*1.31] distinct minimizers: 846770 (97.94% are singletons); average occurrences: 1.030; average spacing: 5.349; total length: 4667057
[M::main] Version: 2.24-r1122
[M::main] CMD: minimap2 -I 16G -x map-ont -d /home/star/bioinfo/25559/trycycler_25559/cluster_001/7_final_consensus.fasta.map-ont.mmi /home/star/bioinfo/25559/trycycler_25559/cluster_001/7_final_consensus.fasta
[M::main] Real time: 0.277 sec; CPU: 0.362 sec; Peak RSS: 0.045 GB
[M::main::0.072*1.02] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.087*1.01] mid_occ = 10
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.097*1.01] distinct minimizers: 846770 (97.94% are singletons); average occurrences: 1.030; average spacing: 5.349; total length: 4667057
[M::worker_pipeline::22.304*9.73] mapped 37907 sequences
[M::worker_pipeline::23.027*9.46] mapped 5082 sequences
[M::main] Version: 2.24-r1122
[M::main] CMD: minimap2 -x map-ont --secondary=no -L --MD -A 2 -B 4 -O 4,24 -E 2,1 -t 12 -a /home/star/bioinfo/25559/trycycler_25559/cluster_001/7_final_consensus.fasta.map-ont.mmi /home/star/bioinfo/25559/trycycler_25559/cluster_001/4_reads.fastq
[M::main] Real time: 23.097 sec; CPU: 217.831 sec; Peak RSS: 4.018 GB
[bam_sort_core] merging from 0 files and 12 in-memory blocks...
Running medaka consensus
[19:47:24 - Predict] Processing region(s): cluster_001_consensus:0-4667057
[19:47:24 - Predict] Using model: /home/star/medaka/lib/python3.10/site-packages/medaka/data/r1041_e82_400bps_sup_v5.0.0_model_pt.tar.gz.
[19:47:24 - Predict] Using minimum mapQ threshold of 1 for read filtering.
[19:47:24 - Predict] Found a GPU.
[19:47:24 - MdlStrTGZ] Model GRUModel(
(gru): GRU(10, 128, num_layers=2, batch_first=True, bidirectional=True)
(linear): Linear(in_features=256, out_features=5, bias=True)
)
[19:47:24 - MdlStrTGZ] loading weights from /tmp/tmpxtvfd8rl/model/weights.pt
[19:47:24 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmpxtvfd8rl.
[19:47:24 - Predict] Model device: cuda:0
[19:47:24 - Predict] Running prediction at half precision
[19:47:24 - BAMFile] Creating pool of 16 BAM file sets.
[19:47:24 - Predict] Processing 5 long region(s) with batching.
[19:47:24 - Sampler] Initializing sampler for consensus of region cluster_001_consensus:0-1000000.
[19:47:24 - Sampler] Initializing sampler for consensus of region cluster_001_consensus:999000-1999000.
[19:47:24 - PWorker] Running inference for 4.7M draft bases.
[19:47:29 - Feature] Processed cluster_001_consensus:0.0-999999.1 (median depth 120.0)
[19:47:29 - Sampler] Took 5.05s to make features.
[19:47:29 - Sampler] Initializing sampler for consensus of region cluster_001_consensus:1998000-2998000.
[19:47:29 - Feature] Processed cluster_001_consensus:999000.0-1998999.0 (median depth 119.0)
[19:47:29 - Sampler] Took 5.11s to make features.
[19:47:29 - Sampler] Initializing sampler for consensus of region cluster_001_consensus:2997000-3997000.
Traceback (most recent call last):
File "/home/star/medaka/bin/medaka", line 8, in <module>
sys.exit(main())
File "/home/star/medaka/lib/python3.10/site-packages/medaka/medaka.py", line 836, in main
args.func(args)
File "/home/star/medaka/lib/python3.10/site-packages/medaka/prediction.py", line 171, in predict
remainder_regions_depth = run_prediction(
File "/home/star/medaka/lib/python3.10/site-packages/medaka/prediction.py", line 46, in run_prediction
class_probs = model.predict_on_batch(x_data)
File "/home/star/medaka/lib/python3.10/site-packages/medaka/models.py", line 301, in predict_on_batch
x = self.forward(x).detach().cpu()
File "/home/star/medaka/lib/python3.10/site-packages/medaka/models.py", line 337, in forward
x = self.gru(x)[0]
File "/home/star/medaka/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/star/medaka/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/star/medaka/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 1393, in forward
result = _VF.gru(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 6.27 GiB. GPU 0 has a total capacity of 3.94 GiB of which 3.09 GiB is free. Including non-PyTorch memory, this process has 606.00 MiB memory in use. Of the allocated memory 528.32 MiB is allocated by PyTorch, and 23.68 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Failed to run medaka consensus.
This is the error that comes up.
Environment (if you do not have a GPU, write No GPU):
Installation method: Python virtual environment (pip)
OS: Ubuntu 22.04.5
medaka version: 2.0.1
GPU model
*-display
     description: VGA compatible controller
     product: ASPEED Graphics Family [1A03:2000]
     vendor: ASPEED Technology, Inc. [1A03]
     physical id: 0
     bus info: pci@0000:08:00.0
     logical name: /dev/fb1
     version: 30
     width: 32 bits
     clock: 33MHz
     capabilities: vga_controller cap_list fb
     configuration: depth=32 driver=ast latency=0 resolution=1024,768
     resources: irq:16 memory:c4000000-c5ffffff memory:c6000000-c601ffff ioport:4000(size=128)
*-display
     description: VGA compatible controller
     product: GP107 [GeForce GTX 1050 Ti] [10DE:1C82]
     vendor: NVIDIA Corporation [10DE]
     physical id: 0
     bus info: pci@0000:81:00.0
     logical name: /dev/fb0
     version: a1
     width: 64 bits
     clock: 33MHz
     capabilities: vga_controller bus_master cap_list rom fb
     configuration: depth=32 driver=nvidia latency=0 resolution=800,600
     resources: irq:81 memory:fa000000-faffffff memory:e0000000-efffffff memory:f0000000-f1ffffff ioport:f000(size=128) memory:c0000-dffff
nvidia-smi
NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2
cuDNN version
However, I installed the CPU-only build with:
pip install medaka-cpu --extra-index-url https://download.pytorch.org/whl/cpu
Let me know if you need any more information. I'd be happy to help troubleshoot further. Thank you in advance for your time and support!
You can force medaka to run on the CPU only by setting CUDA_VISIBLE_DEVICES to an empty value:
CUDA_VISIBLE_DEVICES="" medaka_consensus ...
If you want to run on the GPU, you can reduce the memory footprint by reducing the batch size (medaka_consensus -b option) from the default value of 100 to a lower value.
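As a rough way to pick a starting value for -b, the figures in the error message above can be scaled down. This is only a back-of-the-envelope sketch: it assumes GPU memory use grows roughly linearly with batch size, which is not confirmed for medaka, and both the helper name estimate_batch_size and the 0.8 safety factor are hypothetical.

```python
import math


def estimate_batch_size(current_batch, requested_gib, free_gib, safety=0.8):
    """Scale a batch size down so the failed allocation would fit in free memory.

    Assumption (not confirmed for medaka): the allocation size is roughly
    proportional to the batch size. `safety` leaves headroom for other buffers.
    """
    per_item_gib = requested_gib / current_batch
    return max(1, math.floor(free_gib * safety / per_item_gib))


# Numbers from the traceback: default batch size 100,
# "Tried to allocate 6.27 GiB", "3.09 GiB is free".
print(estimate_batch_size(100, 6.27, 3.09))  # 39
```

Treat the result as an upper bound to try first and lower it further if the error persists; in this thread, -b 10 was the value reported to work.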
A secondary issue is that the CPU-only installation should not be fetching a CUDA-capable build of PyTorch. Was the installation done in a clean venv without a pre-existing install of PyTorch? Please post the results of:
pip show medaka-cpu
pip show torch
Thank you for your kindness. I tried
CUDA_VISIBLE_DEVICES="" medaka_consensus ...
and then ran pip show for medaka-cpu and torch:
(medaka) star@star-Z10PE-D16-WS:~/bioinfo/25559/trycycler_25559$ pip show medaka-cpu
Name: medaka-cpu
Version: 2.0.1
Summary: Neural network sequence error correction.
Home-page: https://github.com/nanoporetech/medaka
Author: ont-research
Author-email:
License:
Location: /home/star/medaka/lib/python3.10/site-packages
Requires: cffi, edlib, h5py, intervaltree, numpy, ont-fast5-api, ont-mappy, ont-parasail, pysam, pyspoa, requests, torch, tqdm, wurlitzer
Required-by:
(medaka) star@star-Z10PE-D16-WS:~/bioinfo/25559/trycycler_25559$ pip show torch
Name: torch
Version: 2.6.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /home/star/medaka/lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-cusparselt-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: medaka, medaka-cpu
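The Requires field of the torch entry is the quickest clue here: it lists nvidia-*-cu12 packages, which suggests a CUDA-enabled wheel was installed rather than a CPU-only one. A small sketch of that check follows. The helper cuda_requirements is hypothetical, the "nvidia-" prefix test is a heuristic rather than an official marker, and sample is a shortened copy of the output above.

```python
def cuda_requirements(pip_show_output):
    """Return the NVIDIA runtime packages named in a `pip show` Requires line.

    Heuristic: an empty result suggests a CPU-only torch wheel, a non-empty
    one suggests a CUDA-enabled wheel.
    """
    prefix = "Requires:"
    for line in pip_show_output.splitlines():
        if line.startswith(prefix):
            names = [name.strip() for name in line[len(prefix):].split(",")]
            return [name for name in names if name.startswith("nvidia-")]
    return []


# Shortened copy of the `pip show torch` output from this thread.
sample = """Name: torch
Version: 2.6.0
Requires: filelock, fsspec, nvidia-cublas-cu12, nvidia-cudnn-cu12, sympy, triton
"""

print(cuda_requirements(sample))  # ['nvidia-cublas-cu12', 'nvidia-cudnn-cu12']
```

To run it against a live environment, the text could come from subprocess.run([sys.executable, "-m", "pip", "show", "torch"], capture_output=True, text=True).stdout.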
But the same error still occurs.
Does this setting (CUDA_VISIBLE_DEVICES="") need to go in a file, or somewhere else?
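On the question of where the setting goes: CUDA_VISIBLE_DEVICES is an ordinary environment variable, so nothing has to be written to a file. When it is placed in front of a command on the same line, it applies to that command and the processes it starts. The sketch below shows the same idea from Python, using a child process that simply prints what it sees; env_without_gpus and child are hypothetical names used only for illustration.

```python
import os
import subprocess
import sys


def env_without_gpus():
    """Copy the current environment with CUDA_VISIBLE_DEVICES set to empty."""
    return dict(os.environ, CUDA_VISIBLE_DEVICES="")


# Stand-in child process: it only reports the value it inherited.
child = [
    sys.executable,
    "-c",
    "import os; print(repr(os.environ.get('CUDA_VISIBLE_DEVICES')))",
]

result = subprocess.run(
    child, env=env_without_gpus(), capture_output=True, text=True, check=True
)
print(result.stdout.strip())  # ''
```

The child prints an empty string rather than None, which shows the variable was set (to nothing) for that process only; the parent environment is left unchanged.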
Following your other recommendation, I used -b 10 instead of the default 100, and it works! Thank you for helping me!
My data was basecalled with Dorado 0.9.1 using the super-accurate (sup) model, on a FLO-MIN114 (R10.4.1) flow cell. What command should I use to run medaka, or can I no longer use it in 2025? My genome is a 40 Mb fungal genome.