Dorado model downloads incomplete and 'std::runtime_error'
Issue Report
Please describe the issue:
- I see that v5.0.0 is now listed among the available DNA models, but when I download all models, the v5.0.0 models are not in the list of downloads. Do I need to update the dorado software itself to get them? I need to basecall 5 kHz data, but my current models are for 4 kHz data. Can the v5.0.0 models do this?
- When attempting to basecall using [email protected]_5mCG_5hmCG@v2, I get this error:
terminate called after throwing an instance of 'std::runtime_error'
what(): could not find matching modification model for [email protected]_5mCG_5hmCG@v2
Aborted
Steps to reproduce the issue:
qbiol@TA-PC-JARVIS:/mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin$ ./dorado basecaller /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/[email protected]_5mCG_5hmCG@v2 /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/pod5 --modified-bases 5mCG > BV2_basecalled.bam
terminate called after throwing an instance of 'std::runtime_error'
what(): could not find matching modification model for [email protected]_5mCG_5hmCG@v2
Aborted
Run environment:
- Dorado version: 0.3.4+5f5cd02
- Dorado command:
qbiol@TA-PC-JARVIS:/mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin$ ./dorado basecaller /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/[email protected]_5mCG_5hmCG@v2 /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/pod5 --modified-bases 5mCG > BV2_basecalled.bam
- Operating system: Windows 10 64-bit operating system, x64-based processor
- Hardware (CPUs, Memory, GPUs): Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz 3.60 GHz
- Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5
- Source data location (on device or networked drive - NFS, etc.): on device
- Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): BV2 cells DNA, SQK-LSK114, 33GB
- Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue):
Logs
- Please provide output trace of dorado (run dorado with -v, or -vv on a small subset)
(base) qbiol@TA-PC-JARVIS:/mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin$ ./dorado basecaller /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/[email protected]_5mCG_5hmCG@v2 /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/pod5 --modified-bases 5mCG > BV2_basecalled_5mC.bam
terminate called after throwing an instance of 'std::runtime_error'
what(): could not find matching modification model for [email protected]_5mCG_5hmCG@v2
Hi @achang44,
Your version of dorado (0.3.4) is too old: the v5 models are only available in dorado 0.7.0 and later. Please update and try again.
Best regards, Rich
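As a rough illustration of the version requirement above, the sketch below compares a version string against 0.7.0. The helper name `version_ge` is hypothetical, and the comparison assumes GNU `sort -V` is available. The installed version is hard-coded from the report here; `dorado --version` reports the real one.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on GNU sort's version ordering (-V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

installed="0.3.4"   # taken from the report; check yours with: dorado --version
required="0.7.0"    # first release with v5 models, per the reply above

if version_ge "$installed" "$required"; then
  echo "dorado $installed is new enough for v5 models"
else
  echo "dorado $installed predates v5 model support; update first"
fi

# After updating, the new build can list and fetch the models it knows about:
#   dorado download --list
#   dorado download --model all
```

Once the updated binary is on the path, `dorado download --list` should show the v5.0.0 entries that the 0.3.4 build does not know about.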
Also, do not pass the modification model as the model argument. That argument should be the simplex model. Then request the modifications you need either with --modified-bases 5mCG_5hmCG or by giving the full path to the modification model with --modified-bases-models /mnt/c/Users/qbiol/Desktop/Nanopore/Software/dorado/dorado-0.3.4-linux-x64/bin/[email protected]_5mCG_5hmCG@v2.
See https://github.com/nanoporetech/dorado?tab=readme-ov-file#modified-basecalling
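The argument order described above might look like the following sketch. All paths are placeholders, not values from the report: `SIMPLEX_MODEL` stands for whichever simplex model directory matches your data, and `build_cmd` and `DRY_RUN` are hypothetical names used only so the command can be printed and inspected before it is run.

```shell
# Placeholders (assumptions): substitute your own locations.
DORADO="./dorado"
SIMPLEX_MODEL="/path/to/simplex_model"   # the simplex model, NOT the modification model
POD5_DIR="/path/to/pod5"
OUT_BAM="BV2_basecalled.bam"
DRY_RUN=1                                # set to 0 to actually run dorado

# Print the basecaller command: simplex model first, then the data,
# then the modifications requested by name.
build_cmd() {
  printf '%s basecaller %s %s --modified-bases 5mCG_5hmCG' \
    "$DORADO" "$SIMPLEX_MODEL" "$POD5_DIR"
}

if [ "$DRY_RUN" = "1" ]; then
  echo "$(build_cmd) > $OUT_BAM"
else
  "$DORADO" basecaller "$SIMPLEX_MODEL" "$POD5_DIR" \
    --modified-bases 5mCG_5hmCG > "$OUT_BAM"
fi

# Alternative: point at a downloaded modification model directly by
# replacing --modified-bases 5mCG_5hmCG with
#   --modified-bases-models /path/to/modification_model
```

With `DRY_RUN=1` the script only prints the command, which makes it easy to confirm that the first positional argument is the simplex model before starting a long basecalling run.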