dorado icon indicating copy to clipboard operation
dorado copied to clipboard

Progress scrolls instead of updates in place, runs slowly

Open RByers opened this issue 1 year ago • 4 comments

Issue Report

Please describe the issue:

About half the time I use Dorado, the progress updates scroll madly down the terminal with lots of blank lines interspersed. The other half the progress bar updates in place as expected. Also I think it's unusually slow when in this mode, and may also be correlated to crashes I'm seeing (exit with ERRORLEVEL 0xC0000409).

Here's a screenshot example when I freeze the scrolling (by selecting some text). This example happens to be a "resuming from file" status, but it is the same with regular basecalling status. image

Steps to reproduce the issue:

Unknown. Unclear why it sometimes works and sometimes doesn't.

Run environment:

  • Dorado version: 0.7.3+6e6c45cd+cu11080
  • Dorado command: ..\dorado\bin\dorado.exe basecaller hac@latest c:\data\Q8\SQ8\20240830_1109_MN41817_FAZ83080_1e1c31e4\pod5\ --kit-name SQK-NBD111-24 --trim adapters --resume-from Q8a-hac-aborted.bam > Q8a-hac.bam
  • Operating system: Windows 11 Home
  • Hardware (CPUs, Memory, GPUs): Intel Core i7, 16 GB RAM, NVidia GEForce RTX 3060
  • Source data type: pod5
  • Source data location: local disk
  • Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): MinIon nearly full run, ~100 GB
  • Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue): Unclear what reproduces it - doesn't seem specific to the data (though maybe happens less frequently when I pick a single file?). Cancelling and immediately retrying the exact same command sometimes works.

Logs

"c:\data\Q8\SQ8\20240830_1109_MN41817_FAZ83080_1e1c31e4\pod5\" "--kit-name" "SQK-NBD111-24" "--trim" "adapters" "--resume-from" "Q8a-hac-aborted.bam"
[2024-09-04 22:29:39.411] [info]  - downloading [email protected] with httplib
[2024-09-04 22:29:40.224] [info] Normalised: chunksize 10000 -> 9996
[2024-09-04 22:29:40.225] [info] Normalised: overlap 500 -> 498
[2024-09-04 22:29:40.226] [info] > Creating basecall pipeline
[2024-09-04 22:29:40.226] [debug] CRFModelConfig { qscale:1.050000 qbias:-0.600000 stride:6 bias:0 clamp:1 out_features:-1 state_len:4 outsize:1024 blank_score:2.000000 scale:1.000000 num_features:1 sample_rate:5000 mean_qscore_start_pos:60 SignalNormalisationParams { strategy:pa StandardisationScalingParams { standardise:1 mean:93.692398 stdev:23.506744}} BasecallerParams { chunk_size:9996 overlap:498 batch_size:0} convs: { 0: ConvParams { insize:1 size:16 winlen:5 stride:1 activation:swish} 1: ConvParams { insize:16 size:16 winlen:5 stride:1 activation:swish} 2: ConvParams { insize:16 size:384 winlen:19 stride:6 activation:tanh}} model_type: lstm { bias:0 outsize:1024 blank_score:2.000000 scale:1.000000}}
[2024-09-04 22:29:40.637] [debug] cuda:0 memory available: 5.33GB
[2024-09-04 22:29:40.637] [debug] cuda:0 memory limit 4.33GB
[2024-09-04 22:29:40.638] [debug] cuda:0 maximum safe estimated batch size at chunk size 9996 is 640
[2024-09-04 22:29:40.638] [debug] cuda:0 maximum safe estimated batch size at chunk size 4998 is 1280
[2024-09-04 22:29:40.638] [debug] Auto batchsize cuda:0: testing up to 1280 in steps of 64
[2024-09-04 22:29:40.722] [debug] Auto batchsize cuda:0: 64, time per chunk 0.484224 ms
[2024-09-04 22:29:40.791] [debug] Auto batchsize cuda:0: 128, time per chunk 0.232319 ms
[2024-09-04 22:29:40.891] [debug] Auto batchsize cuda:0: 192, time per chunk 0.250123 ms
[2024-09-04 22:29:41.011] [debug] Auto batchsize cuda:0: 256, time per chunk 0.221412 ms
[2024-09-04 22:29:41.171] [debug] Auto batchsize cuda:0: 320, time per chunk 0.235394 ms
[2024-09-04 22:29:41.339] [debug] Auto batchsize cuda:0: 384, time per chunk 0.204975 ms
[2024-09-04 22:29:41.541] [debug] Auto batchsize cuda:0: 448, time per chunk 0.214215 ms
[2024-09-04 22:29:41.761] [debug] Auto batchsize cuda:0: 512, time per chunk 0.203446 ms
[2024-09-04 22:29:42.024] [debug] Auto batchsize cuda:0: 576, time per chunk 0.217572 ms
[2024-09-04 22:29:42.300] [debug] Auto batchsize cuda:0: 640, time per chunk 0.205259 ms
[2024-09-04 22:29:42.599] [debug] Auto batchsize cuda:0: 704, time per chunk 0.196926 ms
[2024-09-04 22:29:42.910] [debug] Auto batchsize cuda:0: 768, time per chunk 0.188587 ms
[2024-09-04 22:29:43.260] [debug] Auto batchsize cuda:0: 832, time per chunk 0.198240 ms
[2024-09-04 22:29:43.626] [debug] Auto batchsize cuda:0: 896, time per chunk 0.190146 ms
[2024-09-04 22:29:44.025] [debug] Auto batchsize cuda:0: 960, time per chunk 0.193588 ms
[2024-09-04 22:29:44.440] [debug] Auto batchsize cuda:0: 1024, time per chunk 0.187385 ms
[2024-09-04 22:29:44.896] [debug] Auto batchsize cuda:0: 1088, time per chunk 0.193474 ms
[2024-09-04 22:29:45.364] [debug] Auto batchsize cuda:0: 1152, time per chunk 0.186017 ms
[2024-09-04 22:29:45.791] [debug] Auto batchsize cuda:0: 1216, time per chunk 0.161354 ms
[2024-09-04 22:29:46.223] [debug] Auto batchsize cuda:0: 1280, time per chunk 0.155547 ms
[2024-09-04 22:29:46.232] [debug] Largest batch size for cuda:0: 1280, time per chunk 0.155547 ms
[2024-09-04 22:29:46.233] [debug] Final batch size for cuda:0[0]: 512
[2024-09-04 22:29:46.233] [debug] Final batch size for cuda:0[1]: 1280
[2024-09-04 22:29:46.233] [info] cuda:0 using chunk size 9996, batch size 512
[2024-09-04 22:29:46.233] [debug] cuda:0 Model memory 2.40GB
[2024-09-04 22:29:46.233] [debug] cuda:0 Decode memory 0.99GB
[2024-09-04 22:29:46.856] [info] cuda:0 using chunk size 4998, batch size 1280
[2024-09-04 22:29:46.856] [debug] cuda:0 Model memory 3.00GB
[2024-09-04 22:29:46.856] [debug] cuda:0 Decode memory 1.24GB
[2024-09-04 22:29:48.802] [debug] BasecallerNode chunk size 9996
[2024-09-04 22:29:48.802] [debug] BasecallerNode chunk size 4998
[2024-09-04 22:29:48.813] [info] > Inspecting resume file...
[2024-09-04 22:29:48.814] [info] Resuming from file Q8a-hac-aborted.bam...
[2024-09-04 22:29:55.139] [info] > 422094 original read ids found in resume file.
[2024-09-04 22:29:55.383] [debug] Load reads from file c:\data\Q8\SQ8\20240830_1109_MN41817_FAZ83080_1e1c31e4\pod5\FAZ83080_1e1c31e4_e3695847_0.pod5
[2024-09-04 22:29:56.802] [debug] > Kits to evaluate: 1asecalling

RByers avatar Sep 05 '24 02:09 RByers