Some difference in the thread model from v0.4.2->0.5.1?

Open hasindu2008 opened this issue 4 months ago • 3 comments

@kisarur tested this on NCI Gadi with a command line like the one below:

buttery-eel -i /g/data/ox63/slow5-testdata/hg2_prom_lsk114_5khz_subsample/PGXXXX230339_reads_500k.blow5 -o /path/to/reads.fastq -g </path/to/basecaller/bin> --port 5000 --use_tcp --config <model.cfg> -x cuda:all --guppy_batchsize 20000 --max_queued_reads 20000 --slow5_threads 10 --slow5_batchsize <batchsize> --procs 20

An anomaly shows up when going from v0.4.2 to v0.5.1. With buttery-eel 0.4.2 and Dorado server 7.2.13, the run takes around 10 minutes at a slow5 batch size of 100. With buttery-eel 0.5.1 and Dorado server 7.4.12, the same batch size of 100 takes more than 2.5 hours, and the run time only comes back down to around 10 minutes when the slow5 batch size is increased to 4000.

buttery-eel0.4.2+dorado7.2.13 with dna_r10.4.1_e8.2_400bps_5khz_hac_prom model with slow5_batchsize 100:
11.5 minutes

buttery-eel0.5.1+dorado7.4.12 with dna_r10.4.1_e8.2_400bps_5khz_hac model with slow5_batchsize 100:
2 hours 40 minutes!!!!

buttery-eel0.5.1+dorado7.4.12 with dna_r10.4.1_e8.2_400bps_5khz_hac model with slow5_batchsize 4000:
10.5 minutes
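
To put rough numbers on the gap, here is a back-of-the-envelope sketch using only the three timings above. It assumes the "500k" in the input file name means about 500,000 reads, and it assumes the extra time scales linearly with the number of slow5 batches. Both are assumptions for illustration, not something measured here.

```python
# Wall-clock times reported in this issue, in minutes.
# Keys: (buttery-eel version, dorado server version, slow5 batch size)
timings_min = {
    ("0.4.2", "7.2.13", 100): 11.5,
    ("0.5.1", "7.4.12", 100): 160.0,  # 2 hours 40 minutes
    ("0.5.1", "7.4.12", 4000): 10.5,
}

# ASSUMPTION: "500k" in the file name means roughly 500,000 reads.
TOTAL_READS = 500_000


def num_batches(total_reads, batch_size):
    """Number of slow5 batches needed to cover all reads (ceiling division)."""
    return -(-total_reads // batch_size)


old = timings_min[("0.4.2", "7.2.13", 100)]
slow = timings_min[("0.5.1", "7.4.12", 100)]
fast = timings_min[("0.5.1", "7.4.12", 4000)]

# How much slower the new combination is at batch size 100.
slowdown_vs_old = slow / old
slowdown_vs_large_batch = slow / fast

# ASSUMPTION: the extra time is a fixed cost paid once per slow5 batch.
extra_batches = num_batches(TOTAL_READS, 100) - num_batches(TOTAL_READS, 4000)
extra_seconds_per_batch = (slow - fast) * 60 / extra_batches

print(f"slowdown vs 0.4.2 at batch size 100: {slowdown_vs_old:.1f}x")
print(f"slowdown vs 0.5.1 at batch size 4000: {slowdown_vs_large_batch:.1f}x")
print(f"extra batches at size 100: {extra_batches}")
print(f"implied extra cost per batch: {extra_seconds_per_batch:.2f} s")
```

If the linear assumption holds, the difference would be consistent with a fixed cost of a couple of seconds per slow5 batch, but that is only one possible reading of three data points.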

This effect of the slow5 batch size on the latest eel+dorado combination was also reproduced at https://github.com/hasindu2008/nci-scripts/issues/1#issuecomment-2392808983.

Any explanation? Is this because the _prom models do not exist in the new Dorado? Or is it due to the new Dorado server-client? Either way, I am not sure how that would affect the slow5 batch size behaviour. Something is weird.

hasindu2008, Oct 09 '24 00:10