bonito
Extremely long runtime with trained models when compared to default modified basecalled model
Hello again,
Now I am seeing much slower basecalling speeds with my in-house models compared to the pretrained ones.
Speed with the in-house model:
> calling: 0%|1 | 26341/7455607 [48:10<226:26:33, 9.11 reads/s]
> calling: 0%|1 | 26347/7455607 [48:10<226:26:09, 9.11 reads/s]
> calling: 0%|1 | 26353/7455607 [48:11<226:27:05, 9.11 reads/s]
> calling: 0%|1 | 26359/7455607 [48:12<226:28:00, 9.11 reads/s]
> calling: 0%|1 | 26365/7455607 [48:13<226:27:18, 9.11 reads/s]
> calling: 0%|1 | 26371/7455607 [48:14<226:28:46, 9.11 reads/s]
> calling: 0%|1 | 26377/7455607 [48:15<226:30:18, 9.11 reads/s]
Speed with the pretrained model:
> calling: 0%|1 | 20831/7455607 [03:52<23:03:42, 89.55 reads/s]
> calling: 0%|1 | 20856/7455607 [03:53<23:04:21, 89.51 reads/s]
> calling: 0%|1 | 20881/7455607 [03:53<23:04:57, 89.47 reads/s]
> calling: 0%|1 | 20906/7455607 [03:53<23:05:39, 89.42 reads/s]
> calling: 0%|1 | 20931/7455607 [03:54<23:05:57, 89.41 reads/s]
I have also noticed that the pretrained models with the previous version of remora/bonito (from PyPI) had a read-preprocessing step at the beginning, but my in-house model produced no such log.
Both runs were executed on the exact same GPU and cluster node. This is a problem because our sysadmin kills any job that runs for more than 7 days, and termination leaves a corrupted BAM file. I don't understand how there can be such a discrepancy when calling exactly the same bases.
Is this a problem in bonito itself, or a problem with using custom models (they were trained on 80 million chunks)?
This is the header of my model:

```json
{
    "creation_date": "03/06/2023, 22:11:23",
    "kmer_context_bases": [4, 4],
    "chunk_context": [50, 50],
    "base_pred": false,
    "mod_bases": "m",
    "refine_kmer_center_idx": -1,
    "refine_do_rough_rescale": 0,
    "refine_scale_iters": -1,
    "refine_algo": "dwell_penalty",
    "refine_half_bandwidth": 5,
    "base_start_justify": false,
    "offset": 0,
    "model_params": {
        "size": 64,
        "kmer_len": 9,
        "num_out": 2
    },
    "mod_long_names_0": "5mC",
    "num_motifs": "1",
    "motif_0": "N",
    "motif_offset_0": "0",
    "refine_kmer_levels": "\u0000\u0000\u2514\u007f",
    "refine_sd_arr": "\u0000\u00004A\u2550\u2560\u001cA33\u0007Agf\u00b5@\u00dc\u00d6\u2534@",
    "doc_string": "Nanopore Remora model",
    "model_version": 3
}
```
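Not part of the original report, but a quick sanity check that may help narrow this down: if you can dump the metadata headers of both models to JSON (as shown above for the in-house model), diffing the fields that most affect runtime (`chunk_context`, `kmer_context_bases`, `model_params`) can reveal whether the two models are doing comparable amounts of work per read. This is a minimal sketch using only the standard library; the `pretrained` values below are hypothetical placeholders, not the real pretrained model's header.

```python
import json

# Metadata from the in-house model header quoted above.
inhouse = json.loads("""{
    "kmer_context_bases": [4, 4],
    "chunk_context": [50, 50],
    "model_params": {"size": 64, "kmer_len": 9, "num_out": 2}
}""")

# HYPOTHETICAL values for the pretrained model, for illustration only;
# replace with the actual header dumped from the pretrained model file.
pretrained = json.loads("""{
    "kmer_context_bases": [4, 4],
    "chunk_context": [50, 50],
    "model_params": {"size": 96, "kmer_len": 9, "num_out": 2}
}""")

def diff_metadata(a, b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    return {k: (a.get(k), b.get(k))
            for k in set(a) | set(b)
            if a.get(k) != b.get(k)}

print(diff_metadata(inhouse, pretrained))
```

If the diff comes back empty, the per-read work should be similar, which would point at something else (e.g. the missing preprocessing step noted above) rather than at the model architecture.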