megalodon KeyError: 'stratify_type is not a file in the archive'

Hi Marcus,

I was following the "Modified Base in Known Context" pipeline. I have finished calibration and generated "megalodon_mod_calibration.npz".

Next I would like to re-run megalodon with the newly taiyaki-trained model and the calibration file, but I hit an error:

...
    self._load_calibration()
  File "/clusterdata/uqjxu8/scratch/anaconda3/envs/vac/lib/python3.9/site-packages/megalodon/calibration.py", line 562, in _load_calibration
    self.stratify_type = str(calib_data[MOD_STRAT_TYPE_TXT])
  File "/clusterdata/uqjxu8/scratch/anaconda3/envs/vac/lib/python3.9/site-packages/numpy/lib/npyio.py", line 260, in __getitem__
    raise KeyError("%s is not a file in the archive" % key)
KeyError: 'stratify_type is not a file in the archive'
srun: error: gpunode-0-7: task 0: Exited with exit code 1
...

Compared to the last successful run, I only changed --disable-mod-calibration to --mod-calibration-filename $WDR/megalodon_calibrated/mod_calibration_statistics.npz:

megalodon $WDR/data/drna/20220321_VAC_mU_fast5 \
    --outputs basecalls mappings mod_mappings mods per_read_mods \
    --output-directory $WDR/megalodon_validate/20220321_VAC_mU_fast5 \
    --reference $WDR/reference_sequence/2021-8_pBASE1_eGFP_sequence_2.fasta \
    --guppy-server-path $WDR/ont-guppy/bin/guppy_basecall_server \
    --rna \
    --guppy-config rna_r9.4.1_70bps_hac_mU.cfg \
    --mod-calibration-filename $WDR/megalodon_calibrated/mod_calibration_statistics.npz \
    --devices 0,1 \
    --processes 40 \
    --overwrite

Would you have any advice for that, please? Thanks, Jon

Jun 16 '22 05:06 jon-xu

Could you confirm the contents of the calibration file? The following short Python snippet should list the arrays it contains:

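import os
import numpy as np
# Python does not expand shell variables in string literals, so expand $WDR explicitly.
calib_path = os.path.expandvars("$WDR/megalodon_calibrated/mod_calibration_statistics.npz")
with np.load(calib_path) as calib_fp:
    print(calib_fp.files)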
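If it helps, here is a slightly extended sketch of the same check that also reports whether the `stratify_type` key from the KeyError is present. The helper name `describe_npz` is made up for illustration and is not part of megalodon. The only key it looks for is the one named in the traceback, so a file that passes this check may still be missing something else megalodon expects.

```python
import os
import sys

import numpy as np

# Key name taken from the KeyError in the traceback; this is the only key
# checked here. Any other keys megalodon may require are not covered.
REQUIRED_KEY = "stratify_type"


def describe_npz(path):
    """Return (sorted list of array names, whether REQUIRED_KEY is present)."""
    # expandvars lets a path such as "$WDR/..." work when WDR is set
    # in the environment of the Python process.
    with np.load(os.path.expandvars(path)) as npz:
        keys = sorted(npz.files)
    return keys, REQUIRED_KEY in keys


if __name__ == "__main__":
    # Usage: python check_npz.py FILE1.npz [FILE2.npz ...]
    for npz_path in sys.argv[1:]:
        keys, has_key = describe_npz(npz_path)
        status = "has" if has_key else "is MISSING"
        print(f"{npz_path}: {keys} ({status} {REQUIRED_KEY!r})")
```

Running it on both the file passed to --mod-calibration-filename and the "megalodon_mod_calibration.npz" mentioned in the first post would show which of the two, if either, carries the key.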

Jul 06 '22 22:07 marcus1487

Hi Marcus, The contents are:

['all_mod_bases', 'u_mod_llrs', 'u_can_llrs']

Jul 11 '22 00:07 jon-xu

What was the output of the calibrate command? In the code, the command that saves the calibration information should include the stratification type (see here). I'm not sure how that command could have skipped it.

Jul 18 '22 21:07 marcus1487

So the error log of the batch file looks like this:

(base) [uqjxu8@wiener vac]$ cat logs/megalodon_calibrate_error.txt
[20:33:16] Parsing log-likelihood ratios
[20:33:21] Computing u modified base calibration.
[20:33:21] Computing reference emperical density.
100%|██████████| 396714939/396714939 [3:07:58<00:00, 35175.07it/s]
[23:41:20] Computing alternative emperical density.
100%|██████████| 105673463/105673463 [49:50<00:00, 35339.26it/s]
[00:31:10] Setting new input llr range for more robust calibration (-23, 3)
[00:31:10] Computing new reference emperical density.
100%|██████████| 396714939/396714939 [3:00:59<00:00, 36530.77it/s]
[03:32:10] Computing new alternative emperical density.
100%|██████████| 105673463/105673463 [46:06<00:00, 38191.69it/s]
[04:18:17] Saving calibrations to file.

The output log is empty.

The output of "calibrate generate_modified_base_stats" is "mod_calibration_statistics.npz", and the output of "calibrate modified_bases" is "calibrated.pdf".

Thanks!

Jul 18 '22 23:07 jon-xu

Just attached the two result files here:

https://cloudstor.aarnet.edu.au/plus/s/KHg7s6wsU0aGjU0 calibrated.pdf

Jul 18 '22 23:07 jon-xu

Hi Marcus, have you had a chance to look at the attached result files, please?

Oct 04 '22 05:10 jon-xu
