pycoQC
dorado summary as input to pycoQC
Is your feature request related to a problem? Please describe. I want to be able to use dorado summary TSV files as input to pycoQC.
Describe the solution you'd like The column names in the dorado summary file do not match what pycoQC expects, so the file is rejected. Please see the error thrown by pycoQC:
(pycoQC) bash:iscb011:/data1/greenbab/users/ahunos/apps/sandbox 1033 $ pycoQC -f /data1/greenbab/projects/methyl_benchmark_spectrum/data/preprocessed/009N_rerun/results/pod5stats/009N_2/009N_2_pod5_stats.tsv \
> -a /data1/greenbab/projects/methyl_benchmark_spectrum/data/preprocessed/009N_rerun/modbasecalls/mergebam_modkit/results/mark_duplicates/009N/009N_modBaseCalls_sorted_dup.bam \
> -o 009N_merged_pycoQC_output.html
Checking arguments values
Check input data files
Parse data files
Traceback (most recent call last):
  File "/home/ahunos/miniforge3/envs/pycoQC/bin/pycoQC", line 8, in <module>
    sys.exit(main_pycoQC())
  File "/home/ahunos/miniforge3/envs/pycoQC/lib/python3.6/site-packages/pycoQC/__main__.py", line 132, in main_pycoQC
    quiet = args.quiet)
  File "/home/ahunos/miniforge3/envs/pycoQC/lib/python3.6/site-packages/pycoQC/pycoQC.py", line 129, in pycoQC
    quiet=quiet)
  File "/home/ahunos/miniforge3/envs/pycoQC/lib/python3.6/site-packages/pycoQC/pycoQC_parse.py", line 96, in __init__
    summary_reads_df = self._parse_summary()
  File "/home/ahunos/miniforge3/envs/pycoQC/lib/python3.6/site-packages/pycoQC/pycoQC_parse.py", line 139, in _parse_summary
    optional_colnames = ["calibration", "barcode"])
  File "/home/ahunos/miniforge3/envs/pycoQC/lib/python3.6/site-packages/pycoQC/pycoQC_parse.py", line 397, in _select_df_columns
    raise pycoQCError("Column {} not found in the provided sequence_summary file".format(col))
pycoQC.common.pycoQCError: Column read_len not found in the provided sequence_summary file
Please see the head of the summary file from dorado below:
(pycoQC) bash:iscb011:/data1/greenbab/users/ahunos/apps/sandbox 1035 $ head -n 2 /data1/greenbab/projects/methyl_benchmark_spectrum/data/preprocessed/009N_rerun/results/pod5stats/009N_2/009N_2_pod5_stats.tsv
read_id filename read_number channel mux end_reason start_time start_sample duration num_samples minknow_events sample_rate median_before predicted_scaling_scale predicted_scaling_shift tracked_scaling_scale tracked_scaling_shift num_reads_since_mux_change time_since_mux_change run_id sample_id experiment_id flow_cell_id pore_type
0022b052-9be7-465d-a521-a23e13ab0309 009N_2.pod5 1911 1347 3 signal_positive 18064.48000000 72257920 0.88500000 3540 0 4000 200.35083008 NaN NaN NaN NaN 0 0.00000000 372e3d9e-cb76-4a59-b378-74df73a6bd3a 7N-2 not_set PAM57680 not_set
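To make the mismatch concrete, here is a minimal sketch that checks a TSV header against the columns pycoQC appears to need. The `ASSUMED_REQUIRED` mapping and the helper `missing_required_columns` are hypothetical and written only for illustration. The traceback above confirms only that a `read_len` column is required; the other internal names and all the accepted header aliases are assumptions, not pycoQC's actual list.

```python
import csv
import io

# ASSUMPTION: internal column name -> header names that would satisfy it.
# Only "read_len" being required is confirmed by the traceback; the rest
# of this mapping (including every alias) is illustrative.
ASSUMED_REQUIRED = {
    "read_id": ["read_id"],
    "run_id": ["run_id"],
    "channel": ["channel"],
    "start_time": ["start_time"],
    "read_len": ["sequence_length_template", "sequence_length"],
    "mean_qscore": ["mean_qscore_template", "mean_qscore"],
}


def missing_required_columns(handle, required=ASSUMED_REQUIRED):
    """Return the internal names for which no accepted header name is present."""
    header = next(csv.reader(handle, delimiter="\t"))
    present = set(header)
    return [name for name, aliases in required.items()
            if not present.intersection(aliases)]


# Header copied from the file shown above.
HEADER = [
    "read_id", "filename", "read_number", "channel", "mux", "end_reason",
    "start_time", "start_sample", "duration", "num_samples", "minknow_events",
    "sample_rate", "median_before", "predicted_scaling_scale",
    "predicted_scaling_shift", "tracked_scaling_scale", "tracked_scaling_shift",
    "num_reads_since_mux_change", "time_since_mux_change", "run_id",
    "sample_id", "experiment_id", "flow_cell_id", "pore_type",
]

missing = missing_required_columns(io.StringIO("\t".join(HEADER) + "\n"))
print(missing)
```

Under these assumptions the file has no read-length or quality-score column at all, so renaming columns alone would not be enough; those values would have to come from a summary that includes them.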