tombo
tombo preprocess annotate_raw_with_fastqs
Hello Everyone,
I am currently using Tombo version 1.5 on our university HPC cluster to analyze DNA modifications in bacteria. Previously our FAST5 data included the basecalls, so I could start directly with the resquiggle step and everything worked fine.
But our updated software now writes the FASTQs and FAST5s separately, and the FASTQs are gzipped. So I decompressed the FASTQs and tried to annotate the FAST5s with the FASTQs from the same barcode (run).
I currently always get this error:
Preparing reads and extracting read identifiers.
****** WARNING ****** Basecalls exsit in specified slot for some reads. Set --overwrite option to overwrite these basecalls.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 348/348 [00:39<00:00, 8.85it/s]
[19:31:46] Annotating FAST5s with sequence from FASTQs.
****** WARNING ****** Some FASTQ records contain read identifiers not found in any FAST5 files or sequencing summary files.
0it [01:03, ?it/s]
[19:32:50] Added sequences to a total of 0 reads.
I looked this problem up but could not find a solution that worked for me. We are now rebasecalling the data, but I would still like to know whether someone knows how this problem could be solved.
The command I am using is: tombo preprocess annotate_raw_with_fastqs --overwrite --fast5-basedir /fast5s/ --fastq-filenames /fastqs/*.fastq
Is there a problem with having multi-read versus single-read FAST5s? Or should I look more into the sequencing settings to solve this problem? In addition, the resquiggle command just gives me an error saying that basecalls are missing from my FAST5 data.
I also looked into the final_summary file, and basecalling was enabled there, so I don't understand why Tombo says that basecalls are missing from my FAST5s.
instrument=MN39041
position=
flow_cell_id=FAT59921
sample_id=Mho_4518_PG21
protocol_group_id=Mho_4518_PG21
protocol=sequencing/sequencing_MIN112_DNA_SQK-Q20EA:FLO-MIN112:SQK-NBD112-24
protocol_run_id=0aa23499-3d48-47f8-ac08-4ebd74be0aa5
acquisition_run_id=14dc527603366f0c24d86d62d46496a806336743
started=2023-02-10T16:09:52.073115+01:00
acquisition_stopped=2023-02-13T16:10:51.626632+01:00
processing_stopped=2023-02-13T16:11:32.102936+01:00
basecalling_enabled=1
sequencing_summary_file=sequencing_summary_FAT59921_0aa23499_14dc5276.txt
fast5_files_in_final_dest=732
fast5_files_in_fallback=0
fastq_files_in_final_dest=755
fastq_files_in_fallback=0
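One check I could still run to narrow this down is whether the read identifiers in my FASTQs appear in the sequencing summary at all, since the warning above says some do not. Below is a minimal sketch of such a check, not anything taken from Tombo itself. It assumes 4-line FASTQ records whose header starts with the read ID, and a tab-separated sequencing summary with a `read_id` column (the column name is my assumption and may differ between software versions). The function names are hypothetical helpers, and the demo uses made-up in-memory data instead of my real files.

```python
import csv
import io


def fastq_read_ids(handle):
    """Yield the read ID (first token after '@') of each 4-line FASTQ record."""
    for line_number, line in enumerate(handle):
        if line_number % 4 != 0:
            continue  # only every 4th line is a header in 4-line FASTQ
        header = line.strip()
        if header.startswith("@"):
            tokens = header[1:].split()
            if tokens:
                yield tokens[0]


def summary_read_ids(handle, column="read_id"):
    """Collect read IDs from a tab-separated sequencing summary.

    The column name 'read_id' is an assumption; adjust it if the
    summary file uses a different header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[column] for row in reader}


def compare_ids(fastq_handle, summary_handle):
    """Count read IDs shared between, or unique to, the two inputs."""
    fastq_ids = set(fastq_read_ids(fastq_handle))
    summary_ids = summary_read_ids(summary_handle)
    return {
        "shared": len(fastq_ids & summary_ids),
        "fastq_only": len(fastq_ids - summary_ids),
        "summary_only": len(summary_ids - fastq_ids),
    }


if __name__ == "__main__":
    # Made-up example data, only to show the call pattern.
    demo_fastq = io.StringIO("@read-a runid=x\nACGT\n+\n!!!!\n@read-b\nAC\n+\n!!\n")
    demo_summary = io.StringIO("filename\tread_id\nf.fast5\tread-a\nf.fast5\tread-c\n")
    print(compare_ids(demo_fastq, demo_summary))
```

With real data, the two handles would come from `open(...)` on one decompressed FASTQ and on the sequencing summary file. A large `fastq_only` count would suggest that the FASTQs and the summary belong to different runs or barcodes.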
So if anyone has an idea how I could solve this problem, or if I should provide any further information, please let me know.
Kind regards,
Azlan