remora
remora copied to clipboard
Regarding m6A modified basecalling
Hi,
I was trying to do m6A modified basecalling of data generated from R9 flowcell by using trained RNN model i.e. res_dna_r941_min_modbases-all-context_v001 by using guppy v3.5.1. But, unfortunately, I was unable to do modified basecalling. The basecalling was completed, but unfortunately the bam file generated after minimap2 was without any modifications. It would be nice to have your insights on the same or is there a alternative way to do modified basecalling for m6A.
Thanks
Minimap2 requires that the modified base tags be carried through the alignment stage. This can be completed with the -y
option to minimap. If this does not resolve the issue, could you post the full set of commands from the raw data through to the desired output?
Hello @marcus1487 the -y option with minimap did not work out. The full set of commands are as follows: guppy_basecaller -i <FAST5> -s <OUTPUT> -c ~/workspace/rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001.cfg --recursive --calib_reference ./PlasmoDB-61_PvivaxSal1_Genome.fasta -x cuda:0 -m ~/workspace/rerio/basecall_models/res_dna_r941_min_modbases-all-context_v001.jsn --barcode_kits "EXP-NBD104" cat fastq_* > bigfile.fastq minimap2 -t 8 -ay -x map-ont <FASTA> bigfile.fastq -o CE.sam samtools fixmate -O bam,level=1 -m CE.sam fixmate.bam samtools sort -l 1 -@8 -o pos.srt.bam -T a_prefix fixmate.bam samtools markdup -O bam,level=1 pos.srt.bam markdup.bam samtools view -@8 markdup.bam -o final.bam samtools index final.bam modkit pileup final.bam pileup.bed
Can you check at which stage you are losing the modified base calls starting with the output of guppy? Methods for confirming modified base tags can be found in the related thread on modkit. https://github.com/nanoporetech/modkit/issues/159
@PRIYANKA-22091995 Once you know which step is dropping the tags, I can advise you on how to use modkit repair to potentially replace the tags. Here are two awk
commands that should pull out the relevent text:
awk 'match($0,/Mm:Z.*/) {print substr($0,RSTART,RLENGTH)}' ${file} | cut -f 1
awk 'NR %4 == 1 && $0 ~ /Mm:Z:/'
Hello @ArtRand the two awk commands was run on fastq files and it gave no output.
Thanks