Using Modkit with Transcriptome-Aligned BAM Files
Hi,
I am working with modkit to analyze ONT direct RNA-seq data. I generated BAM files with modification tags aligned both to the genome and to the transcriptome. The reason for aligning to the transcriptome is that we would like to investigate RNA modifications at the transcript isoform level.
However, I noticed that the modkit pileup command runs extremely slowly on transcriptome-aligned BAM files. For example, when I tested the first sample, the analysis was still not finished even after more than 10 hours of running.
Here is the command I used:
/scratch/lb4489/project/dRNA/modkit/modkit pileup ./"$i"_GS2T_merged.bam ./"$i"_modkit.bed \
--log-filepath "$i".log \
--header --ref /scratch/lb4489/bioindex/gencode.v49.transcripts.fa
Could you please let me know if there is anything wrong with the way I am running the command, or if there are adjustments I could make to improve the performance?
Hello @lbwfff, the command you have here is fine. I'm currently working on a much faster version of the pileup algorithm that should help a lot. One thing you may want to try is increasing the number of threads with -t. You can oversubscribe your machine a little and be OK, since the transcriptome has a lot of short sequences. Hold tight, a faster version is coming.
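As a rough illustration of the thread suggestion above, here is a minimal sketch that picks a slightly oversubscribed thread count and prints the resulting pileup command. The 1.5x factor, the helper name `oversubscribed_threads`, and the sample name `sample1` are assumptions for illustration only and are not values recommended in this thread. The only flag added to the original command is -t.

```shell
#!/bin/sh
# Hypothetical helper: scale a core count by a percentage, rounding up.
# $1 = number of cores, $2 = percent of cores to use (default 150, i.e. 1.5x).
oversubscribed_threads() {
  echo $(( ($1 * ${2:-150} + 99) / 100 ))
}

# Detect the core count; fall back to 4 if nproc is unavailable.
cores=$(nproc 2>/dev/null || echo 4)
threads=$(oversubscribed_threads "$cores" 150)

# Placeholder sample name (assumption); in the original post this is "$i".
sample="sample1"

# Print the command instead of running it, so it can be inspected first.
echo "modkit pileup ./${sample}_GS2T_merged.bam ./${sample}_modkit.bed" \
  "--log-filepath ${sample}.log" \
  "--header --ref /scratch/lb4489/bioindex/gencode.v49.transcripts.fa" \
  "-t ${threads}"
```

If the machine is shared or memory-limited, a smaller factor (or simply the plain core count) may be the safer choice.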