How to Process or Filter the Merged GTF File After Using IsoQuant
Hi,
First, I want to express my appreciation for the IsoQuant tool; it has been incredibly helpful and time-saving.
I have a question regarding the post-processing of GTF files. After running IsoQuant on each of my samples, I obtained a transcript_model.gtf file per sample. I then used gffcompare to merge all these GTF files into a single consolidated GTF file.
My question is: How should I go about removing redundancy or filtering the merged GTF file? I have reviewed several papers but couldn't find specific details on this step.
Any guidance would be greatly appreciated.
Thank you very much!
Zhong
Dear @lebronzhong
Thank you for the feedback.
I think gffcompare removes redundant transcripts by itself, and if I'm not mistaken, there are a few options that control the merging strategy, but I'm not the best person to ask.
I can also suggest running IsoQuant with all your samples provided together. This way you will obtain a single GTF without redundant transcripts. This will also generate a per-sample count table.
Best Andrey
Dear Andrey,
My samples are divided into 5 large groups, each containing 165 BAM files. I want to generate one final GTF file. Do I need to list the BAM files in a YAML file? Thank you.
Best, Zhong
@lebronzhong
Wow, that's a large dataset.
Yes, you can provide them through YAML or the command line; just make sure they are all treated as one experiment and a single output is generated.
Also, reading that many BAM files from disk at once may slow the run down. Merging them into a single BAM, or into 5 BAMs (one per group), might be beneficial.
Best Andrey
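
For anyone following the suggestion to merge the BAMs per group first, here is a minimal sketch of how the merge commands could be assembled. It only builds `samtools merge` argument lists and does not run anything. The function name `build_merge_commands`, the directory layout, the group names, and the output file names are all hypothetical and not taken from this thread; it also assumes the input BAMs are already coordinate-sorted, which `samtools merge` expects.

```python
import shlex
from pathlib import Path


def build_merge_commands(groups, out_dir="merged", threads=4):
    """Return one `samtools merge` command (as an argument list) per group.

    groups: dict mapping a group name to a list of BAM paths (hypothetical layout).
    """
    commands = {}
    for name, bams in groups.items():
        if not bams:
            raise ValueError(f"group {name!r} has no BAM files")
        out_bam = Path(out_dir) / f"{name}.merged.bam"
        # General form: samtools merge [options] <out.bam> <in1.bam> ... <inN.bam>
        commands[name] = [
            "samtools", "merge", "-@", str(threads),
            str(out_bam), *sorted(str(b) for b in bams),
        ]
    return commands


if __name__ == "__main__":
    # Assumed example layout: 5 groups with 165 BAM files each.
    groups = {
        f"group{g}": [f"bams/group{g}/sample{i:03d}.bam" for i in range(1, 166)]
        for g in range(1, 6)
    }
    for name, cmd in build_merge_commands(groups).items():
        # Print only the start of each command as a preview.
        print(name, shlex.join(cmd[:6]), "...")
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` once the output directory exists. The merged BAMs would still need to be indexed (for example with `samtools index`) before being given to IsoQuant. Note that merging everything into one BAM drops the per-file distinction in the count table, whereas one BAM per group keeps a per-group column.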