
How to Process or Filter the Merged GTF File After Using Isoquant

Open lebronzhong opened this issue 1 year ago • 3 comments

Hi,

First, I want to express my appreciation for the isoquant tool; it has been incredibly helpful and time-saving.

I have a question regarding the post-processing of GTF files. After running isoquant on all my samples, I obtained a transcript_model.gtf file for each sample. I then used gffcompare to merge all these GTF files, resulting in a consolidated GTF file.

My question is: How should I go about removing redundancy or filtering the merged GTF file? I have reviewed several papers but couldn't find specific details on this step.

Any guidance would be greatly appreciated.

Thank you very much!

Zhong

lebronzhong avatar Aug 27 '24 11:08 lebronzhong

Dear @lebronzhong

Thank you for the feedback.

I think gffcompare removes redundant transcripts by itself, and if I'm not mistaken, it has a few options that control the merging strategy, but I'm not the best person to ask.

I can also suggest running IsoQuant with all your samples provided together. This way you will obtain a single GTF without redundant transcripts, along with a per-sample count table.

Best, Andrey

andrewprzh avatar Aug 29 '24 12:08 andrewprzh

Dear Andrey,

My samples are divided into 5 large groups, each containing 165 BAM files. I want to generate a single final GTF file. Do I need to list them in a YAML file? Thank you.

Best, Zhong

lebronzhong avatar Aug 29 '24 16:08 lebronzhong

@lebronzhong

Wow, that's a large dataset.

Yes, you can provide them through YAML or the command line; just make sure they are all treated as one experiment and a single output is generated.

Also, reading that many BAM files at once from disk might be suboptimal in terms of running time. Merging them into a single BAM, or into 5 BAMs (one per group), might be beneficial.

Best, Andrey

andrewprzh avatar Sep 02 '24 17:09 andrewprzh
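
The suggestion above to merge the BAM files per group before running IsoQuant could be sketched roughly as follows. This is only a minimal illustration, not something taken from the thread: it assumes samtools is installed, that the inputs are coordinate-sorted, and that the BAMs sit in a hypothetical layout of one subdirectory per group (`bams/<group>/*.bam`). The function names and output file names are made up for the example. It only uses `samtools merge` with `-b` (file listing the inputs) and `-@` (extra threads), followed by `samtools index`.

```python
import subprocess
from pathlib import Path


def plan_group_merges(root, out_dir, threads=4):
    """Build samtools commands that merge the BAMs of each group into one BAM.

    Assumed (hypothetical) layout: root/<group>/*.bam, one subdirectory per group.
    out_dir should live outside root so it is not mistaken for a group.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for group_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        bams = sorted(group_dir.glob("*.bam"))
        if not bams:
            continue  # skip directories without BAM files
        # samtools merge -b expects a text file with one input BAM per line
        list_file = out_dir / f"{group_dir.name}.bam_list.txt"
        list_file.write_text("\n".join(str(b) for b in bams) + "\n")
        merged = out_dir / f"{group_dir.name}.merged.bam"
        commands.append(
            ["samtools", "merge", "-@", str(threads), "-b", str(list_file), str(merged)]
        )
        # index the merged BAM so downstream tools can read it by region
        commands.append(["samtools", "index", str(merged)])
    return commands


def run_commands(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_commands(plan_group_merges("bams", "merged"))` would just print the planned commands; passing `dry_run=False` would execute them. With 5 groups this yields 5 merged BAMs, which can then be given to IsoQuant together as one experiment. Note that merging collapses the individual files within a group, so any per-sample counts would then be per group rather than per original BAM.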