
Output read counts file?

Open imillercrews opened this issue 4 years ago • 0 comments

I noticed that when multiple BAM files are used, TPMCalculator generates a table, '_data_per_samples.txt', of TPM values for each gene/transcript in each sample. To replace FeatureCounts in a workflow, I was wondering whether it would be useful to have a similar table of read counts generated?

That way you could use the TPM table to identify genes/transcripts with low TPM across samples (e.g. you only want to analyze genes with a TPM of 2 or greater in at least 75% of your samples), filter those genes out of the read counts table, and then go on to differential gene analysis directly. Wouldn't that save the step of merging '_gene.out' files across samples if you wanted read counts?
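To make the idea concrete, here is a rough sketch of the filtering step in plain Python. It assumes the TPM and read-count tables have already been loaded into dictionaries keyed by gene ID, with one value per sample in the same order; the function names and the toy values are made up for illustration and are not part of TPMCalculator.

```python
def genes_passing_tpm_filter(tpm_by_gene, min_tpm=2.0, min_fraction=0.75):
    """Return gene IDs with TPM >= min_tpm in at least min_fraction of samples."""
    kept = []
    for gene_id, tpms in tpm_by_gene.items():
        if not tpms:
            continue  # no samples for this gene, nothing to evaluate
        n_pass = sum(1 for value in tpms if value >= min_tpm)
        if n_pass / len(tpms) >= min_fraction:
            kept.append(gene_id)
    return kept


def subset_counts(counts_by_gene, gene_ids):
    """Keep only the read-count rows for the given gene IDs."""
    return {g: counts_by_gene[g] for g in gene_ids if g in counts_by_gene}


# Toy data (hypothetical): four samples per gene.
tpm = {
    "geneA": [5.0, 3.1, 2.0, 0.4],
    "geneB": [0.0, 1.2, 2.5, 0.3],
    "geneC": [10.0, 8.0, 9.5, 7.7],
}
counts = {
    "geneA": [120, 80, 55, 9],
    "geneB": [0, 30, 61, 7],
    "geneC": [250, 198, 240, 190],
}

kept = genes_passing_tpm_filter(tpm)
filtered_counts = subset_counts(counts, kept)
```

With these toy values, geneB is dropped because only one of its four samples reaches a TPM of 2, and the remaining read counts could go straight into a differential expression tool.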

Maybe a simple table such as:

Gene_Id Chr sample1_Count_Reads sample2_Count_Reads sample3_Count_Reads
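For reference, this is roughly the merge step I currently have to do by hand to get that layout. It is only a sketch: it assumes each per-sample file is tab-delimited with `Gene_Id`, `Chr`, and `Reads` columns (the `Reads` column name is my assumption and is passed as a parameter), and it fills genes missing from a sample with 0, which may or may not be the right choice. The function name and example data are hypothetical.

```python
import csv
import io


def merge_read_counts(per_sample_text, id_col="Gene_Id", chr_col="Chr", reads_col="Reads"):
    """Merge per-sample tab-delimited tables into one read-count table.

    per_sample_text maps a sample name to the text of its per-gene table.
    Column names are assumptions and can be overridden.
    """
    samples = list(per_sample_text)
    merged = {}  # (gene_id, chr) -> {sample_name: read_count}
    for sample, text in per_sample_text.items():
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        for row in reader:
            key = (row[id_col], row[chr_col])
            merged.setdefault(key, {})[sample] = int(row[reads_col])

    header = [id_col, chr_col] + [f"{s}_Count_Reads" for s in samples]
    rows = [
        [gene, chrom] + [by_sample.get(s, 0) for s in samples]  # 0 if gene absent
        for (gene, chrom), by_sample in merged.items()
    ]
    return header, rows


# Toy input (hypothetical): two samples with partly overlapping genes.
tables = {
    "sample1": "Gene_Id\tChr\tReads\ngeneA\tchr1\t10\ngeneB\tchr2\t0\n",
    "sample2": "Gene_Id\tChr\tReads\ngeneA\tchr1\t7\ngeneC\tchr3\t4\n",
}
header, rows = merge_read_counts(tables)
```

In practice the text for each sample would come from reading the per-sample output file; having the tool write this table directly would make the helper unnecessary.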

imillercrews, Jan 27 '21 17:01