condition-transcript-expression question

Open steffenheyne opened this issue 8 years ago • 0 comments

Hi, thanks for isolator! I started playing around with it and it seems very useful!

I am trying to understand the different summarize functions.

My samples.yaml looks like this:

KO_young:
 RE_8wks_KO_01:  bam/RE_8wks_KO_01.bam
 RE_8wks_KO_04:  bam/RE_8wks_KO_04.bam
 RE_8wks_KO_05:  bam/RE_8wks_KO_05.bam

KO_old:
 RE_25wks_KO_02:  bam/RE_25wks_KO_02.bam
 RE_25wks_KO_03:  bam/RE_25wks_KO_03.bam

ctrl_young:
 RE_8wks_HET_01: bam/RE_8wks_HET_01.bam
 RE_8wks_HET_02: bam/RE_8wks_HET_02.bam
 RE_8wks_HET_03: bam/RE_8wks_HET_03.bam
 RE_8wks_WT_01: bam/RE_8wks_WT_01.bam
 RE_8wks_WT_03: bam/RE_8wks_WT_03.bam
 RE_8wks_WT_04: bam/RE_8wks_WT_04.bam

ctrl_old:
 RE_25wks_HET_01:  bam/RE_25wks_HET_01.bam
 RE_25wks_HET_02:  bam/RE_25wks_HET_02.bam
 RE_25wks_WT_02:  bam/RE_25wks_WT_02.bam
 RE_25wks_WT_03:  bam/RE_25wks_WT_03.bam
 RE_25wks_WT_04:  bam/RE_25wks_WT_04.bam

Now with isolator summarize condition-transcript-expression isolator-output.4_cond.h5

I get a file "condition-transcript-expression" starting with:

gene_name	gene_id	transcript_id	KO_young_adjusted_tpm	KO_young_adjusted_tpm	KO_young_adjusted_tpm	KO_old_adjusted_tpm
mt-Tf	ENSMUSG00000064336.1	ENSMUST00000082387.1	3.459316e-02	5.277997e-02	3.047849e-02	2.762534e-02
mt-Rnr1	ENSMUSG00000064337.1	ENSMUST00000082388.1	7.140579e+01	1.003876e+02	7.329236e+01	6.638102e+01
mt-Tv	ENSMUSG00000064338.1	ENSMUST00000082389.1	5.484299e-03	7.384523e-03	6.614153e-03	4.906841e-03
...

What are columns 4-7? Why does the same column name appear three times? I would expect to see my four different conditions in the header.
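A minimal sketch for confirming that the repeated names are really in the file header, and not an artifact of how the file was viewed. The helper name count_header_names is made up for this illustration and is not part of isolator; the header line is copied from the output above.

```python
import csv
import io
from collections import Counter


def count_header_names(handle):
    """Count how often each column name occurs in the first line of a TSV."""
    header = next(csv.reader(handle, delimiter="\t"))
    return Counter(header)


# Header line copied from the output above. For the real file, pass
# open("condition-transcript-expression") instead of this StringIO.
sample = io.StringIO(
    "gene_name\tgene_id\ttranscript_id\t"
    "KO_young_adjusted_tpm\tKO_young_adjusted_tpm\t"
    "KO_young_adjusted_tpm\tKO_old_adjusted_tpm\n"
)

counts = count_header_names(sample)
duplicates = {name: n for name, n in counts.items() if n > 1}
print(duplicates)  # {'KO_young_adjusted_tpm': 3}
```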

Is each column the "mean" expression value of one condition?

What is the best way to get a "mean" expression value per condition that matches (or, with some simple approximation, comes close to) the expression values used to compute "median_log2_fold_change" in a "differential-transcript-expression.tsv" file?
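To make the comparison concrete, this is the kind of naive approximation meant here: a plain log2 ratio of two per-condition point estimates. It is only a sketch, and I do not know whether it corresponds to how isolator derives "median_log2_fold_change". The function name, the pseudocount parameter, and the input values are all made up for illustration.

```python
import math


def naive_log2_fold_change(tpm_a, tpm_b, pseudocount=0.0):
    """Naive log2(b / a) from two per-condition expression estimates.

    NOT isolator's method; just a simple point-estimate ratio.
    The optional pseudocount (hypothetical) avoids division by zero.
    """
    return math.log2((tpm_b + pseudocount) / (tpm_a + pseudocount))


# Made-up values, not taken from any isolator output.
fold_change = naive_log2_fold_change(8.0, 2.0)
print(fold_change)  # -2.0
```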

Thanks!

Jan 20 '17 09:01 steffenheyne