
Different library sizes

Open RubioB opened this issue 3 years ago • 0 comments

Hi Mike,

I have a question about running ShortStack on samples with very different starting numbers of reads. My analysis has 4 conditions with 3 replicates each. In two of the conditions, one of the three replicates has far more reads than the others: about 2 times more in one condition and almost 4 times more in the other.

I ran ShortStack independently on each of the four conditions, and I find more clusters in the conditions with unequal library sizes than in the others. I tried to correct for this by keeping only clusters that are identified in all three replicates with a minimum of 3 reads per replicate. I also filter on the coefficient of variation of the read counts across the three replicates of each condition, which must be less than or equal to 50 %.
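To make the filter concrete, here is a minimal sketch of what I mean. The function name, the cluster names and the counts are made up for illustration, and I am assuming the coefficient of variation is the sample standard deviation divided by the mean of the raw read counts across the three replicates.

```python
from statistics import mean, stdev

def passes_filter(counts, min_reads=3, max_cv=0.5):
    """Return True if a cluster passes both replicate filters.

    counts: read counts for one cluster, one value per replicate.
    Assumption: CV = sample standard deviation / mean.
    """
    # The cluster must have at least min_reads reads in every replicate.
    if any(c < min_reads for c in counts):
        return False
    # The coefficient of variation across replicates must not exceed max_cv.
    cv = stdev(counts) / mean(counts)
    return cv <= max_cv

# Hypothetical read counts per cluster (replicate 1, 2, 3).
clusters = {
    "cluster_a": [10, 12, 11],  # consistent across replicates
    "cluster_b": [10, 12, 45],  # one replicate much higher, CV above 50 %
    "cluster_c": [2, 30, 28],   # below 3 reads in one replicate
}

kept = [name for name, counts in clusters.items() if passes_filter(counts)]
print(kept)
```

In this toy example only cluster_a is kept. The filter is applied to raw counts here, so a replicate with twice as many reads inflates the coefficient of variation by itself.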

I can manage the effect of the size difference for the condition with a replicate that has 4 times more reads, because the variation between replicates is so large that the filtering based on the coefficient of variation works well. For the condition with a replicate that has 2 times more reads, however, the impact of the difference in library size persists in the final number of clusters found.

I used the mincov parameter with rpm, which according to the documentation takes the difference in library size into account, but I'm not sure that this is enough.
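For reference, this is how I understand a threshold expressed in reads per million (rpm). It is only a sketch of the arithmetic, not taken from the ShortStack code, and the library sizes and the 0.5 rpm value are hypothetical.

```python
def rpm(raw_count, library_size):
    """Convert a raw read count to reads per million for a given library size."""
    return raw_count * 1_000_000 / library_size

def raw_threshold(mincov_rpm, library_size):
    """Raw read count that corresponds to an rpm threshold in a given library."""
    return mincov_rpm * library_size / 1_000_000

# Hypothetical library sizes: one replicate has twice as many reads.
library_sizes = {"rep1": 10_000_000, "rep2": 10_000_000, "rep3": 20_000_000}

# The same rpm threshold translates to a different raw count in each library.
thresholds = {rep: raw_threshold(0.5, size) for rep, size in library_sizes.items()}
print(thresholds)
```

With these numbers, the same rpm cutoff requires twice as many raw reads in the larger library, which is why I expected it to compensate for the size difference.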

Can you tell me how to deal with these differences and get results that I can compare between conditions? Is there something that I didn't understand or did wrong?

Thanks in advance!

Bernadette

RubioB, Jul 28 '21 14:07