ShortStack
Different library sizes
Hi Mike,
I have a question regarding the ShortStack analysis of samples with very different starting numbers of reads. My analysis has 4 conditions with 3 replicates each. In two of these conditions, one replicate has many more reads than the others: about 2 times more in one condition and almost 4 times more in the other.
I ran the ShortStack analysis independently on the four conditions, and I find more clusters in the conditions with unequal library sizes than in the others. I tried to correct for this by keeping only clusters that are identified in all three replicates with a minimum of 3 reads per replicate. I also filter on the coefficient of variation of the read counts, which must be less than or equal to 50% across the three replicates of each condition.
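The filter described above can be sketched as follows. This is a minimal illustration, not the actual script used: the cluster names and read counts are hypothetical, and the use of the sample standard deviation for the coefficient of variation is an assumption.

```python
from statistics import mean, stdev

MIN_READS = 3   # minimum reads required in every replicate
MAX_CV = 0.50   # maximum allowed coefficient of variation (50%)


def coefficient_of_variation(counts):
    """Sample standard deviation divided by the mean of the counts."""
    m = mean(counts)
    if m == 0:
        return float("inf")
    return stdev(counts) / m


def keep_cluster(counts, min_reads=MIN_READS, max_cv=MAX_CV):
    """Keep a cluster only if every replicate reaches min_reads
    and the counts vary by at most max_cv across replicates."""
    if any(c < min_reads for c in counts):
        return False
    return coefficient_of_variation(counts) <= max_cv


# Hypothetical read counts per cluster for the three replicates of one condition.
clusters = {
    "cluster_A": [10, 12, 11],  # consistent across replicates
    "cluster_B": [5, 6, 40],    # one replicate dominates, so the CV is high
    "cluster_C": [2, 30, 28],   # one replicate is below the 3-read minimum
}

kept = [name for name, counts in clusters.items() if keep_cluster(counts)]
print(kept)
```

Note that when the filter is applied to raw counts, a replicate with a much larger library raises the coefficient of variation on its own, independently of any biological difference.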
I can manage the effect of the size difference for the condition with a replicate that has 4 times more reads (the variation between replicates is so large that the filtering based on the coefficient of variation works well). However, for the condition with a replicate that has 2 times more reads, the impact of the library size difference persists in the final number of clusters found.
I used the mincov parameter in rpm, which according to the documentation takes into account differences in library size, but I'm not sure that this is enough.
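For reference, a threshold expressed in reads per million corresponds to a raw read count that scales with library size. The sketch below shows only that arithmetic. It assumes the rpm value is computed against each library's total read count, and the library sizes and the 0.5 rpm threshold are hypothetical, not values from this analysis.

```python
def rpm_to_raw_reads(rpm_threshold, library_size):
    """Convert a reads-per-million threshold to a raw read count
    for a library containing library_size reads."""
    return rpm_threshold * library_size / 1_000_000


# Hypothetical library sizes: one replicate has twice as many reads.
library_sizes = {"rep1": 10_000_000, "rep2": 10_000_000, "rep3": 20_000_000}
RPM_THRESHOLD = 0.5  # hypothetical threshold

raw_thresholds = {
    rep: rpm_to_raw_reads(RPM_THRESHOLD, size)
    for rep, size in library_sizes.items()
}
print(raw_thresholds)
```

Under that assumption, the replicate with twice as many reads needs twice as many raw reads to reach the same rpm threshold.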
Can you tell me how to deal with these differences and get results that I can compare between conditions? Is there something that I didn't understand or did wrong?
Thanks in advance!
Bernadette