mapDamage
Plot_composition_2
Hello,
Sorry for asking again. I wrote in the previous topic, but it was closed: https://github.com/ginolhac/mapDamage/issues/33
I've tried the mapDamage tool on its own and as part of the Paleomix tools, and I get different fragmentation plots.
mapDamage alone (downsample 100000)
- There are a lot of small dots for every chromosome.
- There are frequency peaks at the beginning of the merged sequence and a plateau at the end.
mapDamage as part of Paleomix (default -downsample 100000)
- Only one big dot for the mean frequency (the reference was the same, with 22 chromosomes).
- There are peaks at both the beginning and the end of the merged sequence.
The peaks at the beginning could be due to C->T changes, but I don't understand why there are peaks at the end. Could you comment on it? I tried to find a detailed explanation in the article, but the site geogenetics.ku.dk/publications/mapdamage2.0/ doesn't work.
Many thanks, Valery
Hi, you are right, the geogenetics site is gone. That's unfortunate, but it is not in our hands to fix. As for your question ("more of a comment than a question"), I guess you realize we cannot know what you ran to obtain those 2 plots. My own guess is that the second one (with paleomix) was obtained using only collapsed paired-end reads, so full-length templates. This allows recovering the ancient DNA fragmentation signal at both ends. As for the missing G>A, I don't know.
Unfortunately, both plots are for the same single-end reads, and the protocol was for single-stranded DNA. Am I right that with single-end reads I should see C->T changes only at the 5' end, and peaks in the top (composition) plots only at the beginning? I found this recommendation in the 2013 article:

"If the sample is from a single strand library build preparation as described in Meyer et al., 2012 then we can run the mapDamage using the following command.
mapDamage -i seq.bam -r ref.fa --single_stranded
You should see elevated C→T substitutions frequency at both ends in the posterior predictive plot with this option. If using the single ends instead of merged paired ends then we suggest using only the forward part of the sequences.
mapDamage -i seq.bam -r ref.fa --forward"

So does this mean that with single-stranded paired-end reads I should see "elevated C→T substitutions frequency at both ends", and with single-stranded single-end reads I should see C→T substitutions only at the 5' end?
Many thanks, Valery

Upd.: it seems there are only big dots on the Paleomix plot because of the flag --merge-reference-sequences. There is no such flag in mapDamage.
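
To illustrate why merging reference sequences could turn many small dots into one big dot, here is a minimal sketch. It is not mapDamage or Paleomix code: the counts, the chromosome names, and the function names are all hypothetical, and it simply assumes that "merging" means pooling base counts across chromosomes before computing a frequency.

```python
from collections import Counter

# Hypothetical toy data: base counts observed at ONE position of the
# composition plot, tabulated separately for each reference sequence.
counts_by_chrom = {
    "chr1": Counter(A=30, C=20, G=25, T=25),
    "chr2": Counter(A=10, C=40, G=30, T=20),
    "chr3": Counter(A=5, C=5, G=5, T=5),
}


def base_frequency(counts, base):
    """Fraction of observations at this position that are `base`."""
    total = sum(counts.values())
    return counts[base] / total if total else 0.0


def per_reference_frequencies(counts_by_chrom, base):
    """One frequency per chromosome: one small dot each on the plot."""
    return {chrom: base_frequency(c, base) for chrom, c in counts_by_chrom.items()}


def merged_frequency(counts_by_chrom, base):
    """Pool counts over all chromosomes first: a single dot on the plot."""
    merged = Counter()
    for c in counts_by_chrom.values():
        merged.update(c)
    return base_frequency(merged, base)


per_ref = per_reference_frequencies(counts_by_chrom, "C")
merged = merged_frequency(counts_by_chrom, "C")
print(per_ref)  # three values, one per chromosome
print(merged)   # one pooled value
```

With 22 chromosomes the first function would give 22 values per position and the second still only one, which would match the difference between the two plots if the pooling assumption holds.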