Questions of Circle-Map Repeats and Realignment

Open Bio-MingChen opened this issue 3 years ago • 1 comments

I am new to eccDNA identification. After reading the two Circle-Map articles and the tutorials, I have some questions about the Repeats subcommand. Why does Repeats not need discordant read pairs and split reads to be extracted before calling circular DNA? How should I combine the Repeats and Realign commands for eccDNA calling? Finally, the Nucleic Acids Research article says that circular DNA intervals overlapping by more than 50% are merged iteratively. How is this process implemented? Is there software to do this? Sorry for so many questions. Looking forward to your reply! Best wishes

Bio-MingChen Sep 01 '20 06:09

Dear @Bio-MingChen,

Thanks a lot for your interest in our work. And do not be sorry for asking questions. I am glad to help if I can.

Regarding the Circle-Map Repeats algorithm:

Many circular DNAs are formed by the homologous recombination machinery from direct repeats in the genome. These circles are very hard to detect because the reads spanning the breakpoint align to too many genomic regions. Hence, split reads and discordant reads are not a good signal for detecting them, as they fail to align well. Instead, we can take advantage of the fact that reads originating from repetitive circular DNA will align to many places. Reads with multiple alignments allow us to approximate the boundaries of the breakpoints. Then, we can use read coverage within the circle coordinates as a hard filter to get a robust set of repetitive circular DNA calls.
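To make the last step concrete, here is a minimal sketch of a coverage-based hard filter. It is only an illustration of the idea, not Circle-Map's code: the function names, the per-base coverage list, and both thresholds (`min_mean_depth`, `min_covered`) are assumptions made for this example.

```python
def coverage_stats(coverage, start, end):
    """Return (mean depth, fraction of bases with depth > 0) for [start, end)."""
    window = coverage[start:end]
    if not window:
        return 0.0, 0.0
    mean_depth = sum(window) / len(window)
    covered = sum(1 for depth in window if depth > 0) / len(window)
    return mean_depth, covered


def passes_coverage_filter(coverage, start, end,
                           min_mean_depth=1.0, min_covered=0.8):
    """Keep a candidate circle only if its interval is well covered.

    Both thresholds are hypothetical values chosen for illustration.
    """
    mean_depth, covered = coverage_stats(coverage, start, end)
    return mean_depth >= min_mean_depth and covered >= min_covered


# Toy per-base coverage: only positions 10..19 have reads.
coverage = [0] * 10 + [5] * 10 + [0] * 10
keep_tight = passes_coverage_filter(coverage, 10, 20)  # candidate matches the covered region
keep_wide = passes_coverage_filter(coverage, 0, 30)    # candidate is mostly uncovered
```

Requiring both a minimum depth and a minimum covered fraction means that a candidate whose boundaries were only approximated from multi-mapping reads is kept when reads actually pile up across its whole interval.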

Regarding:

How should I combine the Repeats and Realign commands for eccDNA calling?

You can simply concatenate the output files of the two commands. The two sets of calls could in principle be redundant, but in my experience they are not.
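A minimal sketch of that concatenation, assuming both commands were run with plain-text output files that hold one call per line (for example BED). The file names in the usage comment are hypothetical.

```python
def concatenate_calls(paths, out_path):
    """Append the lines of every input file to out_path, in the given order.

    Blank lines are skipped. Returns the number of lines written.
    """
    n_lines = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    if not line.strip():
                        continue  # skip blank lines
                    out.write(line.rstrip("\n") + "\n")
                    n_lines += 1
    return n_lines


# Hypothetical usage:
# concatenate_calls(["realign_calls.bed", "repeats_calls.bed"], "all_calls.bed")
```

If the two outputs have different columns, keep a note of which lines came from which command before doing any downstream filtering.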

Regarding:

The Nucleic Acids Research article says that circular DNA intervals overlapping by more than 50% are merged iteratively. How is this process implemented?

I do not think that you need to worry about this. The data in our NAR paper was analyzed with an old version of Circle-Map that tended to overcall circles in high-coverage regions, producing several calls that would more likely be explained by a single circular DNA. In that paper, we handled this issue by merging the intervals. However, in our BMC Bioinformatics paper (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3160-3), Circle-Map implements allele frequency filters to handle this issue. In any case, if you think that this is a problem in your data and that you would benefit from applying this merging filter, drop me a mail. I will be happy to share a piece of code that merges circles.
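For readers who want to try such a merging filter in the meantime, here is a rough sketch of one way it could work. This is not the code used for the NAR paper. It assumes that "overlapping more than 50%" means the overlap covers more than half of the shorter interval, that intervals are `(chrom, start, end)` tuples, and that merging takes the union of the two intervals; all names are made up for the example.

```python
def overlap_fraction(a, b):
    """Overlap length divided by the length of the shorter interval."""
    (chrom_a, start_a, end_a), (chrom_b, start_b, end_b) = a, b
    if chrom_a != chrom_b:
        return 0.0
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0
    shorter = min(end_a - start_a, end_b - start_b)
    return overlap / shorter


def merge_circles(intervals, min_fraction=0.5):
    """Merge intervals pairwise until no pair overlaps by more than min_fraction."""
    merged = [tuple(interval) for interval in intervals]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlap_fraction(merged[i], merged[j]) > min_fraction:
                    chrom, start_i, end_i = merged[i]
                    _, start_j, end_j = merged[j]
                    # Replace the pair by its union, then rescan from the start,
                    # because the larger interval may now overlap other calls.
                    merged[i] = (chrom, min(start_i, start_j), max(end_i, end_j))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return sorted(merged)


calls = [("chr1", 100, 200), ("chr1", 120, 210), ("chr1", 205, 400), ("chr2", 100, 200)]
merged_calls = merge_circles(calls)
```

The pairwise rescan is quadratic in the number of calls, which is fine for a sketch; for large call sets a sort-and-sweep per chromosome would be the natural optimization.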

Best wishes and happy circle hunting,

Iñigo

iprada Sep 02 '20 09:09