
The bed file for AA is 0 size after running PrepareAA

Open tanzhengtang opened this issue 4 years ago • 2 comments

Hello, I have a question about preparing the BED file for AA.

my command: python /home/tang/tools/PrepareAA/PrepareAA.py -s A549 -t 4 --cnvkit_dir /home/tang/tools/cnvkit/cnvkit.py --sorted_bam /home/tang/A549/align_data/low_depth_A549_hg19_align_sort_rmdup.bam --ref hg19 -o ./ --python3_path /home/tang/miniconda3/envs/ecDNA/bin/python

The log of the run: process.txt

The contents of low_depth_A549_hg19_align_sort_rmdup_CNV_GAIN.bed:
chr5 17515656 17600656 CNVkit 11.8918395674
chr8 86556450 86841451 CNVkit 10.8396028955
chr15 21885000 21940357 CNVkit 6.31892875621
chr15 22297017 22591206 CNVkit 6.0158141596
chr17 21301608 21361608 CNVkit 5.59799423044
chr17 21506608 21686654 CNVkit 5.226082864
chr19 48403231 48463235 CNVkit 6.90307902477
chr19 50593385 50643388 CNVkit 7.40679804001

PrepareAA does produce A549_AA_CNV_SEEDS.bed, but the file is empty. Is this expected? If not, what should I do to resolve it?

Thanks very much!

tanzhengtang avatar Dec 06 '20 12:12 tanzhengtang

Hi,

The regions identified by CNVkit are filtered by the AA script 'amplified_intervals.py' to remove regions that are too small to be candidate AA amplicons, too low in CN, or, most importantly, those with large amounts of repetitive sequence content. These filtering steps are important for removing potential false-positive regions, as even a normal genome will often show areas of sharp CN increase when aligned back to the reference, for reasons related to the reference genome and the mapping of the reads.

Best, Jens

jluebeck avatar Dec 07 '20 17:12 jluebeck
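
For readers who want to check their own CNV_GAIN.bed against the first two criteria mentioned above (region size and copy number), here is a minimal sketch. It is not the code of amplified_intervals.py: the names `MIN_SIZE_BP`, `MIN_CN`, `parse_bed_line` and `passes_size_and_cn` are hypothetical, and the threshold values are illustrative assumptions rather than values taken from the AA source. The repetitive-content filter is not modeled at all, so a region that passes here can still be dropped by the real script.

```python
# Illustrative thresholds only (assumptions, not read from amplified_intervals.py).
MIN_SIZE_BP = 50_000   # assumed minimum seed size in base pairs
MIN_CN = 4.5           # assumed minimum copy number

# The eight regions reported in the question, one BED record per line.
CNV_GAIN = """\
chr5 17515656 17600656 CNVkit 11.8918395674
chr8 86556450 86841451 CNVkit 10.8396028955
chr15 21885000 21940357 CNVkit 6.31892875621
chr15 22297017 22591206 CNVkit 6.0158141596
chr17 21301608 21361608 CNVkit 5.59799423044
chr17 21506608 21686654 CNVkit 5.226082864
chr19 48403231 48463235 CNVkit 6.90307902477
chr19 50593385 50643388 CNVkit 7.40679804001
"""


def parse_bed_line(line):
    """Split one whitespace-separated record into typed fields."""
    chrom, start, end, source, cn = line.split()
    return chrom, int(start), int(end), source, float(cn)


def passes_size_and_cn(region, min_size=MIN_SIZE_BP, min_cn=MIN_CN):
    """True if the region meets both the size and the copy-number threshold.

    Repetitive-sequence content is NOT checked here.
    """
    _chrom, start, end, _source, cn = region
    return (end - start) >= min_size and cn >= min_cn


regions = [parse_bed_line(l) for l in CNV_GAIN.splitlines() if l.strip()]

for region in regions:
    chrom, start, end, _source, cn = region
    verdict = "pass" if passes_size_and_cn(region) else "fail"
    print(f"{chrom}:{start}-{end}\tsize={end - start}\tCN={cn:.2f}\t{verdict}")
```

If a region passes both checks under the thresholds you choose and is still missing from the seeds file, the remaining filters described above, such as repetitive sequence content, are the likely place to look. This sketch cannot confirm that on its own.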

OK, I get it. Thanks again!

tanzhengtang avatar Dec 08 '20 02:12 tanzhengtang