In most runs AA_CNV_SEEDS.bed files are empty

Open Yumo-Xie opened this issue 2 years ago • 7 comments

Hi. I ran the program on WGS data from several cell lines, starting from .fastq files and using PrepareAA.py to generate CNV calls. However, all of these runs produced empty AA_CNV_SEEDS.bed files. I then tried the recommended GBM39 test data [https://www.ncbi.nlm.nih.gov/sra/SRX5055022[accn]]. This time the program found 2 amplicons (one with EGFR and the other with MYC and PVT1): GBM39_amplicon1.pdf GBM39_amplicon2.pdf. Is this result correct? Does that mean my installation works fine, and that the empty AA_CNV_SEEDS.bed files are attributable to the data I used?

Yumo-Xie avatar Jan 05 '23 03:01 Yumo-Xie

Hi,

Your GBM39 test results appear correct. An empty seeds bed file implies there are no candidate regions of focal amplification that are detected in those samples. There is also a finish_flag file which you can check to see if AmpliconSuite-pipeline completed successfully.

Thanks, Jens
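
The two checks described above (did the run finish, and did it produce any seed regions) can be scripted. The following is a minimal sketch, not part of AmpliconSuite-pipeline: `check_run` is a hypothetical helper, and it assumes the files are named `<sample>_finish_flag.txt` and `<sample>_AA_CNV_SEEDS.bed` somewhere under the output directory, and that a successful run writes the text "All stages completed" to the flag file, as in the outputs quoted in this thread. Other versions of the pipeline may differ.

```python
from pathlib import Path


def check_run(out_dir: str, sample: str) -> dict:
    """Report whether a run finished and how many seed regions it wrote.

    Assumption: file names and the flag text follow what is quoted in
    this thread; they are not taken from the pipeline's documentation.
    """
    out = Path(out_dir)

    # Search recursively, since the exact subdirectory layout is not assumed.
    flag_files = out.rglob(f"{sample}_finish_flag.txt")
    finished = any("All stages completed" in p.read_text() for p in flag_files)

    # Count non-blank lines across any matching seeds file (0 if none exists).
    n_seeds = 0
    for seeds in out.rglob(f"{sample}_AA_CNV_SEEDS.bed"):
        with seeds.open() as fh:
            n_seeds += sum(1 for line in fh if line.strip())

    return {"finished": finished, "n_seeds": n_seeds}
```

A result such as `{"finished": True, "n_seeds": 0}` would correspond to the situation in this thread: the run completed, but no candidate regions were found in the sample.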

jluebeck avatar Jan 06 '23 23:01 jluebeck

Thank you very much! The program also worked well for COLO320DM. It seems that the empty files are attributed to my data.

Yumo-Xie avatar Jan 08 '23 07:01 Yumo-Xie

Hi, my output file is also empty:

-rw-r--r-- 1 xxx 0 Feb 9 16:50 6605D_AA_CNV_SEEDS.bed

And my finish_flag file indicates that the run completed successfully.

$ cat 6605D_finish_flag.txt
All stages completed
$ cat ./6605D_AA_results/6605D_summary.txt
#Amplicons = 0
-----------------------------------------------------------------------------------------

6605D_AA_OUT]$ cat ./6605D_classification/6605D_amplicon_classification_profiles.tsv
sample_name     amplicon_number amplicon_decomposition_class    ecDNA+  BFB+    ecDNA_amplicons

I tried several other WGS files with the same results: no cycle files, PNG or PDF files, etc. My command is:

/Parastor300s_G30S/zhangjj/software/miniconda3/bin/python3 /parastor300/work01/zhangjj/software/AmpliconSuite-pipeline/PrepareAA.py -s 6605D -t 50 --cnvkit_dir /parastor300/work01/zhangjj/software/cnvkit/cnvkit.py --bam 6605D.bam --ref GRCh38 --downsample 10.0 -o 6605D_AA_OUT --run_AA --run_AC

My WGS data is 30X. Is this problem due to downsampling to 10X, or to something else? P.S. My data are from healthy people, not cancer patients.

jingydz avatar Feb 09 '23 14:02 jingydz

Hi,

Your outputs appear to be correct. Keep in mind that focal amplifications almost exclusively occur in cancer and pre-cancer samples. If you are providing samples from healthy patients to AmpliconSuite, and it does not find any focal amplifications, then this is completely expected.

If you would like to try a cancer cell line, I suggest you try COLO320DM.

Thanks, Jens
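
When screening many samples, the per-sample amplicon count can be tallied from the summary files rather than inspected one by one. This is a minimal sketch under stated assumptions: `amplicon_counts` is a hypothetical helper (not part of AmpliconSuite-pipeline), and it assumes each summary file is named `<sample>_summary.txt` and contains a line of the form `#Amplicons = N`, as in the output quoted earlier in this thread.

```python
import re
from pathlib import Path

# Matches the count line as it appears in the summary quoted in this thread.
AMPLICON_LINE = re.compile(r"^#Amplicons\s*=\s*(\d+)", re.MULTILINE)


def amplicon_counts(root: str) -> dict:
    """Map sample name -> amplicon count for every summary file under root."""
    counts = {}
    for summary in sorted(Path(root).rglob("*_summary.txt")):
        match = AMPLICON_LINE.search(summary.read_text())
        if match:
            # Assumption: the sample name is the file name minus "_summary.txt".
            sample = summary.name[: -len("_summary.txt")]
            counts[sample] = int(match.group(1))
    return counts
```

Samples from healthy donors would be expected to show a count of 0 in most cases, while a cell line such as COLO320DM should show a nonzero count.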

jluebeck avatar Feb 09 '23 16:02 jluebeck

Thanks. So far I have run the WGS data of 39 healthy people, and 4 of the AA_CNV_SEEDS.bed files have content.

[image]

I also tried the COLO320DM cancer cell line, and it did find many focal amplifications, so the empty results are indeed attributable to my data. Thank you.

time /Parastor300s_G30S/zhangjj/software/miniconda3/bin/python3 /parastor300/work01/zhangjj/software/AmpliconSuite-pipeline/PrepareAA.py -s COLO320DM -t 10 --cnvkit_dir /parastor300/work01/zhangjj/software/cnvkit/cnvkit.py --fastqs COLO320DM_r1.fastq.gz COLO320DM_r2.fastq.gz --ref hg38 -o COLO320DM_AA_OUT --run_AA --run_AC

[image]

jingydz avatar Feb 10 '23 11:02 jingydz

Dear Jens, is it possible to detect extrachromosomal circular DNA (eccDNA) in plasma samples from patients with specific chronic diseases? Thanks!

iamyingzhou avatar Jun 08 '23 09:06 iamyingzhou

Hi Yingzhou,

AA is designed to detect large (>10 kbp), focally amplified ecDNA. If the eccDNA in question are smaller, or if they are not amplified, then AA will very likely not detect them.

Thanks, Jens

jluebeck avatar Jun 08 '23 15:06 jluebeck