Reproducibility of AmpliconArchitect

Open EunchongHuang opened this issue 2 years ago • 1 comments

Dear AmpliconArchitect Team, I would like to ask you a few questions regarding the samples you used in the manuscript. I used the same WGS data from BioProject (accession number: PRJNA437014, only KT samples) and ran PrepareAA in default mode, since I'm less experienced with AA. I then obtained the number of amplicons and amplified oncogenes within each sample and compared my results with your results from figshare.

I found out that only KT22, KT26, KT31, KT32, KT33, KT34 had such different results (the rest of them resulted in no amplicons or amplified oncogenes). Can you please tell me why there is such a difference? If it is okay, could you please tell me the actual parameter set you used when you ran the pipeline?

EunchongHuang avatar Sep 02 '22 08:09 EunchongHuang

Hi,

Can you clarify whether KT22, KT26, KT31, KT32, KT33, KT34 were the samples you found to be the same, or whether those were the samples that differed? Would you be able to share the output files, particularly the log files and stdout (PAA & AA), for a representative example such as KT11? Would you also be able to share the exact commands you ran? If this is too much information to put into a comment, you can email me at jluebeck [a t ] ucsd.edu
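To make the "same vs. different" question unambiguous, a per-sample comparison could be tabulated along these lines. This is only a minimal sketch: the function name `compare_counts`, the sample IDs, and the counts are hypothetical placeholders, not values from the paper, figshare, or this issue, and it assumes the amplicon counts have already been collected into one dictionary per source.

```python
def compare_counts(mine, published):
    """Split samples into those whose amplicon counts match and those that differ."""
    same, different = [], []
    for sample in sorted(set(mine) | set(published)):
        # Assumption: a sample absent from one table had 0 amplicons reported.
        a = mine.get(sample, 0)
        b = published.get(sample, 0)
        (same if a == b else different).append((sample, a, b))
    return same, different


# Hypothetical counts, for illustration only.
mine = {"sampleA": 0, "sampleB": 2, "sampleC": 1}
published = {"sampleA": 0, "sampleB": 3, "sampleC": 1}

same, different = compare_counts(mine, published)
for sample, a, b in different:
    print(f"{sample}: mine={a} published={b}")
```

Posting the `different` list together with the commands and logs would show exactly which samples disagree and by how much.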

One difference between the 2019 paper and current best practices is that the CNV tool readDepth was used for CNV seeding in the 2019 paper, whereas PrepareAA typically uses CNVkit (though users can provide their own CNV calls as well).

If you downloaded BAM files directly from SRA, keep in mind that SRA strips certain tags from the BAM files, so for best reproducibility, re-aligning the fastq files with bwa mem is somewhat better.

Thanks, Jens

jluebeck avatar Sep 02 '22 16:09 jluebeck