AmpliconSuite-pipeline

Long run time with *_CNV_CALLS_pre_filtered.bed

Open WeijiaSu opened this issue 1 year ago • 4 comments

Hi Jens, I am using AmpliconSuite on a set of cancer WGS data. I have 36 samples, and most of them finished successfully, but 10 of them have been running for 5 days and still have not finished. I checked the *_CNV_CALLS_pre_filtered.bed files, and I think that for these samples they have 50-100 entries each. I wonder if this is the problem. The README says "if you notice there are > 50 CNV seeds going into AA, there may be something wrong." I assume these bed files are a little large, but is <100 entries still on a reasonable scale? If this is the issue, do you think it is OK to re-run AA for these 10 samples with each bed file split into two (so that each has <50 entries)? My command line is:

$AASuite"PrepareAA.py" -s $name -t 32 --cnvkit_dir /anaconda3/bin/cnvkit.py --fastqs $name"_R1.fastq.gz" $name"_R2.fastq.gz" --ref hg38 --cnsize_min 500 --downsample -1 --run_AA --run_AC

I used the same command line for all 36 samples, and all the samples have input fastq files of similar size.
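
For reference, here is a minimal sketch of how the seed counts could be checked across all samples. It is not part of AmpliconSuite: the function names are hypothetical, the threshold of 50 is simply taken from the README remark quoted above, and it assumes each sample's file name ends in `_CNV_CALLS_pre_filtered.bed` and that every non-empty line that is not a `#`, `track`, or `browser` header line is one seed.

```python
from pathlib import Path

# Assumption: threshold taken from the README remark about "> 50 CNV seeds".
SEED_WARNING_THRESHOLD = 50


def count_bed_entries(bed_path):
    """Count data lines in a BED file, skipping blank and header lines."""
    count = 0
    with open(bed_path) as handle:
        for line in handle:
            stripped = line.strip()
            # Skip empty lines and common BED header/comment lines.
            if not stripped or stripped.startswith(("#", "track", "browser")):
                continue
            count += 1
    return count


def flag_large_seed_files(directory, threshold=SEED_WARNING_THRESHOLD):
    """Return {file name: entry count} for seed files above the threshold."""
    flagged = {}
    pattern = "*_CNV_CALLS_pre_filtered.bed"
    for bed in sorted(Path(directory).rglob(pattern)):
        n_entries = count_bed_entries(bed)
        if n_entries > threshold:
            flagged[bed.name] = n_entries
    return flagged


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory tree and list large files.
    for name, n_entries in flag_large_seed_files(".").items():
        print(f"{name}\t{n_entries}")
```

Running this from the output directory would list only the samples whose seed file has more than 50 entries, which makes it easy to confirm whether the 10 slow samples are the same ones with large bed files.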

Thanks for your help. Weijia

WeijiaSu, Mar 07 '23 19:03