cfDNApipe
multi-core error at adapter removal
Hi, I am getting an error related to multi-core usage. I am running cfDNApipe on a Slurm cluster, in a job with 24 cores available. Below is the part of the log that contains the error; the same error occurs for most files.
An Error Occured During The Following Command Line Executing.
^^^
AdapterRemoval --threads 24 --file1 /well/buck/users/xhs232/data_hcb/longwood_fastqs/DL99908_hu_S5_R1_001.fastq.gz --file2 /well/buck/users/xhs232/data_hcb/longwood_fastqs/DL99908_hu_S5_R2_001.fastq.gz --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCCACTGAAAAAAAAAATCTCGTATGCCGTCTTCTGCTTGAAAAATGGGGG --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTACGCACCTGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAGGGGGGGGGG --basename /well/buck/users/xhs232/analysis/cfdna_fragmentomics_cluster_24cores/intermediate_result/step_03_adapterremoval/DL99908huS5 --qualitybase 33 --gzip^^^
Trimming paired end reads ...
Opening FASTQ file '/well/buck/users/xhs232/data_hcb/longwood_fastqs/DL99908_hu_S5_R1_001.fastq.gz', line numbers start at 1
Opening FASTQ file '/well/buck/users/xhs232/data_hcb/longwood_fastqs/DL99908_hu_S5_R2_001.fastq.gz', line numbers start at 1
ERROR: Unhandled exception in thread:
basic_ios::clear: iostream error
ERROR: AdapterRemoval did not run to completion;
do NOT make use of resulting trimmed reads!
^^^
Please Stop The Program To Check The Error.
^^^
Traceback (most recent call last):
  File "/well/buck/users/xhs232/analysis/cfdna_fragmentomics_cluster_24cores/scripts/cfdna_pipe_run.py", line 23, in <module>
    report=True,
  File "/well/buck/users/xhs232/conda/envs/cfDNApipe/lib/python3.6/site-packages/cfDNApipe/Pipeline.py", line 148, in cfDNAWGS
    res_adapterremoval = adapterremoval(upstream=res_identifyAdapter, other_params=rmAdOP, verbose=verbose)
  File "/well/buck/users/xhs232/conda/envs/cfDNApipe/lib/python3.6/site-packages/cfDNApipe/Fun_adapterremoval.py", line 248, in __init__
    self.multiRun(args=all_cmd, func=None, nCore=1)
  File "/well/buck/users/xhs232/conda/envs/cfDNApipe/lib/python3.6/site-packages/cfDNApipe/StepBase.py", line 672, in multiRun
    raise commonError("Error occured in multi-core running!")
cfDNApipe.cfDNA_utils.commonError: Error occured in multi-core running!
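The log does not say what caused the iostream error. One thing that could be ruled out is a truncated or corrupt input file. The sketch below is only an assumption about a possible cause, not a confirmed diagnosis: it reads a gzipped FASTQ to the end and reports whether decompression succeeds. The function name `gzip_is_readable` is hypothetical and is not part of cfDNApipe or AdapterRemoval.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to the end.

    Hypothetical helper for illustration; it only checks gzip integrity,
    not whether the contents are valid FASTQ records.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks so that large files
            # never have to fit in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError: not a gzip file or unreadable path,
        # EOFError: file ends before the end-of-stream marker,
        # zlib.error: corrupt compressed data.
        return False
    return True
```

Running this on the R1 and R2 files named in the failing command would show whether the inputs themselves can be read in full. If they can, the problem is more likely on the output or threading side.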
The pipeline script I ran is:
from cfDNApipe import *
pipeConfigure(
    threads=24,
    genome="hg38",
    refdir=r"/well/buck/users/xhs232/references/cfdnapipe",
    outdir=r"/well/buck/users/xhs232/analysis/cfdna_fragmentomics_cluster_24cores",
    data="WGS",
    type="paired",
    build=True,
    JavaMem="10g",
)
res = cfDNAWGS(
    inputFolder=r"/well/buck/users/xhs232/data_hcb/longwood_fastqs",
    idAdapter=True,
    rmAdapter=True,
    dudup=True,
    CNV=True,
    armCNV=True,
    fragProfile=True,
    verbose=False,
    report=True,
)
Configure.snvRefCheck(folder="/well/buck/users/xhs232/references/cfdnapipe/hg38", build=True)
# Using bam files directly.
# Of course, the "upstream" of addRG can be from "rmduplicate".
res1 = addRG(upstream=res.rmduplicate)
res2 = BaseRecalibrator(upstream=res1, knownSitesDir=Configure.getConfig("snv.folder"))
res3 = BQSR(upstream=res2)
res4 = getPileup(upstream=res3, biallelicvcfInput=Configure.getConfig("snv.ref")["7"],)
res5 = contamination(upstream=res4)
res6 = mutect2t(
    caseupstream=res5, vcfInput=Configure.getConfig("snv.ref")["6"], ponbedInput=Configure.getConfig("snv.ref")["8"],
)
res7 = filterMutectCalls(upstream=res6)
# ???
res8 = gatherVCF(upstream=res7)
# split somatic mutations
res9 = bcftoolsVCF(upstream=res8, stepNum="somatic")
# split germline mutations
res10 = bcftoolsVCF(upstream=res8, other_params={"-f": "'germline'"}, suffix="germline", stepNum="germline")
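Since the script hard-codes `threads=24`, one possible safeguard is to take the thread count from the Slurm allocation instead, so the two cannot drift apart. This is a minimal sketch, assuming the job was submitted with `--cpus-per-task` (Slurm sets `SLURM_CPUS_PER_TASK` only in that case). The helper name `allocated_threads` is hypothetical and not part of cfDNApipe.

```python
import os


def allocated_threads(default=1):
    """Return the CPU count Slurm granted to this task, or `default`.

    Hypothetical helper for illustration. SLURM_CPUS_PER_TASK is only
    present when the job requested --cpus-per-task, so any missing or
    malformed value falls back to `default`.
    """
    raw = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    return default


# Intended use (not executed here):
# pipeConfigure(threads=allocated_threads(), genome="hg38", ...)
```

Whether a thread-count mismatch has anything to do with the AdapterRemoval failure above is not established by the log; this only removes one variable when comparing runs.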