Very low CPU usage when running GATK GenotypeGVCFs

Open GooLey1025 opened this issue 5 months ago • 0 comments

Image

My question

I split the genotyping by chromosome (12 chromosomes) and run several jobs in parallel. As the picture shows, each job uses very little CPU (about 1%), so the whole run is very slow on my server.

My environment:

Slurm HPC server. Could the shared file system be causing this?
GATK version: gatk-package-4.6.1.0-local.jar
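To check the shared file system suspicion, one rough approach is to sample the state of a running GATK Java process and count how often it is in uninterruptible sleep (state `D`), which on Linux usually means it is waiting on I/O. This is only a sketch: the helper names `proc_state` and `count_io_wait` and their defaults are hypothetical, and it assumes a Linux compute node where `/proc/<pid>/status` is readable.

```shell
#!/bin/bash
# Print the one-letter state of a process as reported by the Linux kernel
# (R = running, S = sleeping, D = uninterruptible sleep, usually I/O wait).
# Assumes /proc/<pid>/status exists, i.e. a Linux node.
proc_state() {
    local pid=$1
    awk '/^State:/ {print $2}' "/proc/$pid/status"
}

# Sample a PID several times and print how many samples were in state D.
# Arguments: PID, number of samples (default 10), seconds between samples (default 1).
count_io_wait() {
    local pid=$1 samples=${2:-10} interval=${3:-1} hits=0 i
    for ((i = 0; i < samples; i++)); do
        if [ "$(proc_state "$pid")" = "D" ]; then
            hits=$((hits + 1))
        fi
        sleep "$interval"
    done
    echo "$hits"
}
```

For example, `count_io_wait <PID> 20 1` on the node running the job, with the PID of one of the `java` processes. If most samples come back as `D` while CPU usage stays near 1%, that would be consistent with the jobs waiting on storage rather than computing, though it does not prove it.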

My command:

sbatch_script="sbatch_${group_name}.sh"
    cat << EOL > $sbatch_script
#!/bin/bash
#SBATCH --ntasks=1          
#SBATCH --cpus-per-task=$threads        
#SBATCH --mem=$MEM               
#SBATCH --time=0             
#SBATCH --partition=CPU
#SBATCH --output=logs/$P.${group_name}.log
#SBATCH --error=logs/$P.${group_name}.log
threads=$threads
P=$P
group_file=$group_file

echo "Running parallel tasks for group: \$group_file"
parallel --tmpdir $gulei/TMP -j $N_tasks --halt 2 --delay 1 '
    chr=\$(echo {1} | awk -F"Chr" "{print \\\$2}" | awk -F":" "{print \\\$1}"); \
    gatk --java-options "-Xmx${JAVA_MEM}g -Djava.io.tmpdir=./tmp" \
    GenotypeGVCFs -R $ref -V gendb://genomeDB_$P/Chr\$chr \
    --max-genotype-count 2048 --genomicsdb-shared-posixfs-optimizations \
    -new-qual \
    -O $P/$P.Chr\$chr.raw.vcf.gz  1>$gatk_logs/$P.Chr\$chr.GenotypeGVCFs.log 2>&1
' ::: \$(cat \$group_file)
EOL
    sbatch $sbatch_script
done

GooLey1025, May 13 '25 01:05