Issue about "Not enough reads extracted. Skipping locus.."

Open: tanlaboratory opened this issue 3 years ago • 1 comment

Hello GangSTR team,

I am trying to run GangSTR on exome sequencing samples, but I get 'Not enough reads extracted' for every TR region. Could you help me figure out what is going wrong? How many reads are required for a given TR region? (I imagine the scenario is like Figure 1 in your NAR paper.) The commands I used are listed below. Thank you in advance!

  • Calculate coverage for the given cram file
mosdepth -n --fast-mode \
        --fasta /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa \
        --by /path/to/xgen_plus_spikein.b38.bed \
        /path/to/${sampleID}.coverage \
        /path/to/${sampleID}.cram
  • Calculate the average coverage
gunzip -c /path/to/${sampleID}.coverage.regions.bed.gz |\
        sort -n -k 4 | awk '{ sum += $4; n++ } END { if (n > 0) print sum / n; }' \
        > /path/to/${sampleID}.avgcov
  • Calculate the insertmean and insertsdev using samtools
samtools stats \
        --reference /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa \
        --target-regions /path/to/xgen_plus_spikein.b38.bed \
        /path/to/${sampleID}.cram  1 \
        > /path/to/${sampleID}_chr1.stats
  • Run GangSTR
GangSTR --bam /path/to/${sampleID}.cram \
        --ref /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa \
        --regions /path/to/hg38_ver13.bed \
        --out /path/to/${sampleID}.vcf \
        --nonuniform \
        --coverage  43 \
        --readlength 76 --insertmean 164.9 --insertsdev 78.7 --targeted  #paired-end reads 76*2
  • Output errors
[GangSTR-2.5.0] ProgressMeter: Loading read group id sampleID for sample sampleID
[GangSTR-2.5.0] ProgressMeter: Processing chr1:14070
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chr1:16620
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chr1:22812
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chr1:26454
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chr1:31556
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
...
...
...
[GangSTR-2.5.0] ProgressMeter: Processing chrY:56886704
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chrY:56886966
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..
[GangSTR-2.5.0] ProgressMeter: Processing chrY:56887112
[GangSTR-2.5.0] ProgressMeter:  Not enough reads extracted. Skipping locus..

The output VCF is also empty: it contains only header lines.
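
For reference, one way to obtain the `--insertmean` and `--insertsdev` values is to read them from the summary (`SN`) lines of the `samtools stats` report. This is a minimal sketch, assuming the report contains the usual tab-separated `insert size average:` and `insert size standard deviation:` lines. The function name `insert_size_summary` is hypothetical and is not part of samtools or GangSTR.

```shell
# Hypothetical helper (not part of samtools or GangSTR):
# print "mean<TAB>sd" taken from the SN summary lines of a `samtools stats` report.
# Assumes tab-separated lines of the form:
#   SN<TAB>insert size average:<TAB>VALUE
#   SN<TAB>insert size standard deviation:<TAB>VALUE
insert_size_summary() {
    awk -F '\t' '
        $1 == "SN" && $2 == "insert size average:"            { mean = $3 }
        $1 == "SN" && $2 == "insert size standard deviation:" { sd = $3 }
        END {
            # Fail if either field is missing from the report.
            if (mean == "" || sd == "") exit 1
            print mean "\t" sd
        }
    ' "$1"
}

# Example call (path as in the commands above):
#   insert_size_summary /path/to/${sampleID}_chr1.stats
```

The two numbers printed are the ones that would be passed to `--insertmean` and `--insertsdev`. The function returns a non-zero status when either line is absent, for example when the requested region matched no reads.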

tanlaboratory, Dec 08 '21 19:12

Hi, I ran into the same problem. Have you solved it?

G2GreenTea, Jul 02 '22 14:07