graphtyper genotyper_sv :<error> No regions specified

Open y1025i opened this issue 3 years ago • 3 comments

Hi,

I have 400 WGS samples and am using manta + graphtyper2 to call SVs, but I got this error: graphtyper genotyper_sv : No regions specified. Either use --region or --region_file option to specify regions.

Is --region or --region_file required? If so, how can I get a region file? I have no specific region in mind and just want to call SVs across the whole genome.

Best! Yi

y1025i avatar Aug 18 '20 14:08 y1025i

You can just make a region file manually that lists every chromosome. Or you can pull the primary contigs out of your reference FASTA's index file, e.g.:

head -n 25 "$FASTA.fai" | cut -f 1 > regions.txt
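
A slightly more defensive variant of the command above, as a sketch only: taking the first 25 lines assumes the index lists exactly the primary contigs first, which depends on how the reference is ordered. The function below filters the first column of the .fai by name instead. The function name `primary_contigs` is made up for illustration, and the pattern assumes contigs are named like chr1..chr22, chrX, chrY, chrM (with or without the chr prefix); adjust it if your reference uses other names.

```shell
# Print the primary contig names found in a FASTA index (.fai).
# Assumption: primary contigs are named 1..22, X, Y, M or MT,
# optionally prefixed with "chr". Decoys, alts and unplaced
# contigs (e.g. chrUn_*, *_random) do not match and are skipped.
primary_contigs() {
  awk -F '\t' '$1 ~ /^(chr)?([0-9]+|X|Y|M|MT)$/ { print $1 }' "$1"
}

# Usage (FASTA is a placeholder for your reference path):
#   primary_contigs "$FASTA.fai" > regions.txt
```

Because the filter works on names rather than line positions, it keeps the contigs in the order they appear in the index, whatever that order is.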

seboyden avatar Aug 20 '20 00:08 seboyden

Thanks a lot! I made a region file a day ago, like this:

chr1:0-249904550
chr2:0-243199373
chr3:0-198022430
chr4 ....

  1. But I got many warnings:

     [W::hts_idx_load3] The index file is older than the data file: /bak/ncrcgd/bam/WGS_case/R19002727LD01-XG8736_sorted_dedup_realign.bam.bai
     [W::hts_idx_load3] The index file is older than the data file: /bak/ncrcgd/bam/WGS_case/R18030171LD01-XG8458_sorted_dedup_realign.bam.bai

     Should I ignore these warnings?
  2. The running time is rather long. How can I reduce it if I have more than 4000 WGS samples?

Best! Yi

y1025i avatar Aug 20 '20 02:08 y1025i

You don't need the coordinates; you can just list the chromosome names if you want whole chromosomes. You can also parallelize by chromosome to make it run faster, i.e. run 25 jobs at the same time on a cluster, with --region=chr1 for job 1, --region=chr2 for job 2, etc. Even so, with 4000 samples you should expect long run times.
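
The per-chromosome split described above could be sketched as follows. This only prints one command line per region and does not run anything. The function name `print_jobs` and the file names `reference.fa`, `manta_merged.vcf.gz` and `bams.txt` are placeholders, not taken from this thread, and the argument layout (reference, SV VCF, `--sams=` list of BAM paths) is assumed from typical graphtyper usage, so check `graphtyper genotype_sv --help` for your version before submitting anything.

```shell
# Placeholders (assumptions): override these with your own paths.
REF=${REF:-reference.fa}              # reference FASTA
SV_VCF=${SV_VCF:-manta_merged.vcf.gz} # SV candidates to genotype
BAM_LIST=${BAM_LIST:-bams.txt}        # text file, one BAM path per line

# Print one genotyping command per region listed in the file "$1".
# Nothing is executed here; the output is meant to be reviewed first.
print_jobs() {
  while IFS= read -r region || [ -n "$region" ]; do
    [ -n "$region" ] || continue      # skip blank lines
    printf 'graphtyper genotype_sv %s %s --sams=%s --region=%s\n' \
      "$REF" "$SV_VCF" "$BAM_LIST" "$region"
  done < "$1"
}

# Usage:
#   print_jobs regions.txt > jobs.txt
```

Each line of jobs.txt can then be submitted as its own cluster job, or run locally a few at a time with something like `xargs -P 4 -I{} sh -c '{}' < jobs.txt`.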

seboyden avatar Aug 20 '20 16:08 seboyden