rfmix
Segmentation fault during scanning for optimal CRF weight
Hello:
RFMIX v2.03-r0 - Local Ancestry and Admixture Inference (c) 2016, 2017 Mark Koni Hamilton Wright Bustamante Lab - Stanford University School of Medicine Based on concepts developed in RFMIX v1 by Brian Keith Maples, et al.
This version is licensed for non-commercial academic research use only For commercial licensing, please contact [email protected]
--- For use in scientific publications please cite original publication --- Brian Maples, Simon Gravel, Eimear E. Kenny, and Carlos D. Bustamante (2013). RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am. J. Hum. Genet. 93, 278-288
Loading genetic map for chromosome Chr1 ... done
Mapping samples ... 29 samples combined
Scanning input VCFs for common SNPs on chromosome Chr1 ... 956161 SNPs
Loading haplotypes... done
Defining and initializing conditional random field...
setting up CRF points and random forest windows...
computing random forest window spacing overlay...
initializing apriori reference subpop across CRF...
setting up random forest probability estimation arrays... done
Defining and initializing conditional random field... done
9589734 (17.3%) variant alleles 0 (0.0%) missing alleles
Generating internal simulation samples...
Internally simulated 154 samples from 1 randomly selected reference parents.
Scanning for optimal CRF Weight....
/slurmState/slurmSpool/slurmd/job775448/slurm_script: line 17: 10145 Segmentation fault (core dumped) ./rfmix -f sp1.chr1.vcf -r sp2.chr1.vcf -m sp2.pop -g sp1.genetic.map -o outer --chromosome=Chr1
My command is: ./rfmix -f sp1.chr1.vcf -r sp2.chr1.vcf -m sp2.pop -g sp1.genetic.map -o outer --chromosome=Chr1
What could be causing this?
I switched to another dataset, and now the run gets two lines further before crashing:
--- For use in scientific publications please cite original publication --- Brian Maples, Simon Gravel, Eimear E. Kenny, and Carlos D. Bustamante (2013). RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am. J. Hum. Genet. 93, 278-288
Loading genetic map for chromosome Chr1 ... done
Mapping samples ... 26 samples combined
Scanning input VCFs for common SNPs on chromosome Chr1 ... 4258431 SNPs
Loading haplotypes... done
Defining and initializing conditional random field...
setting up CRF points and random forest windows...
computing random forest window spacing overlay...
initializing apriori reference subpop across CRF...
setting up random forest probability estimation arrays... done
Defining and initializing conditional random field... done
94462316 (42.7%) variant alleles 0 (0.0%) missing alleles
Generating internal simulation samples...
Internally simulated 185 samples from 2 randomly selected reference parents.
Growing Random Forest Trees -- (851687/851687) 100.0%
Scanning for optimal CRF Weight....
Conditional random field ... 211/ 211 (100.0%)
[1] 1897230 segmentation fault (core dumped) ./rfmix -f Mp.Chr1.vcf -r Ma.Chr1.vcf -m Ma.pop -g MpMa.all.genetic.map -o
I've got the exact same error. Chromosomes 1-8 worked fine, but 9 and 10 didn't. I haven't tried the rest yet, but it's odd that it doesn't seem to depend on the size of the chromosome. Additionally, I tried increasing the RAM to 4x what worked for chromosomes 1-8, and tried both increasing and decreasing the number of threads, but neither solved it. My run even seems to get a bit further than your output, as it prints a few ancestries, but then it immediately crashes without writing any output. Here's what I get:
rfmix -f 510k_hg38.vcf.gz -r RFmix/ALL.wgs.integrated_sv_map_v1_GRCh38.20130502.svs.genotypes.vcf.gz -g RFmix/chr10.modified -m RFmix/integrated_call_samples_v3.20130502.todos.panel -o maps510k/510k_hg38_chr10 --chromosome=10 --n-threads=4
RFMIX v2.03-r0 - Local Ancestry and Admixture Inference (c) 2016, 2017 Mark Koni Hamilton Wright Bustamante Lab - Stanford University School of Medicine Based on concepts developed in RFMIX v1 by Brian Keith Maples, et al.
This version is licensed for non-commercial academic research use only For commercial licensing, please contact [email protected]
--- For use in scientific publications please cite original publication --- Brian Maples, Simon Gravel, Eimear E. Kenny, and Carlos D. Bustamante (2013). RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am. J. Hum. Genet. 93, 278-288
Loading genetic map for chromosome 10 ... done
Mapping samples ... 3358 samples combined
Scanning input VCFs for common SNPs on chromosome 10 ... 52 SNPs
Loading haplotypes... done
Defining and initializing conditional random field...
setting up CRF points and random forest windows...
computing random forest window spacing overlay...
initializing apriori reference subpop across CRF...
setting up random forest probability estimation arrays... done
Defining and initializing conditional random field... done
10523 (3.0%) variant alleles 2 (0.0%) missing alleles
Generating internal simulation samples...
Internally simulated 1132 samples from 263 randomly selected reference parents.
Growing Random Forest Trees -- (11/11) 100.0%
Scanning for optimal CRF Weight....
Conditional random field ... 4490/ 4490 (100.0%)
Maximum scoring weight is 1 (-inf)
Simulation results...
ACB ASW BEB CDX CEU CHB CHS CLM ESN FIN GBR GIH GWD IBS ITU JPT KHV LWK MSL MXL PEL PJL PUR STU TSI YRI
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
0Segmentation fault
It is likely a memory issue. I ran into the same problem and was unable to get it to work no matter how much memory I allocated. I solved it by downsizing my genetic map (I initially had a genetic distance for every single locus, but rfmix will still run fine with a subset).
If that doesn't work, you can also use the example dataset here as a positive control.
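For anyone wanting to try the genetic-map downsizing suggested above, here is a minimal sketch of one way to thin a map. It assumes the map is whitespace-delimited with chromosome, physical position, and genetic position in cM in the first three columns, with no header row. The function name `thin_genetic_map` and the `min_cm` spacing are hypothetical choices for illustration, not anything rfmix prescribes.

```python
def thin_genetic_map(lines, min_cm=0.01):
    """Keep a row only if it lies at least min_cm past the last kept row
    on the same chromosome. The first row of each chromosome is always kept.

    Assumed column layout (an assumption, check your own file):
    chromosome, physical position (bp), genetic position (cM).
    """
    kept = []
    last_chrom = None
    last_cm = None
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        chrom, cm = fields[0], float(fields[2])
        if chrom != last_chrom or cm - last_cm >= min_cm:
            kept.append(stripped)
            last_chrom, last_cm = chrom, cm
    return kept


# Small made-up map, for illustration only.
example = [
    "Chr1\t100\t0.000",
    "Chr1\t200\t0.001",
    "Chr1\t300\t0.012",
    "Chr1\t400\t0.013",
    "Chr2\t100\t0.000",
]
thinned = thin_genetic_map(example, min_cm=0.01)
for row in thinned:
    print(row)
```

To apply it to a real file, read the lines from the original map, pass them through the function, and write the result to a new file that is then given to `-g`. The spacing threshold is something to experiment with; the thread does not say how sparse the map needs to be.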
I get this error too!
Loading genetic map for chromosome 21 ... done
Mapping samples ... 1274 samples combined
Scanning input VCFs for common SNPs on chromosome 21 ... 47 SNPs
Loading haplotypes... done
Defining and initializing conditional random field...
setting up CRF points and random forest windows...
computing random forest window spacing overlay...
initializing apriori reference subpop across CRF...
setting up random forest probability estimation arrays... done
Defining and initializing conditional random field... done
16639 (13.9%) variant alleles 0 (0.0%) missing alleles
Generating internal simulation samples...
Internally simulated 400 samples from 2 randomly selected reference parents.
Growing Random Forest Trees -- (10/10) 100.0%
Scanning for optimal CRF Weight....
Conditional random field ... 1674/ 1674 (100.0%)
Maximum scoring weight is 1 (-inf)
Simulation results...
Source1 Source2
0 1
Segmentation fault (core dumped)
All chromosomes ran fine, except 22...
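The logs posted in this thread all report how many SNPs the query and reference VCFs have in common, and that number varies a lot between runs (956161, 4258431, 52, 47). When comparing chromosomes that work against ones that crash, it may help to pull that count out of saved logs. The sketch below assumes the log line looks exactly like the ones pasted above; the function name `common_snp_counts` is hypothetical, and whether a low count has anything to do with the crash is not established here.

```python
import re

# Matches lines such as:
# "Scanning input VCFs for common SNPs on chromosome 10 ... 52 SNPs"
SNP_LINE = re.compile(r"common SNPs on chromosome (\S+) \.\.\. (\d+) SNPs")


def common_snp_counts(log_text):
    """Return a dict mapping chromosome name to the reported common SNP count."""
    return {chrom: int(n) for chrom, n in SNP_LINE.findall(log_text)}


# Two lines copied from the logs in this thread.
log = (
    "Scanning input VCFs for common SNPs on chromosome Chr1 ... 956161 SNPs\n"
    "Scanning input VCFs for common SNPs on chromosome 10 ... 52 SNPs\n"
)
counts = common_snp_counts(log)
for chrom, n in counts.items():
    print(chrom, n)
```

Running this over the concatenated logs of all chromosomes gives a quick side-by-side view of the counts for the runs that finished and the runs that segfaulted.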