Minimac4
Reference Panel download
Hi Minimac4 Team,
I am currently facing challenges while attempting to run "minimac4" to impute my genotype files on our High-Performance Cluster. I have encountered a couple of issues that I believe may require your expertise to resolve.
Reference Panel Download: I attempted to download the reference panel from the link provided on the Minimac4 wiki page. However, although I can connect to the FTP server, the download does not start. I would appreciate guidance on the correct procedure or any troubleshooting steps.
Target Study VCF File: I want to confirm if the "targetStudy.vcf" file mentioned in the documentation refers to the VCF file generated from Plink files using the following commands from :
plink --bfile YOURFILE --keep-allele-order --freq --out YOURFILE.output --allow-no-sex
plink --bfile YOURFILE --recode vcf --out YOURFILE.output_file --keep-allele-order
vcf-sort YOURFILE.output_file.vcf | bgzip -c > pre_impute_YOURFILE.vcf.gz
Is this the correct process for creating the target VCF file for Minimac4 imputation?
Imputation Code: For imputation, I am using the following code:
minimac4 --refHaps refPanel.m3vcf \
--haps targetStudy.vcf \
--prefix testRun \
--cpus 5
Is "refPanel.m3vcf" downloadable from the link above? And can I substitute "targetStudy.vcf" with "pre_impute_YOURFILE.vcf.gz" in this command?
Your assistance in resolving these issues would be immensely valuable, and I appreciate your time and support in advance.
Thank you!
That wiki is legacy documentation. Use the readme in this repo instead.
You can use the https protocol instead of ftp to download the reference panel: https://share.sph.umich.edu/minimac4/panels/.
You must index your target VCF as well (tabix -p vcf pre_impute_YOURFILE.vcf.gz).
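As a minimal sketch of the indexing step above, the snippet below checks whether an index already sits next to the bgzipped VCF before you run tabix. The helper name `needs_index` is made up for illustration, and it assumes the index uses the default `.tbi` (or `.csi`) suffix beside the VCF.

```shell
# needs_index FILE: succeed (exit 0) if FILE has no .tbi or .csi index beside it.
# Hypothetical helper; assumes the default index file naming.
needs_index() {
    vcf="$1"
    [ ! -f "${vcf}.tbi" ] && [ ! -f "${vcf}.csi" ]
}

target="pre_impute_YOURFILE.vcf.gz"
if needs_index "$target"; then
    # Only prints the suggested command; run it yourself once the VCF exists.
    echo "no index found for $target; run: tabix -p vcf $target"
fi
```

Run this on the compute node where the imputation job will execute, so the check sees the same filesystem as minimac4.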
The minimac4 command you reference will work with the correct reference panel, but it is deprecated. See the readme for the commands to use with the latest version. You will be using an *.msav reference panel instead of an *.m3vcf.gz.
Thanks for getting back! I was able to download the reference panels.
I am now following the command lines exactly as outlined in the readme file:
minimac4 1000g_phase3_v5.chr14.with_parameter_estimates.msav pre_impute_YOURFILE.vcf.gz > imputed_YOURFILE.sav
However, it seems to be disregarding the parameters, and I'm receiving the following warnings:
WARNING -
Problems encountered parsing command line:
Command line parameter 1000g_phase3_v5.chr14.with_parameter_estimates.msav (#1) ignored
Command line parameter pre_impute_upenn_ucla_mssm_impute_chr14.output_file.vcf.gz (#2) ignored
The same issue persists when using the command:
minimac4 1000g_phase3_v5.chr14.with_parameter_estimates.msav pre_impute_upenn_ucla_mssm_impute_chr14.output_file.vcf.gz -o imputed.vcf.gz
Am I required to include additional flags, or is there something else I might be overlooking?
I think you are using an old version of minimac4. See the latest at https://github.com/statgen/Minimac4/releases.
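A rough way to check which version a cluster module provides is sketched below. It rests on two assumptions that are not confirmed in this thread: that `minimac4 --version` prints a dotted version number, and that the positional-argument syntax from the readme starts with 4.1.0. The helper `version_ge` is hypothetical and relies on GNU `sort -V`.

```shell
# version_ge A B: succeed if dotted version A >= B (needs GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: `minimac4 --version` prints a line containing a dotted version.
if command -v minimac4 >/dev/null 2>&1; then
    installed=$(minimac4 --version 2>&1 | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
    # Assumption: 4.1.0 is the first release with the readme's command syntax.
    if version_ge "${installed:-0.0.0}" "4.1.0"; then
        echo "minimac4 $installed should accept the positional-argument syntax"
    else
        echo "minimac4 ${installed:-unknown} looks older than 4.1.0; consider upgrading"
    fi
else
    echo "minimac4 not found on PATH"
fi
```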
Thanks! I was waiting for our cluster to update the module. It works now!! I also have a last question: Does minimac4 provide QC results and information on excluded SNPs similar to that of the Imputation server?
No, the Imputation Server uses its own routines for the QC preprocessing step (which includes variant and chunk exclusion). The only metrics that Minimac4 will provide are in the INFO fields of the imputed results (R2, ER2, AVG_CS). You can get a sites-only version of the results with the --sites option, which produces a VCF with these INFO fields but no genotype data. This file is also generated automatically when using the --prefix option.
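Since the sites-only file is plain VCF, the R2 values can be pulled out with standard tools. Here is a small sketch, not part of Minimac4 itself: the function `low_r2_sites` is a hypothetical helper that reads a VCF on stdin and lists records whose INFO/R2 falls below a threshold you choose. The file name in the usage comment is only an example.

```shell
# low_r2_sites THRESHOLD: read an uncompressed VCF on stdin and print
# CHROM, POS, ID and R2 for records whose INFO/R2 is below THRESHOLD.
low_r2_sites() {
    awk -F'\t' -v thr="$1" '
        /^#/ { next }                      # skip meta and header lines
        {
            n = split($8, kv, ";")         # INFO is the 8th VCF column
            for (i = 1; i <= n; i++) {
                if (kv[i] ~ /^R2=/) {      # matches R2= but not ER2=
                    val = substr(kv[i], 4)
                    if (val + 0 < thr + 0)
                        printf "%s\t%s\t%s\t%s\n", $1, $2, $3, val
                }
            }
        }'
}

# Example usage (hypothetical file name):
#   gzip -dc imputed.sites.vcf.gz | low_r2_sites 0.3
```

The 0.3 cutoff in the usage comment is just a placeholder; pick whatever threshold suits your downstream analysis.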
Awesome! Thank you.
Hi Jonathon,
I've updated my files to the hg38 build recently. The link you provided before (https://share.sph.umich.edu/minimac4/panels/) has reference files for 1000g_phase3_v5, which is for hg19. Can you guide me to the reference panel for hg38, specifically the one for 1000 Genomes Phase 3 (Hg38)?
Thank you!
We do not yet host a b38 panel. You would have to generate one on your own using a phased 1000g call set with minimac4 --compress-reference.
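For a per-chromosome call set, the conversion could be scripted roughly as follows. This is a dry-run sketch: it only prints the commands so you can review them first. The file naming (`PREFIX.chrN.vcf.gz`) and the helper `print_compress_cmds` are assumptions for illustration, and the `--compress-reference input > output.msav` form should be checked against the readme of your installed version.

```shell
# print_compress_cmds PREFIX: print one compress command per autosome.
# Assumes phased inputs named PREFIX.chrN.vcf.gz (hypothetical naming).
print_compress_cmds() {
    prefix="$1"
    for chr in $(seq 1 22); do
        echo "minimac4 --compress-reference ${prefix}.chr${chr}.vcf.gz > ${prefix}.chr${chr}.msav"
    done
}

# Dry run: review the printed commands before executing any of them.
print_compress_cmds phased_1000g_b38
```

Once the printed commands look right, they can be piped to `sh` or pasted into a job script.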