Minimac4
Multithreading during M3VCF/MSAV Generation?
I am trying to generate a custom reference with Minimac3 (M3VCF) and Minimac4 (MSAV) and was wondering whether these operations are, or can be made, multithreaded.
Commands to Generate Reference Files
# Minimac3
Minimac3 --refHaps chr${chr}.vcf.gz --processReference --prefix m3vcfs/chr${chr} --myChromosome {chr_prefix} --rsid
# Minimac4
minimac4 --compress-reference reference.{sav,bcf,vcf.gz} > reference.msav
When I try using the --cpus flag, it doesn't seem like the CPUs I have available are being used when I'm checking on things with htop...
The multithreading option takes effect during imputation only.
(Sent by email in reply to the question posted by Michelle Franc Ragsac, Ph.D. on Fri, Jul 28, 2023 at 4:06 PM.)
-- Ketian Yu, M.S. PhD candidate | Department of Biostatistics University of Michigan, Ann Arbor MI She | Her | Hers
Thank you for your speedy reply!!
Is it expected that a single chromosome from an imputation panel would take multiple days to compress to the M3VCF or MSAV format? I'm trying to understand if there are issues on my end in running things or if this is expected behavior ...
Yes, it can take a long time for large reference panels. With Minimac4, you can speed up the compression by using multiple processes (instead of threads) and then concatenating the chunks:
bcftools view chr1.vcf.gz -Ou -r chr1:1-10000000 -i 'POS>=1' | minimac4 --compress-reference /dev/stdin > chr1_1_10000000.msav
bcftools view chr1.vcf.gz -Ou -r chr1:10000001-20000000 -i 'POS>=10000001' | minimac4 --compress-reference /dev/stdin > chr1_10000001_20000000.msav
...
sav concat $( ls chr1_*.msav | sort -V ) -o chr1.msav
I don't know for sure whether this approach is possible for minimac3.
bcftools: https://github.com/samtools/bcftools
sav: https://github.com/statgen/savvy/releases/download/v2.1.0/savvy-2.1.0-Linux-x86_64-cli.sh
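For anyone who wants to script the chunk-and-concatenate approach rather than typing each region by hand, here is a rough sketch. It is not from the Minimac4 docs. The function names (chunk_regions, compress_chunk, compress_parallel) are made up for illustration, the chromosome length has to be supplied by you, and it assumes bash plus an xargs that supports -P. The bcftools, minimac4 and sav invocations are the same ones shown in the reply above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "start end" for each fixed-size window
# covering positions 1..chrom_len.
chunk_regions() {
    local chrom_len=$1 chunk_size=$2 start=1 end
    while [ "$start" -le "$chrom_len" ]; do
        end=$(( start + chunk_size - 1 ))
        [ "$end" -gt "$chrom_len" ] && end=$chrom_len
        echo "$start $end"
        start=$(( end + 1 ))
    done
}

# Compress one window with the same bcftools | minimac4 pipeline as above.
compress_chunk() {
    local vcf=$1 chrom=$2 start=$3 end=$4
    bcftools view "$vcf" -Ou -r "${chrom}:${start}-${end}" -i "POS>=${start}" \
        | minimac4 --compress-reference /dev/stdin > "${chrom}_${start}_${end}.msav"
}

# Run up to `jobs` windows at once as separate processes, then concatenate
# the per-window files in position order.
compress_parallel() {
    local vcf=$1 chrom=$2 chrom_len=$3 chunk_size=$4 jobs=$5
    export -f compress_chunk   # bash-only: makes the function visible to child shells
    chunk_regions "$chrom_len" "$chunk_size" \
        | xargs -P "$jobs" -n 2 bash -c 'compress_chunk "$@"' _ "$vcf" "$chrom" \
        || return 1            # stop before concatenating if any window failed
    sav concat $( ls "${chrom}"_*.msav | sort -V ) -o "${chrom}.msav"
}
```

Usage would look something like `compress_parallel chr1.vcf.gz chr1 248956422 10000000 8`, where 248956422 is the chr1 length in GRCh38 (substitute the length for your build) and 8 is the number of concurrent processes. With a window size of 10000000 this produces the same regions and file names as the manual commands above.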