hap.py
std::bad_alloc on small-size reference and VCFs
hap.py crashes after 10 hours of computation with a std::bad_alloc, presumably because it tries to allocate too much RAM: peak memory usage at termination is 117 GB on a system with 128 GB, although the manual says that the tool never exceeds 64 GB (https://github.com/Illumina/hap.py#hardware). The reference is human chr1 (about 250 MB), and the two input VCFs are just 14 MB and 8 MB. No input format errors are reported, nor any informative message that might hint at the source of the problem. This is the traceback (including /usr/bin/time's output):
[W] overlapping records at 1:1869284 for sample 0
[W] Variants that overlap on the reference allele: 73
[I] Total VCF records: 192326
[I] Non-reference VCF records: 192326
[W] overlapping records at 1:1165320 for sample 0
[W] Variants that overlap on the reference allele: 283
[I] Total VCF records: 341672
[I] Non-reference VCF records: 341672
Hap.py v0.3.12
2020-01-31 02:27:40,810 WARNING terminate called after throwing an instance of 'std::bad_alloc'
2020-01-31 02:27:40,814 WARNING what(): std::bad_alloc
2020-01-31 02:27:40,814 WARNING Aborted (core dumped)
2020-01-31 02:27:40,814 ERROR Exception when running <function xcmpWrapper at 0x7fcb42329a28>:
2020-01-31 02:27:40,814 ERROR ------------------------------------------------------------
2020-01-31 02:27:40,814 ERROR Traceback (most recent call last):
2020-01-31 02:27:40,814 ERROR File "/biodata/Nicola/workspace/hap.py/build/lib/python27/Tools/parallel.py", line 71, in parMapper
2020-01-31 02:27:40,823 ERROR return arg[1]['fun'](arg[0], *arg[1]['args'], **arg[1]['kwargs'])
2020-01-31 02:27:40,823 ERROR File "/biodata/Nicola/workspace/hap.py/build/lib/python27/Haplo/xcmp.py", line 69, in xcmpWrapper
2020-01-31 02:27:40,825 ERROR subprocess.check_call(to_run, shell=True, stdout=tfo, stderr=tfe)
2020-01-31 02:27:40,825 ERROR File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
2020-01-31 02:27:40,826 ERROR raise CalledProcessError(retcode, cmd)
2020-01-31 02:27:40,827 ERROR CalledProcessError: Command 'xcmp /tmp/truth.ppnMtGZk.vcf.gz /tmp/query.ppSFz72W.vcf.gz -l 1:115118971-142561441 -o /tmp/result.1:115118971-142561441iBrR7q.bcf -r ../REF/chr1.fasta -f 0 -n 16768 --expand-hapblocks 30 --window 50 --no-hapcmp 0 --qq QUAL' returned non-zero exit status 134
2020-01-31 02:27:40,827 ERROR ------------------------------------------------------------
2020-01-31 02:27:40,834 ERROR One of the xcmp jobs failed.
2020-01-31 02:27:40,836 ERROR Traceback (most recent call last):
2020-01-31 02:27:40,837 ERROR File "/biodata/Nicola/workspace/hap.py/build/bin/hap.py", line 529, in