hic
get_valid_interaction fails with exitcode 137 with --digestion arima
There isn't much context provided with the following error:
$ nextflow run main.nf \
-profile docker \
--genome GRCh38 \
--digestion arima \
--input '/home/ec2-user/*{1,2}.fastq.gz'
...
Error executing process > 'get_valid_interaction (sample)'
Caused by:
Process `get_valid_interaction (sample)` terminated with an error exit status (137)
Command executed:
mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs
Command exit status:
137
Command output:
(empty)
Command error:
.command.sh: line 2: 27 Killed mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
...
$ cat .exitcode
137
$ cat .command.err
/home/ec2-user/hic/work/19/3e1cb12b33b4f95171a7c8141e2566/.command.sh: line 2: 27 Killed mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
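For what it's worth, an exit status above 128 conventionally means the shell saw the process die from a signal (128 + signal number), so 137 would correspond to signal 9 (SIGKILL), which is often what an out-of-memory kill looks like. A minimal sketch of that decoding, assuming a POSIX system; `describe_exit_status` is a hypothetical helper name, not part of the pipeline:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper).

    By shell convention, values above 128 mean 128 + signal number.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
print(describe_exit_status(1))
```

This only tells us the task was killed by a signal; it does not by itself say who sent it (kernel OOM killer, container runtime, or something else).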
Might you happen to know of any test datasets that use the ARIMA protocol, so that I may create a reproducible case?
Additional context
The previous error occurred with 100k and 1 million reads. With 10 million reads I get the same exit code (137), and then on retry:
Error executing process > 'get_valid_interaction (sample)'
Caused by:
Process `get_valid_interaction (sample)` terminated with an error exit status (1)
Command executed:
mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs
Command exit status:
1
Command output:
(empty)
Command error:
Traceback (most recent call last):
File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 566, in <module>
resFrag = timing(load_restriction_fragment, fragmentFile, minFragSize, maxFragSize, verbose)
File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 68, in timing
result = function(*args)
File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 206, in load_restriction_fragment
for line in bed_handle:
OSError: [Errno 12] Cannot allocate memory
.command.run: fork: Cannot allocate memory
.command.sh: line 2: 27 Killed mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
.command.run: line 155: kill: (25) - No such process
Work dir:
/home/ec2-user/hic/work/4c/decc189d16592544d8c91d6615afa7
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
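Since both the Python script and the `fork` in `.command.run` report `Cannot allocate memory`, it may be worth checking how much memory is actually available where the task runs. A rough sketch, assuming a Linux host or container exposing `/proc/meminfo`; `parse_meminfo` is an illustrative name:

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


# Only read the real file when it exists (Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        mem = parse_meminfo(handle.read())
    print("MemTotal (kB):", mem.get("MemTotal"))
    print("MemAvailable (kB):", mem.get("MemAvailable"))
```

Running this inside the task's container as well as on the host would show whether the two see the same amount of memory.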
Hi,
Thanks for the report.
The get_valid_interaction process is configured with 4 GB of RAM, and in practice only the restriction fragments file is loaded into memory.
According to the error, it seems that it cannot allocate memory. Do you have enough RAM on your machine? Otherwise, what is the size of the restriction fragments file? Thanks
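To answer the size question, something like the following could be run in the task work directory. This is only a sketch: it assumes `restriction_fragments.bed` is a plain whitespace-delimited BED file, and `summarize_fragments` is a hypothetical helper, not a function from `mapped_2hic_fragments.py`:

```python
import os


def summarize_fragments(bed_path):
    """Return (file size in bytes, number of fragments, number of chromosomes)."""
    n_fragments = 0
    chroms = set()
    with open(bed_path) as handle:
        for line in handle:
            # Skip blank lines and common BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            chroms.add(fields[0])
            n_fragments += 1
    return os.path.getsize(bed_path), n_fragments, len(chroms)


# Path taken from the failing command; adjust as needed.
bed = "restriction_fragments.bed"
if os.path.exists(bed):
    size, n_fragments, n_chroms = summarize_fragments(bed)
    print(f"{size / 1e6:.1f} MB, {n_fragments} fragments on {n_chroms} chromosomes")
```

Comparing the fragment count for the arima digestion against a single-enzyme digestion of the same genome would show how much larger the in-memory structure is likely to be.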
> According to the error, it seems that it cannot allocate memory! Do you have enough RAM on your machine?
This was running in Nextflow local mode on an EC2 instance with 32G RAM.
> Otherwise, what is the size of the reference fragment file?
Didn't catch the size of this file; these runs were using the iGenomes GRCh38 reference.
I'll try running on AWS Batch next and see if increasing the process requirements for get_valid_interaction helps.
I could not get it running in Tower with the default (4G RAM) or with 16G RAM
withName:get_valid_interaction {
    memory = 16.GB
}
nor with hg19 instead of GRCh38 as the reference.
Both -profile test,docker and -profile test_full,docker run fine for me both in local mode and on Tower, so I'm thinking there is either an issue with arima digestion or with the input reads I'm trying to use.
Might you happen to know of any test datasets that use the ARIMA protocol?
None of the tests use the ARIMA protocol, but the only difference would be the digestion and the resolution/number of restriction fragments. I can try to run more tests on my side too. Do you have any public Hi-C dataset using ARIMA kits in mind?