get_valid_interaction fails with exitcode 137 with --digestion arima

Open heuermh opened this issue 3 years ago • 6 comments

There isn't much context provided with the following error:

$ nextflow run main.nf \
  -profile docker \
  --genome GRCh38 \
  --digestion arima \
  --input '/home/ec2-user/*{1,2}.fastq.gz'

...

Error executing process > 'get_valid_interaction (sample)'

Caused by:
  Process `get_valid_interaction (sample)` terminated with an error exit status (137)

Command executed:

  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs

Command exit status:
  137

Command output:
  (empty)

Command error:
  .command.sh: line 2:
    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all

...

$ cat .exitcode
137

$ cat .command.err
/home/ec2-user/hic/work/19/3e1cb12b33b4f95171a7c8141e2566/.command.sh: line 2:
    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
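
For context, a shell exit status above 128 conventionally means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL). On Linux that is often, though not always, the kernel's out-of-memory killer. A minimal sketch of the decoding, where `describe_exit_status` is a hypothetical helper name and not part of the pipeline:

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (0-255).

    Hypothetical helper for illustration; relies only on the POSIX shell
    convention that 128 + N means "terminated by signal N".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited on its own with code {status}"


print(describe_exit_status(137))
```

This only identifies the signal. It does not show who sent it, so the kernel log on the host would still need to be checked to confirm an out-of-memory kill.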

Might you happen to know of any test datasets that use the ARIMA protocol, so that I may create a reproducible case?

heuermh, Nov 10 '21 22:11

The previous error occurred with 100k and 1 million reads. With 10 million reads I get the same exit code and then, on retry:

Error executing process > 'get_valid_interaction (sample)'

Caused by:
  Process `get_valid_interaction (sample)` terminated with an error exit status (1)

Command executed:

  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 566, in <module>
      resFrag = timing(load_restriction_fragment, fragmentFile, minFragSize, maxFragSize, verbose)
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 68, in timing
      result = function(*args)
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 206, in load_restriction_fragment
      for line in bed_handle:
  OSError: [Errno 12] Cannot allocate memory
  .command.run: fork: Cannot allocate memory
  .command.sh: line 2:    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  .command.run: line 155: kill: (25) - No such process

Work dir:
  /home/ec2-user/hic/work/4c/decc189d16592544d8c91d6615afa7

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

heuermh, Nov 11 '21 00:11

Hi, thanks for the report. The get_valid_interaction process is configured with 4 GB of RAM, and in practice only the restriction fragments file is loaded into memory.

According to the error, it seems that it cannot allocate memory! Do you have enough RAM on your machine? Otherwise, what is the size of the reference fragment file? Thanks
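
One way to answer the size question is to look at the fragment file in the process work directory. The sketch below reports the on-disk size and the number of fragment records while streaming the file line by line. `summarize_bed` is a hypothetical helper, and the header prefixes it skips are an assumption about what a BED file may contain:

```python
import os


def summarize_bed(path):
    """Return (size_in_bytes, n_fragments) for a BED file.

    Streams the file so the check itself uses very little memory.
    Blank lines and common BED header lines are not counted.
    """
    n_fragments = 0
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            if line.startswith(("#", "track", "browser")):
                continue
            n_fragments += 1
    return os.path.getsize(path), n_fragments


# Example (the path is taken from the command in the error report):
# size, n = summarize_bed("restriction_fragments.bed")
# print(f"{size / 1e6:.1f} MB on disk, {n} fragments")
```

The in-memory footprint after loading will be larger than the on-disk size, since each fragment becomes one or more Python objects, so the fragment count is the more useful number to compare between digestions.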

nservant, Nov 11 '21 19:11

> According to the error, it seems that it cannot allocate memory! Do you have enough RAM on your machine?

This was running in Nextflow local mode on an EC2 instance with 32G RAM.

> Otherwise, what is the size of the reference fragment file?

Didn't catch the size of this file; these runs were using the iGenomes GRCh38 reference.

I'll try running on AWS Batch next and see if increasing the process requirements for get_valid_interaction helps.

heuermh, Nov 22 '21 14:11

I could not get it running in Tower with the default (4G RAM) or with 16G RAM

   withName:get_valid_interaction {
      memory = 16.GB
   }

nor with hg19 instead of GRCh38 as the reference.

heuermh, Nov 22 '21 23:11

Both -profile test,docker and -profile test_full,docker run fine for me both in local mode and on Tower, so I'm thinking there is either an issue with arima digestion or with the input reads I'm trying to use.

Might you happen to know of any test datasets that use the ARIMA protocol?

heuermh, Nov 23 '21 04:11

None of the tests use the ARIMA protocol, but the only difference would be the digestion and the resulting resolution/number of restriction fragments. I can try to run more tests on my side too. Would you have any public Hi-C dataset using ARIMA kits in mind?
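
To illustrate why the digestion changes the number of fragments, here is a minimal sketch that counts in-silico fragments for one versus two restriction sites on a toy sequence. It assumes the arima preset corresponds to the two sites `^GATC` and `G^ANTC` (an assumption here; check the pipeline documentation), and the helper names and the toy sequence are made up for illustration. This is not the pipeline's own digestion code.

```python
import re


def site_to_regex(site):
    """Turn a site such as 'G^ANTC' into (compiled regex, cut offset).

    '^' marks the cut position; 'N' matches any base. Other IUPAC
    ambiguity codes are not handled in this sketch.
    """
    offset = site.index("^")
    motif = site.replace("^", "")
    pattern = "".join("[ACGT]" if base == "N" else base for base in motif)
    # A lookahead lets overlapping occurrences all be reported.
    return re.compile("(?=" + pattern + ")"), offset


def count_fragments(sequence, sites):
    """Count fragments produced by cutting a linear sequence at all sites."""
    cuts = set()
    for site in sites:
        regex, offset = site_to_regex(site)
        for match in regex.finditer(sequence):
            cuts.add(match.start() + offset)
    # Cuts at the very ends of the sequence do not create a new fragment.
    internal = [p for p in cuts if 0 < p < len(sequence)]
    return len(internal) + 1


toy = "AAGATCTTGAATCAAGACTCTT"  # made-up sequence
print(count_fragments(toy, ["^GATC"]))            # single-enzyme digestion
print(count_fragments(toy, ["^GATC", "G^ANTC"]))  # assumed arima sites
```

On a real genome the second site adds many more cut positions, so the fragment file loaded by get_valid_interaction would be correspondingly larger than with a single-enzyme digestion.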

nservant, Nov 23 '21 12:11