
seed registration fails at -k 51 but not -k 49

Open sebhtml opened this issue 11 years ago • 7 comments

Original message:

-------- Original Message --------
Subject: Re: [Denovoassembler-users] largest kmer-value?
Date: Sun, 17 Nov 2013 09:14:54 -0500
From: Hornung, Bastian [email protected]
To: [email protected] [email protected]

Hi Sebastien,

the error message is just a segmentation fault, output below:

Rank 0 registered 0/14207
Rank 0 registered 1000/14207
Rank 0 registered 2000/14207
Rank 0 registered 3000/14207
Rank 0 registered 4000/14207
Rank 0 registered 5000/14207
Rank 0 registered 6000/14207
Rank 0 registered 7000/14207
Rank 0 registered 8000/14207
Rank 0 registered 9000/14207
Rank 0 registered 10000/14207
Rank 0 registered 11000/14207
Rank 0 registered 12000/14207
Rank 0 registered 13000/14207
Rank 0 registered 14000/14207
[ssb3:10762] *** Process received signal ***
[ssb3:10762] Signal: Segmentation fault (11)
[ssb3:10762] Signal code: Address not mapped (1)
[ssb3:10762] Failing at address: (nil)
Rank 0 registered 14206/14207
Rank 0 registered its seeds
[ssb3:10762] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7fbd8d667cb0]
[ssb3:10762] [ 1] Ray(_ZN21SeedFilteringWorkflow14finalizeMethodEv+0x221) [0x56ef11]
[ssb3:10762] [ 2] Ray(_ZN11TaskCreator8mainLoopEv+0xbe) [0x5cc3fe]
[ssb3:10762] [ 3] Ray(_ZN11ComputeCore15runWithProfilerEv+0x382) [0x5cfcd2]
[ssb3:10762] [ 4] Ray(_ZN11ComputeCore3runEv+0xbc) [0x5d3b8c]
[ssb3:10762] [ 5] Ray(_ZN7Machine5startEv+0x19a6) [0x479416]
[ssb3:10762] [ 6] Ray(_ZN11RankProcessI7MachineE3runEv+0x9f) [0x47699f]
[ssb3:10762] [ 7] Ray(main+0xc7) [0x4724d7]
[ssb3:10762] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fbd8d2b976d]
[ssb3:10762] [ 9] Ray() [0x473f01]
[ssb3:10762] *** End of error message ***
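The function names in the backtrace above are Itanium-ABI mangled C++ symbols, which makes the crashing frame hard to read. As a minimal sketch, assuming a GCC or Clang toolchain that provides `<cxxabi.h>`, they can be turned back into readable names with `abi::__cxa_demangle`. The helper name `demangle` is hypothetical and not part of Ray.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper (not part of Ray): returns the demangled form of an
// Itanium-ABI symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Called on the symbol from frame [1], `demangle("_ZN21SeedFilteringWorkflow14finalizeMethodEv")` gives `SeedFilteringWorkflow::finalizeMethod()`, and frame [6] becomes `RankProcess<Machine>::run()`. Frame [1] is the first frame below the signal handler, and the log reports `Failing at address: (nil)`, which is consistent with a null pointer access inside `SeedFilteringWorkflow::finalizeMethod()`.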

(Only Rank 0 appears because it was running on a single core.)

I first thought it could be a memory or general power problem (due to other problems with our hardware, which have been resolved in the meantime), but it now fails on my local machine as well as on our server, which has 256 GB of RAM. I guess that should be more than enough for a 5 Mb prokaryotic genome. The cutoff seems to be at 49, which still works; it begins to crash at 51, and I have no clue why. If anyone has any idea, I'd be very happy to hear it.

Best regards,

Bastian

sebhtml avatar Nov 21 '13 14:11 sebhtml

http://article.gmane.org/gmane.science.biology.ray-genome-assembler/689

sebhtml avatar Nov 21 '13 15:11 sebhtml

see https://www.mail-archive.com/[email protected]/msg00716.html

sebhtml avatar Jan 06 '14 19:01 sebhtml

see http://permalink.gmane.org/gmane.science.biology.ray-genome-assembler/701

sebhtml avatar Jan 06 '14 19:01 sebhtml

gmane thread:

http://thread.gmane.org/gmane.science.biology.ray-genome-assembler/685/focus=689

sebhtml avatar Jan 06 '14 22:01 sebhtml

I have sent a message to the end user because I could not reproduce the issue.

sebhtml avatar Jan 06 '14 22:01 sebhtml

The problem seems to depend on the number of cores too:

http://thread.gmane.org/gmane.science.biology.ray-genome-assembler/685/focus=689

sebhtml avatar Jan 09 '14 18:01 sebhtml

This seems to be related to checkpointing. Vertex.cpp 176 is called from MessageProcessor.cpp (TAG_START_SEEDING ...)

sebhtml avatar Jan 09 '14 21:01 sebhtml