mem_absa

Crash during training

Open jurukode opened this issue 7 years ago • 5 comments

Hi @ganeshjawahar,

I've found that the script always gets killed automatically after seven iterations. I'm not sure what's happening. Do you have any idea why?

error

jurukode avatar May 10 '17 09:05 jurukode

In my experience, this is a resource exhaustion error, and that's why the process is killed. There seems to be a bug in this code that makes it use a lot more GPU RAM than the original MemN2N, but I'm not sure where the bug is.

joeybose avatar May 10 '17 15:05 joeybose

Hi @0220joey,

Yeah. By the way, I'm only using the CPU right now, and the training process eats so much RAM that my laptop hangs. Thanks for the info!

jurukode avatar May 12 '17 02:05 jurukode

I've managed to fix the issue. It's in all the tf.assign calls, which add more nodes to the graph. So comment out lines 156-170 and 211-225 in model.py. Note, though, that the accuracy is about 10% lower than in the paper.
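To illustrate why repeated assign ops can exhaust memory, here is a minimal sketch. It assumes the tf.assign calls are reached on every training step, which has not been checked against model.py here. The `ToyGraph` class and both `train_*` functions are hypothetical stand-ins written in plain Python, not TensorFlow's API; they only mimic the fact that a static graph keeps every op it has ever created.

```python
class ToyGraph:
    """Toy stand-in for a static computation graph (NOT TensorFlow)."""

    def __init__(self):
        self.nodes = []    # every op ever created stays in the graph
        self.values = {}   # variable name -> current value

    def assign_op(self, name):
        """Add a new assign node and return a callable that runs it."""
        self.nodes.append(("assign", name))

        def run(value):
            self.values[name] = value

        return run


def train_leaky(graph, steps):
    # Anti-pattern: a new assign node is built on every iteration,
    # so the graph (and its memory use) grows with the step count.
    for step in range(steps):
        graph.assign_op("embedding")(step)
    return len(graph.nodes)


def train_fixed(graph, steps):
    # Build the assign node once, then only run it inside the loop.
    assign_embedding = graph.assign_op("embedding")
    for step in range(steps):
        assign_embedding(step)
    return len(graph.nodes)


leaky_nodes = train_leaky(ToyGraph(), 1000)
fixed_nodes = train_fixed(ToyGraph(), 1000)
print(leaky_nodes, fixed_nodes)  # 1000 1
```

If this is what is happening, an alternative to commenting the lines out might be to create each assign op once before the training loop and only run it inside the loop. In TensorFlow 1.x, calling `finalize()` on the graph before training makes any later attempt to add an op raise an error, which can help locate the offending lines.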

joeybose avatar May 12 '17 02:05 joeybose

Thanks @0220joey,

It works well and runs faster. I wonder whether the commented-out code is vital to the accuracy or not.

jurukode avatar May 12 '17 02:05 jurukode

I don't think so. The paper uses a few more tricks that this code doesn't, but I can't be sure that's what accounts for the accuracy difference.

joeybose avatar May 12 '17 15:05 joeybose