Many NaN errors with the new Tensor-based version

Open ASDen opened this issue 9 years ago • 2 comments

For many models (especially deep ones with many parameters, e.g. bidi2), I keep getting the following error:

clstm.cc:664: void ocropus::GenericNPLSTM<F, G, H>::backward() [with int F = 1; int G = 2; int H = 2]: Assertion `!anynan(out)' failed.

whereas the old (Mat-based) version works just fine.

ASDen, Feb 13 '16 12:02

Can you please confirm the problem? Or is it just me misusing CLSTM...

ASDen, Feb 19 '16 06:02

Hi, I believe it's just a wrong assert. The assert runs after the input assignment, so the ".d" (derivative) parts are still uninitialized. (With larger networks you just increase the probability that the uninitialized memory happens to contain a NaN.)

A proper fix could be:

    bool anynan(Batch &a) {
      if(anynan(a.v())) return true;
      if(anynan(a.d())) return true; //this is failing
      return false;
    }

    bool anynan_v(Batch &a) {
      if(anynan(a.v())) return true;
      return false;
    }

Then replace anynan with anynan_v during the forward step.

MichalBusta, Feb 24 '16 08:02
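
For anyone who wants to try the idea in isolation, here is a minimal sketch of the value-only check suggested above. It is not CLSTM code: the Batch struct below is a hypothetical stand-in that uses std::vector<float> instead of the real tensor type, and check_forward is an illustrative helper name, not a function from the library. It only shows why a check that ignores the derivatives passes when ".d" still holds garbage.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for CLSTM's Batch: values (v) plus derivatives (d).
// The real class stores tensors; plain vectors keep the sketch self-contained.
struct Batch {
  std::vector<float> v_, d_;
  std::vector<float> &v() { return v_; }
  std::vector<float> &d() { return d_; }
};

// True if any element of the buffer is NaN.
bool anynan(const std::vector<float> &a) {
  for (float x : a)
    if (std::isnan(x)) return true;
  return false;
}

// Full check: values and derivatives. Only meaningful once d has been written.
bool anynan(Batch &a) { return anynan(a.v()) || anynan(a.d()); }

// Value-only check: safe during the forward step, when d may be uninitialized.
bool anynan_v(Batch &a) { return anynan(a.v()); }

// Illustrative forward-step assertion (assumed name): looks at values only.
void check_forward(Batch &a) { assert(!anynan_v(a)); }
```

With this split, the forward pass asserts only on the values, and the full anynan check is left for places where the derivatives have already been filled in.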