
NaN Values during training

Open AnnSeidel opened this issue 6 years ago • 2 comments

I am currently trying to factorize a matrix with the MF_Solver using the KL loss function, and I get NaN values during training after either the first or the second iteration. From small test cases, I suspect the problem is that large gradients produce negative values, which are then clipped to 0 in the sg_update function. This can leave all-zero rows/columns in my P or Q matrix, a case that is not handled: the prepare_for_sg_update function then computes z=0, producing NaN values that propagate through the calculations until the whole model is filled with NaN.

Can the algorithm (when calculating 1/z, one might consider 1/(z+epsilon) with epsilon>0) or my parameters (especially the learning rate) be adjusted to handle such cases?
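To illustrate what I mean, here is a minimal NumPy sketch of the failure mode (the update rule and shapes are my assumptions, not LIBMF's actual code): a projected SG update for the KL loss, where z = p.dot(q) and the gradient carries the factor (1 - r/z), which blows up as soon as z reaches 0.

```python
import numpy as np

# Hypothetical projected SG update for the KL loss (a sketch, not
# LIBMF's sg_update). With z = p.dot(q), the gradient carries the
# factor (1 - r/z), which blows up as soon as z reaches 0.
def sg_update(p, q, r, lr, eps=0.0):
    z = p.dot(q) + eps  # eps > 0 is the proposed 1/(z+epsilon) guard
    g = 1.0 - r / z     # shared KL-gradient factor
    p_new = np.maximum(p - lr * g * q, 0.0)  # projection can zero out p
    q_new = np.maximum(q - lr * g * p, 0.0)  # ...and likewise q
    return p_new, q_new

rng = np.random.default_rng(0)
p0, q0 = rng.random(4), rng.random(4)

# An oversized learning rate makes the factors overshoot, the
# nonnegativity projection clips them to all zeros, and the next
# update divides by z == 0, so NaNs flood the model.
p_bad, q_bad = p0.copy(), q0.copy()
with np.errstate(divide="ignore", invalid="ignore"):
    for _ in range(3):
        p_bad, q_bad = sg_update(p_bad, q_bad, r=5.0, lr=100.0)
print(np.isnan(p_bad).any())  # True

# The same schedule with a small epsilon stays finite.
p_ok, q_ok = p0.copy(), q0.copy()
for _ in range(3):
    p_ok, q_ok = sg_update(p_ok, q_ok, r=5.0, lr=100.0, eps=1e-8)
print(np.isfinite(p_ok).all() and np.isfinite(q_ok).all())  # True
```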

Do you have some additional idea what might be causing NaN values during training?

AnnSeidel avatar Jan 07 '19 21:01 AnnSeidel

I also met this problem. As you say, it happens because the learning rate is too big; using a smaller learning rate solves it.

jia-zhuang avatar Feb 08 '19 03:02 jia-zhuang

The program will also fail if you give any zero-valued entries in the training set.
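A plausible reason (my assumption, not verified against LIBMF's source) is that evaluating the generalized KL term r*log(r/z) literally turns a zero-valued rating into NaN, since 0 * log(0) is 0 * (-inf) in IEEE floating-point arithmetic:

```python
import numpy as np

# Hypothetical illustration: a zero-valued entry makes the literal
# KL term r*log(r/z) evaluate to 0 * (-inf), which is NaN.
r, z = 0.0, 1.5
with np.errstate(divide="ignore", invalid="ignore"):
    term = r * np.log(r / z) - r + z  # generalized KL divergence term
print(np.isnan(term))  # True

# Guarding the zero case restores the convention 0 * log(0) = 0.
safe = (r * np.log(r / z) if r > 0 else 0.0) - r + z
print(safe)  # 1.5
```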

zclandry avatar Oct 23 '19 09:10 zclandry