
NaN Values during training

Open AnnSeidel opened this issue 6 years ago • 2 comments

I am currently trying to factorize a matrix with the MF_Solver using the KL loss function, and I get NaN values during training after either the first or the second iteration. From small test cases, I suspect the problem is that large gradients produce negative values, which are then clipped to 0 in the sg_update function. This can leave all-zero rows/columns in my P or Q matrix, a case that is not handled: the prepare_for_sg_update function then computes z=0, producing NaN values that propagate through the calculations until the whole model is filled with NaN.

Can the algorithm (when calculating 1/z, one might consider 1/(z+epsilon) with epsilon>0) or my parameters (especially the learning rate) be adjusted to handle such cases?
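To illustrate what I mean, here is a minimal NumPy sketch of the failure mode (the update rule and shapes are my assumptions, not LIBMF's actual code): a projected SG update for the KL loss, where z = p.dot(q) and the gradient carries the factor (1 - r/z), which blows up as soon as z reaches 0.

```python
import numpy as np

# Hypothetical projected SG update for the KL loss (a sketch, not
# LIBMF's sg_update). With z = p.dot(q), the gradient carries the
# factor (1 - r/z), which blows up as soon as z reaches 0.
def sg_update(p, q, r, lr, eps=0.0):
    z = p.dot(q) + eps  # eps > 0 is the proposed 1/(z+epsilon) guard
    g = 1.0 - r / z     # shared KL-gradient factor
    p_new = np.maximum(p - lr * g * q, 0.0)  # projection can zero out p
    q_new = np.maximum(q - lr * g * p, 0.0)  # ...and likewise q
    return p_new, q_new

rng = np.random.default_rng(0)
p0, q0 = rng.random(4), rng.random(4)

# An oversized learning rate makes the factors overshoot, the
# nonnegativity projection clips them to all zeros, and the next
# update divides by z == 0, so NaNs flood the model.
p_bad, q_bad = p0.copy(), q0.copy()
with np.errstate(divide="ignore", invalid="ignore"):
    for _ in range(3):
        p_bad, q_bad = sg_update(p_bad, q_bad, r=5.0, lr=100.0)
print(np.isnan(p_bad).any())  # True

# The same schedule with a small epsilon stays finite.
p_ok, q_ok = p0.copy(), q0.copy()
for _ in range(3):
    p_ok, q_ok = sg_update(p_ok, q_ok, r=5.0, lr=100.0, eps=1e-8)
print(np.isfinite(p_ok).all() and np.isfinite(q_ok).all())  # True
```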

Do you have some additional idea what might be causing NaN values during training?

AnnSeidel avatar Jan 07 '19 21:01 AnnSeidel

I also met this problem. As you say, it happens because the learning rate is too big; using a smaller learning rate solves it.

jia-zhuang avatar Feb 08 '19 03:02 jia-zhuang

The program will also fail if you give any zero-valued entries in the training set.
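A plausible reason (my assumption, not verified against LIBMF's source) is that evaluating the generalized KL term r*log(r/z) literally turns a zero-valued rating into NaN, since 0 * log(0) is 0 * (-inf) in IEEE floating-point arithmetic:

```python
import numpy as np

# Hypothetical illustration: a zero-valued entry makes the literal
# KL term r*log(r/z) evaluate to 0 * (-inf), which is NaN.
r, z = 0.0, 1.5
with np.errstate(divide="ignore", invalid="ignore"):
    term = r * np.log(r / z) - r + z  # generalized KL divergence term
print(np.isnan(term))  # True

# Guarding the zero case restores the convention 0 * log(0) = 0.
safe = (r * np.log(r / z) if r > 0 else 0.0) - r + z
print(safe)  # 1.5
```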

zclandry avatar Oct 23 '19 09:10 zclandry