
Theoretical and practical issues with regularization: not convex, scaling required


See https://github.com/zdebruine/RcppML/issues/29. As Kim and Park note in their 2004 manuscript (cited in the NNLM BMC Bioinformatics manuscript), the factors in the model must be normalized (columns of w and rows of h) so that the penalties act equally on all factors. Without scaling, factors that contribute more to the model are preserved while the weights in the remaining factors are driven nearly or entirely to zero. This destabilizes the fit.
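For concreteness, here is a minimal sketch of that rescaling (the function name and interface are mine, not NNLM's API; it preserves the product w %*% h while equalizing the scale of the factors):

```r
# Diagonal rescaling so penalties act equally on all factors:
# w is m x k (factors in columns), h is k x n (factors in rows).
scale_factors <- function(w, h) {
  d <- colSums(w)            # or sqrt(colSums(w^2)) for unit L2 norms
  w <- sweep(w, 2, d, "/")   # columns of w now sum to 1
  h <- sweep(h, 1, d, "*")   # the magnitude moves into the rows of h
  list(w = w, d = d, h = h)  # w %*% h is unchanged; d holds the scales
}
```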

With L2 regularization, NNLM discovers different solutions from different random restarts. If scaling is introduced, Kim and Park note that the factors are coerced towards the dense solution equivalent to the second singular vector. I have observed the same behavior: https://www.zachdebruine.com/post/l2-regularized-nmf/.
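A quick way to see the restart instability (a hedged sketch, assuming NNLM's documented nnmf() interface where alpha = c(L2, angle, L1) penalizes W):

```r
library(NNLM)

set.seed(1)
A <- matrix(runif(100 * 50), 100, 50)

# two fits that differ only in the random initialization
set.seed(2); fit1 <- nnmf(A, k = 5, alpha = c(1, 0, 0))
set.seed(3); fit2 <- nnmf(A, k = 5, alpha = c(1, 0, 0))

# cosine similarities between factors from the two runs; a stable,
# well-behaved penalty should give a near-permutation matrix here
w1 <- apply(fit1$W, 2, function(x) x / sqrt(sum(x^2)))
w2 <- apply(fit2$W, 2, function(x) x / sqrt(sum(x^2)))
round(crossprod(w1, w2), 2)
```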

The Pattern Extraction (PE) regularization is also not convex, and can cause model instability as fitting progresses. Factor scaling does not stabilize this regularization either, because stability requires that the Gram matrix a in each NNLS subproblem have diagonal values larger than the sums of the corresponding off-diagonal values, and scaling does not guarantee this. A further problem is that an overly strong PE penalty introduces negative values into a, which is not permissible in NNLS. The alternatives are to subtract from the diagonal of a (L2) or to shrink a towards zero (an approach not yet described in the literature). Neither is particularly useful in NMF because of the alternating updates: effective regularization of a single least squares problem is very different from effective regularization of NMF. There is cited literature for these penalties, but that literature also fails to demonstrate convexity across iterations of NMF.
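To make those two conditions concrete, here is a hedged diagnostic sketch in my own notation (check_gram, the block-structured w, and the sign of the PE-style adjustment are illustrative assumptions, not NNLM internals):

```r
# Check the two conditions discussed above for a k x k Gram matrix:
# diagonal dominance and non-negativity of all entries.
check_gram <- function(a) {
  off_sums <- rowSums(abs(a)) - abs(diag(a))
  list(diagonally_dominant = all(diag(a) > off_sums),
       nonnegative = all(a >= 0))
}

# block-structured w with orthogonal factors, so the unpenalized Gram
# matrix is diagonal and passes both checks
set.seed(4)
w <- matrix(0, 100, 5)
for (j in 1:5) w[((j - 1) * 20 + 1):(j * 20), j] <- runif(20)
a <- crossprod(w)
check_gram(a)  # both TRUE

# an overly strong PE-style adjustment to the off-diagonal values
# (assumed sign, for illustration) fails both checks
a_pe <- a - 0.5 * max(a) * (1 - diag(5))
check_gram(a_pe)  # both FALSE
```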

Would also be nice to get the package back on CRAN.

zdebruine · Mar 14 '22 14:03