RcppML
Custom Initializations
Thanks for the great package!
Do you think you could add a feature for RcppML::nmf where I could initialize the values of w, d, and h myself?
Thanks,
Eric.
Feature request noted, on the list for when I get back into maintaining the package.
In the meantime, you could always create an NMF model class with your own initialization, and then use the predict function to alternately update w and h until stopping criteria are satisfied.
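For anyone reading along, here is a rough sketch of that alternating scheme in plain Python, not RcppML code. Everything in it is an assumption made for illustration: the helper names (`nnls_cd`, `solve_h`, `nmf_from_w`), the relative-loss stopping rule, and the tolerance. It also leaves out the diagonal `d` that RcppML uses. It starts from a user-supplied `w`, solves for `h`, and then alternates the two updates.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def nnls_cd(gram, rhs, sweeps=200):
    """Coordinate descent for min ||w x - a||^2 subject to x >= 0,
    given gram = w^T w and rhs = w^T a. Starts from x = 0."""
    k = len(rhs)
    x = [0.0] * k
    for _ in range(sweeps):
        for i in range(k):
            if gram[i][i] > 0.0:
                grad = sum(gram[i][j] * x[j] for j in range(k)) - rhs[i]
                x[i] = max(0.0, x[i] - grad / gram[i][i])
    return x

def solve_h(a, w):
    """Solve for h (k x n) given a (m x n) and w (m x k), one column at a time."""
    wt = transpose(w)
    gram = matmul(wt, w)
    rhs = matmul(wt, a)
    return transpose([nnls_cd(gram, list(col)) for col in zip(*rhs)])

def loss(a, w, h):
    """Squared Frobenius norm of a - w h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, wh) for x, y in zip(ra, rb))

def nmf_from_w(a, w0, max_iter=100, tol=1e-6):
    """Alternate h and w updates starting from a custom (non-negative) w0."""
    w = [list(row) for row in w0]
    h = solve_h(a, w)
    prev = loss(a, w, h)
    for _ in range(max_iter):
        # Updating w given h is the same problem on the transposed matrices.
        new_w = transpose(solve_h(transpose(a), transpose(h)))
        new_h = solve_h(a, new_w)
        cur = loss(a, new_w, new_h)
        improved = prev - cur
        if improved > 0:
            w, h, prev = new_w, new_h, cur
        if improved <= tol * max(prev, 1e-12):
            break  # assumed stopping rule: negligible relative improvement
    return w, h

a = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0]]
w0 = [[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]]  # custom initialization
w, h = nmf_from_w(a, w0)
print(loss(a, w, h))
```

Swapping in a different `w0` is then just a matter of changing that one argument, which makes it easy to compare the representations you end up with.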
Unless you are providing a pre-trained model of sorts as an initialization, it is rarely advisable to use a non-random initialization. If you're trying to improve on the base random initialization, I'd be curious to hear what you're doing, but I have not found any distribution or normalization algorithm that consistently improves the loss of solutions over what is currently in place.
-Zach
Thanks Zach!
That feature would be great.
My interest here has less to do with achieving the smallest loss and more to do with understanding and manipulating how different initializations change the final representation. Because the NMF model is non-identifiable in general, I've found that different initializations can lead to very different representations, all with similar losses.
I'd be happy to contribute to this change. It seems like initializing w myself is easy. Where does the initialization of h happen?
@eweine h does not need initialization. When w is randomly initialized, we solve for h given A and w. Methods that initialize both w and h but use alternating least squares updates only use h as a starting point for the non-negative least squares coordinate descent solver, and in practice almost identical solutions are achieved from a zero-filled h and from a randomly filled h.
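A small illustration of why the starting point matters so little, written as a plain Python sketch rather than RcppML's actual solver. The function name `nnls_cd`, the sweep count, and the toy numbers are all assumptions. For a fixed w, each column of h is a convex non-negative least squares problem, so when w has full column rank, coordinate descent heads to the same minimizer from a zero start or a random one.

```python
import random

def nnls_cd(gram, rhs, x0, sweeps=100):
    """Coordinate descent for min ||w x - a||^2 subject to x >= 0,
    given gram = w^T w and rhs = w^T a, starting from x0."""
    k = len(rhs)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(k):
            if gram[i][i] > 0.0:
                grad = sum(gram[i][j] * x[j] for j in range(k)) - rhs[i]
                x[i] = max(0.0, x[i] - grad / gram[i][i])
    return x

# Toy problem: w = [[1, 0], [0, 1], [1, 1]] and one column a = [1, 2, 0].
gram = [[2.0, 1.0], [1.0, 2.0]]  # w^T w
rhs = [1.0, 2.0]                 # w^T a

rng = random.Random(0)
from_zero = nnls_cd(gram, rhs, [0.0, 0.0])
from_random = nnls_cd(gram, rhs, [rng.random(), rng.random()])

# Largest entrywise gap between the two solutions.
gap = max(abs(p - q) for p, q in zip(from_zero, from_random))
print(from_zero, from_random, gap)
```

The start can still change how many sweeps are needed, and with a rank-deficient w the minimizer need not be unique, so treat this as intuition rather than a guarantee.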
If you wish to initialize h instead of w, just transpose the input matrix so that h takes the role of w and w takes the role of h, then transpose the final model back.
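The transpose trick rests on the fact that A ≈ w h is equivalent to Aᵀ ≈ hᵀ wᵀ. A minimal Python sketch of the wrapper, with matrices as lists of rows: `nmf_from_w` is a hypothetical stand-in for any solver that takes `(A, w0)` and returns `(w, h)`; it is not an RcppML function.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf_from_h(a, h0, nmf_from_w):
    """Initialize h instead of w by factorizing the transposed matrix.

    nmf_from_w is any callable (A, w0) -> (w, h); it is a hypothetical
    placeholder for whichever w-initialized solver is available.
    """
    # On the transposed problem, h0^T plays the role of the initial w.
    w_t, h_t = nmf_from_w(transpose(a), transpose(h0))
    # A^T ~= w_t h_t  implies  A ~= h_t^T w_t^T, so swap and transpose back.
    return transpose(h_t), transpose(w_t)
```

Passing the solver in as an argument keeps the wrapper independent of how the factorization itself is carried out.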
Hope that simplifies things!
Best, Zach