Make lrq O(k n) instead of O(k n^2)
Short description
In order to compute a gradient, the current implementation of lrq / factorization machines uses a triply nested loop: twice over the n non-zero features and once over the k latent dimensions, for O(k n^2) work per example. But the computation can be reordered so that it is O(k n) instead of O(k n^2). See 57:7 in https://www.csie.ntu.edu.tw/~b97053/paper/Factorization%20Machines%20with%20libFM.pdf.
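For reference, the reordering is the standard factorization-machine identity (a sketch of the algebra, with v_{i,f} denoting the f-th latent component of feature i):

```latex
\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j
  = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f}\, x_i \right)^{2} - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2} \right]
```

Each inner sum runs only over the non-zero features, so the right-hand side costs O(k n), while the left-hand side evaluated naively costs O(k n^2).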
How this suggestion will help you/others
This implies a considerable performance gain for all but the smallest n.
Possible solution/implementation details
When I said "it's easy to reorder the computation", I meant easy in theory. It's possible that VW's internals are organized in a way that is hard to reconcile with the alternative computation.
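The reordered forward pass itself is only a handful of lines. Here is a minimal, self-contained sketch written from the identity above (not derived from the GPL'd libFM code); the names and data layout are hypothetical and not VW's actual API:

```cpp
// A minimal sketch of the O(k n) pairwise-interaction computation.
// This is independent of VW's internals: the feature layout, weight
// storage, and function names below are illustrative assumptions.
#include <cstddef>
#include <vector>

// For each latent dimension f:
//   sum_{i<j} v[i][f] * v[j][f] * x[i] * x[j]
//     = 0.5 * ((sum_i v[i][f] * x[i])^2 - sum_i (v[i][f] * x[i])^2)
// which is O(n) per dimension, O(k n) in total.
float fm_pairwise(const std::vector<float>& x,               // n non-zero feature values
                  const std::vector<std::vector<float>>& v)  // n x k latent factors
{
    const std::size_t n = x.size();
    const std::size_t k = n > 0 ? v[0].size() : 0;
    float total = 0.f;
    for (std::size_t f = 0; f < k; ++f)
    {
        float sum = 0.f;
        float sum_sq = 0.f;
        for (std::size_t i = 0; i < n; ++i)
        {
            const float t = v[i][f] * x[i];
            sum += t;
            sum_sq += t * t;
        }
        total += 0.5f * (sum * sum - sum_sq);
        // The per-dimension partial sum also gives the gradient in O(n):
        // d(pairwise)/d(v[i][f]) = x[i] * (sum - v[i][f] * x[i]).
    }
    return total;
}
```

Note that the same per-dimension partial sums make the gradient O(k n) as well, as sketched in the comment above.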
Example/links if any
https://github.com/srendle/libfm/blob/30b9c799c41d043f31565cbf827bf41d0dc3e2ab/src/libfm/src/fm_learn_sgd_element_adapt_reg.h#L201
FYI, libFM is under GPL 3.0. We cannot use their code in VW.