
Make lrq O(k n) instead of O(k n^2)

Open memeplex opened this issue 4 years ago • 1 comments

Short description

In order to compute a gradient, the current implementation of lrq / factorization machines runs a triply nested loop: twice over the n non-zero features and once over the k latent dimensions. But the computation can be reordered so that it is O(k n) instead of O(k n^2). See 57:7 in https://www.csie.ntu.edu.tw/~b97053/paper/Factorization%20Machines%20with%20libFM.pdf.
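To illustrate the reordering (this is a standalone sketch of Rendle's identity, not VW's actual code): the pairwise term sum_{i<j} <v_i, v_j> x_i x_j can be rewritten per latent dimension f as 1/2 * [(sum_i v_{i,f} x_i)^2 - sum_i (v_{i,f} x_i)^2], turning the nested loop over feature pairs into a single pass per dimension:

```python
def pairwise_naive(x, v):
    """O(k n^2): loop over all feature pairs (i, j), i < j."""
    n, k = len(x), len(v[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(v[i][f] * v[j][f] for f in range(k))
            total += dot * x[i] * x[j]
    return total

def pairwise_fast(x, v):
    """O(k n): for each latent dimension f, one pass over the features."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))          # sum of terms
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))  # sum of squares
        total += 0.5 * (s * s - s_sq)  # removes i == j and double counting
    return total
```

Both functions compute the same quantity, so the gradient can be derived from the fast form as well (the per-dimension sums s can be cached and reused for each feature's partial derivative).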

How this suggestion will help you/others

This would yield a considerable performance gain for all but the smallest n.

Possible solution/implementation details

When I said "it's easy to reorder the computation" I meant easy in theory. It's possible that VW's internals are organized in a way that makes the alternative computation hard to fit in.

Example/links if any

https://github.com/srendle/libfm/blob/30b9c799c41d043f31565cbf827bf41d0dc3e2ab/src/libfm/src/fm_learn_sgd_element_adapt_reg.h#L201

memeplex avatar Aug 13 '21 16:08 memeplex

FYI, libFM is under GPL 3.0. We cannot use their code in VW.

lokitoth avatar Aug 20 '21 12:08 lokitoth