
How to do incremental training?

Open tfzxyinhao opened this issue 7 years ago • 8 comments

We generate new data and new users every day. How can we do incremental training?

tfzxyinhao avatar May 26 '17 14:05 tfzxyinhao

You have to maintain your own item/user matrix and manage new/expired users or items. Then run matrix factorization periodically.
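A minimal sketch of that periodic-retraining approach, assuming a recent implicit release (>= 0.5) whose fit() takes a users x items CSR matrix (older releases expected the transposed items x users orientation). The helper name and hyperparameters here are illustrative, not anything from the library:

import implicit
import scipy.sparse as sparse

def rebuild_and_factorize(user_ids, item_ids, confidences, n_users, n_items):
    # rebuild the full user-item confidence matrix from the interaction log,
    # including any users/items that appeared since the last run
    user_items = sparse.csr_matrix(
        (confidences, (user_ids, item_ids)), shape=(n_users, n_items)
    )
    model = implicit.als.AlternatingLeastSquares(factors=64, regularization=0.01)
    model.fit(user_items)
    return model, user_items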

leodesigner avatar May 26 '17 15:05 leodesigner

@leodesigner thanks for your answer. Running matrix factorization from scratch again each time wastes time. Do you mean the previous factorization results can't be reused?

tfzxyinhao avatar May 27 '17 02:05 tfzxyinhao

You can actually reuse the last results as an initialization - this is better than starting from a random initialization. In this case only new items/users should be initialized randomly, or as an average of similar items/users. In my case my input matrix is about 10000x10000 and one iteration of the MF algorithm takes about 0.0125s on my server (reusing the last results).
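A small numpy sketch of that warm start, under the assumption described above: rows for users/items already seen keep their previous factors, and only rows for newly appeared users/items get a fresh random initialization. The function name is illustrative, not part of implicit's API:

import numpy as np

def grow_factors(old_factors, n_new_rows, scale=0.01, rng=None):
    # append randomly initialized rows for users/items first seen since the
    # previous factorization; existing rows keep their learned values
    rng = np.random.default_rng() if rng is None else rng
    new_rows = rng.random((n_new_rows, old_factors.shape[1])) * scale
    return np.vstack([old_factors, new_rows])

If your solver lets you set the factor matrices before fitting, these grown matrices can serve as its starting point; expired users/items can likewise be dropped by removing their rows (and the corresponding rows/columns of the confidence matrix).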

leodesigner avatar May 27 '17 07:05 leodesigner

@leodesigner are you using implicit to do real-time recommendations? Or can it do that?

tfzxyinhao avatar May 27 '17 09:05 tfzxyinhao

I am using implicit to do the calculations once every 10-60 minutes. However, the results are used in a real-time web app (client-side item sorting in the browser based on cosine distance).
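For illustration, the same cosine ranking can be expressed in a few lines of numpy (the setup above does it client-side in the browser); item_factors and the reference item id here are assumed inputs:

import numpy as np

def most_similar_items(item_factors, item_id, k=10):
    # rank items by cosine similarity to one reference item's factor vector
    unit = item_factors / (np.linalg.norm(item_factors, axis=1, keepdims=True) + 1e-9)
    scores = unit @ unit[item_id]
    ranked = np.argsort(-scores)
    return [i for i in ranked if i != item_id][:k]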

leodesigner avatar May 27 '17 10:05 leodesigner

Like @leodesigner was saying - this isn't supported in this library right now, but you can build this on top of implicit with some effort.

Adding support for incremental training would be a good feature for this library.

benfred avatar May 28 '17 19:05 benfred

If you update the user_items matrix, you can now recalculate a user factor and get updated recommendations by running model.recommend(userid, user_items, recalculate_user=True).

You can also get recommendations for new users by passing a column vector as user_items and userid=0.

It's still not incremental training because item factors are not recalculated, but maybe it's helpful.
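A usage sketch of that path, assuming a recent implicit release (>= 0.5) where fit() takes a users x items CSR matrix and recommend() takes just that user's row; the 2017-era signature described above passed the full matrix instead, so adapt to your installed version. The toy data is illustrative:

import implicit
import scipy.sparse as sparse

# toy users x items confidence matrix
user_items = sparse.random(100, 50, density=0.05, format="csr", random_state=0)

model = implicit.als.AlternatingLeastSquares(factors=16)
model.fit(user_items)

# existing user whose interactions just changed: recompute their factor on the fly
ids, scores = model.recommend(5, user_items[5], recalculate_user=True)

# brand-new user: pass their interactions as a 1 x n_items sparse row with userid=0
new_user = sparse.csr_matrix(([1.0, 1.0], ([0, 0], [3, 7])), shape=(1, 50))
new_ids, new_scores = model.recommend(0, new_user, recalculate_user=True)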

jbochi avatar Jun 23 '17 11:06 jbochi

Setting recalculate_user=True works, but seems to be quite slow, since it does a complete matrix inversion. In my tests, over 99% of the time was spent on this operation.

One could speed this up with the conjugate gradient method, detailed here: https://www.benfrederickson.com/fast-implicit-matrix-factorization/.

The relevant part of that post gives something like this:

import numpy as np

def nonzeros(m, row):
    # iterate over the non-zero column indices and values of one CSR row
    for index in range(m.indptr[row], m.indptr[row + 1]):
        yield m.indices[index], m.data[index]

def factor_user_cg(Cui, u, Y, regularization, cg_steps=3):
    # recompute the factor vector for a single user u, given fixed item
    # factors Y and the users x items confidence matrix Cui (CSR format)
    factors = Y.shape[1]
    # we could cache this across users
    YtY = Y.T.dot(Y) + regularization * np.eye(factors)

    # random start
    x = np.random.rand(factors) * 0.01

    # calculate residual r = YtCuPu - YtCuY.dot(Xu), without computing YtCuY
    r = -YtY.dot(x)
    for i, confidence in nonzeros(Cui, u):
        r += (confidence - (confidence - 1) * Y[i].dot(x)) * Y[i]

    p = r.copy()
    rsold = r.dot(r)

    for _ in range(cg_steps):
        # calculate Ap = YtCuYp - without actually calculating YtCuY
        Ap = YtY.dot(p)
        for i, confidence in nonzeros(Cui, u):
            Ap += (confidence - 1) * Y[i].dot(p) * Y[i]

        # standard CG update
        alpha = rsold / p.dot(Ap)
        x += alpha * p
        r -= alpha * Ap
        rsnew = r.dot(r)
        p = r + (rsnew / rsold) * p
        rsold = rsnew

    return x
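As the comment in the code notes, YtY could be precomputed once and reused across users. A usage sketch under assumed names (Cui being the users x items confidence matrix and Y the item factors from the previous fit, e.g. model.item_factors):

x = factor_user_cg(Cui, userid, Y, regularization=0.01)  # updated user factor
scores = Y.dot(x)                                        # score every item for this user
top_items = np.argsort(-scores)[:10]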

marcusklaas avatar Mar 29 '18 14:03 marcusklaas