spotlight

userid and itemid start from 1

Open KylinA1 opened this issue 5 years ago • 3 comments

Hello,

I just noticed that you start counting ids from 1, which leads to one extra, unused dimension for both users and items. For example, MovieLens 1M has 6040 users and 3706 items, but your final processed dataset, including the SciPy matrix, has shape 6041 x 3707.

This might be a tiny problem.

KylinA1 avatar Jun 12 '19 19:06 KylinA1

For sequence models, item id 0 is reserved as padding. That's why.

snemistry avatar Sep 06 '19 05:09 snemistry
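
To illustrate the padding point, here is a minimal sketch in plain Python, not Spotlight's own code. The helper name pad_sequences and the choice to pad on the left are assumptions made for the example. It only shows why id 0 cannot also refer to a real item.

```python
def pad_sequences(sequences, pad_id=0):
    """Hypothetical helper: left-pad each sequence with pad_id to a common length."""
    max_len = max(len(s) for s in sequences)
    return [[pad_id] * (max_len - len(s)) + list(s) for s in sequences]


# Item ids start from 1, so every 0 in the output is unambiguously padding.
sequences = [[3, 1, 2], [5], [4, 2]]
padded = pad_sequences(sequences)
```

If a real item also had id 0, a model reading these rows could not tell a padded position from an interaction with that item.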

Thanks for your kind reply, that makes sense.

KylinA1 avatar Sep 06 '19 14:09 KylinA1

It seems a bit odd that user ids are implicitly assumed to run from 1 to N, with N = num users, since 0 is reserved for padding, yet an error is raised if num_users = user_ids.max(). The same goes for item ids. Or have I missed something?

See spotlight/interactions.py at line 129:

    if self.user_ids.max() >= self.num_users:
        raise ValueError('Maximum user id greater '
                         'than declared number of users.')

BBiering avatar Jul 20 '20 07:07 BBiering
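
The arithmetic in this thread can be checked with a small sketch in plain Python. The function check_ids is a hypothetical stand-in that mirrors the condition quoted from spotlight/interactions.py; it is not the library's code. It assumes ids run from 1 to N with 0 kept for padding.

```python
def check_ids(ids, num_declared):
    """Hypothetical stand-in for the quoted check: ids must be valid 0-based indices."""
    if max(ids) >= num_declared:
        raise ValueError('Maximum id greater than declared number.')


# MovieLens 1M has 6040 users; assume their ids are 1..6040.
user_ids = list(range(1, 6041))

# Declaring num_users = 6040 is rejected: id 6040 needs index 6040,
# which only exists in an array of length 6041.
try:
    check_ids(user_ids, 6040)
    declared_n_ok = True
except ValueError:
    declared_n_ok = False

# Declaring num_users = max id + 1 passes; index 0 is the unused padding slot.
num_users = max(user_ids) + 1
check_ids(user_ids, num_users)
```

Under these assumptions num_users acts as the size of the index range 0..N rather than as a count of distinct users, which would be consistent with the 6041 x 3707 shape reported at the top of the issue.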