
[FEATURE] better baseline implementation

Open Chen-Cai-OSU opened this issue 3 years ago • 3 comments

Description

The current baseline notebook, https://github.com/microsoft/recommenders/blob/main/examples/02_model_collaborative_filtering/baseline_deep_dive.ipynb, is not very memory efficient: cell 10 performs a cross join of all users with all items. With a large number of users and items, this approach does not scale. Would you be willing to provide a more efficient implementation? I think it's quite an important feature for new users. Thank you!

Chen-Cai-OSU, Aug 06 '21 15:08

I am looking for

model = topk(**kw)/random(**kw)
model.recommend_k_items
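
To make the request concrete, here is a minimal sketch of what such an interface could look like. `PopularityBaseline` is a hypothetical class, not part of the recommenders library; the column names `userID` and `itemID`, the `prediction` column, and the `top_k` and `remove_seen` parameters are assumptions chosen for illustration. It walks a globally ranked item list per user instead of building a user-item cross join, so memory grows with the number of recommendations rather than with users times items.

```python
import pandas as pd


class PopularityBaseline:
    """Hypothetical top-k popularity baseline (not a recommenders API)."""

    def __init__(self, col_user="userID", col_item="itemID"):
        self.col_user = col_user
        self.col_item = col_item

    def fit(self, train):
        # Interaction count per item, sorted from most to least popular.
        self.item_counts_ = train[self.col_item].value_counts()
        # Items each user has already interacted with.
        self.seen_ = train.groupby(self.col_user)[self.col_item].apply(set).to_dict()
        return self

    def recommend_k_items(self, users, top_k=10, remove_seen=True):
        rows = []
        for user in users:
            seen = self.seen_.get(user, set()) if remove_seen else set()
            picked = 0
            # Walk the global ranking and skip items the user has seen.
            for item, score in self.item_counts_.items():
                if picked >= top_k:
                    break
                if item in seen:
                    continue
                rows.append((user, item, score))
                picked += 1
        return pd.DataFrame(rows, columns=[self.col_user, self.col_item, "prediction"])
```

Usage would then be `PopularityBaseline().fit(train).recommend_k_items(test_users, top_k=10)`. The Python loop is slow for very large user sets, but it never materializes the full user-item grid.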

Chen-Cai-OSU, Aug 06 '21 15:08

If the full set of users is too large, you can segment them into groups before doing the cross join.
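
A minimal sketch of that idea, assuming a pandas DataFrame with `userID` and `itemID` columns and pandas 1.2 or later (for `how="cross"`). The function name `top_k_popular_chunked` and the `chunk_size` parameter are hypothetical and not part of the recommenders library. Only one group of users is cross joined with the items at a time, so peak memory is bounded by `chunk_size` times the number of items.

```python
import pandas as pd


def top_k_popular_chunked(train, k=10, chunk_size=1000,
                          col_user="userID", col_item="itemID"):
    """Top-k most popular unseen items per user, one user chunk at a time."""
    # Popularity score: number of training interactions per item.
    item_counts = train.groupby(col_item).size().rename("prediction").reset_index()
    users = train[col_user].unique()
    results = []
    for start in range(0, len(users), chunk_size):
        chunk_users = pd.DataFrame({col_user: users[start:start + chunk_size]})
        # Cross join only this chunk of users with all items.
        candidates = chunk_users.merge(item_counts, how="cross")
        # Remove pairs already seen in training (anti-join via indicator).
        seen = train.loc[
            train[col_user].isin(chunk_users[col_user]), [col_user, col_item]
        ].drop_duplicates()
        candidates = candidates.merge(
            seen, on=[col_user, col_item], how="left", indicator=True
        )
        candidates = candidates[candidates["_merge"] == "left_only"]
        candidates = candidates.drop(columns="_merge")
        # Keep the k highest-scoring items for each user in the chunk.
        top = (
            candidates.sort_values([col_user, "prediction"], ascending=[True, False])
            .groupby(col_user)
            .head(k)
        )
        results.append(top)
    return pd.concat(results, ignore_index=True)
```

A smaller `chunk_size` lowers peak memory at the cost of more loop iterations; the result is the same as a single full cross join followed by the same filtering, up to the ordering of tied items.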

gramhagen, Aug 06 '21 15:08

I think that's too much work for a new user who just wants to establish a baseline. It is in sharp contrast to the other models, which are much more sophisticated yet fairly easy to use.

I have tried several models on my data and am quite happy with this library (thanks for the hard work!). The only thing I find unexpected is the lack of a good baseline implementation.

Chen-Cai-OSU, Aug 07 '21 06:08