
Extensions for sampling "known"/"unknown" recommendations in test set

Open gregreich opened this issue 3 years ago • 3 comments

The feature to supply a vector of top-N list lengths is very nice. Not so much for efficiency, but certainly for experimental consistency, it would be great to have a similar feature for the "given-x" (or "all-but-x") parameter x, aka given. One could easily guarantee (i) that the test users are the same for all x, and (ii) that all items considered known for x_i are also considered known when using x_{i+1} (with x_{i+1} > x_i), by drawing all indices at once for the highest x and then subsetting them. In particular, I presume this would give a faster-converging estimate of the difference between the parametrizations.
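To make the nesting idea concrete, here is a minimal base-R sketch (the item indices and x levels are made up for illustration): indices are drawn once for the largest x, and the smaller levels simply take prefixes of that draw, so each "known" set is contained in the next larger one.

```r
## Sketch: draw once for the largest x, subset for the smaller levels.
items_rated <- seq_len(50)          # indices of one user's rated items
x_levels    <- c(5, 10, 20)

set.seed(1)
drawn <- sample(items_rated, max(x_levels))   # one draw for the largest x
known <- lapply(x_levels, function(x) drawn[seq_len(x)])

## By construction, known[[1]] is a subset of known[[2]],
## which is a subset of known[[3]].
stopifnot(all(known[[1]] %in% known[[2]]),
          all(known[[2]] %in% known[[3]]))
```

Because the smaller sets are prefixes of the same draw, the items withheld as "unknown" at the largest x stay withheld at every smaller x as well.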

Plotting could then be extended to feature the x dimension as well: in the present setup, colours together with marker types identify the method, but line types are the same for all curves. One could add this dimension through different line types within each method (i.e., within each line colour) to visually argue that, for example, some methods provide more accurate forecasts than others, even when using less information about the users.

Moreover, I think it would be great if the "known"/"unknown" split could be assigned by the user (or, even better, the distribution from which it is drawn). For example, this could be used to simulate a situation where some items are usually "consumed" very early in a user's history and thus should enter the "known" sample more often than they would under uniform sampling. (In particular, I conjecture that the performance of RECOM_POPULAR is overestimated under uniform sampling in such a situation.) A simple way to implement this would be a realRatingMatrix giving the order in which the items appear.

Any thoughts, ideas, or warnings?

gregreich avatar Aug 30 '21 17:08 gregreich

Can you add code examples of how you would use the mentioned vector of top-N list lengths, and examples of how you would want to call the other functions for given-x and for creating the "known"/"unknown" split? This will give me a better idea of what you would like.

mhahsler avatar Sep 04 '21 13:09 mhahsler

Hi Michael, I have to apologize - I think I was being unclear here. Moreover, for now my goal is more to start a discussion about the feasibility and meaningfulness of my aim; I'm happy to contribute later if it turns out to make sense.

In effect, I want to achieve two technically unrelated things, but both concern the sampling for recommender evaluation:

First, I want to be able to call evaluationScheme() with something like evaluationScheme(..., given = c(5, 10, 20)). Here is what should happen in the background: when creating the known portion for given = 5, the remaining max(c(5, 10, 20)) - 5 = 15 items should be withheld from unknown. If not - e.g., when running evaluationScheme() in a loop over the different levels of given - the unknown set is not the same across the runs. For many applications this will not change much. But if the number of rated items per user is small compared to the number of items, it will induce a significant distortion and create results that simply don't make sense (most probably because the denominator in the TPR becomes very small). BTW: I'm aware that the suggested syntax conflicts with providing an individual given parameter per observation; this is just to lay out the idea.
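For comparison, here is how this has to be done today, and what the proposed call might look like. The workaround below uses the existing recommenderlab API (evaluationScheme() with a scalar given); fixing the RNG seed keeps the train/test user split identical across runs, but the known/unknown item split is still re-drawn at each given level, which is exactly the problem described above. The vectorized call at the end is hypothetical - it is the proposed interface, not something the package currently supports.

```r
library(recommenderlab)
data(MovieLense)

## Current workaround: one evaluationScheme() call per given level.
## set.seed() keeps the test users the same, but the known/unknown
## item split is still drawn independently for each level.
schemes <- lapply(c(5, 10, 20), function(g) {
  set.seed(42)
  evaluationScheme(MovieLense, method = "split", train = 0.9,
                   given = g, goodRating = 4)
})

## Proposed (hypothetical) interface: one call producing nested
## known sets, so the unknown sets agree across levels.
# scheme <- evaluationScheme(MovieLense, method = "split", train = 0.9,
#                            given = c(5, 10, 20), goodRating = 4)
```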

Second, I think one should be able to supply the distribution used to sample items into known vs. unknown. This could be achieved in a manner similar to what is already an option in getTopNLists(..., randomise=TRUE), by providing the distribution as a realRatingMatrix.
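The sampling part of this is straightforward in base R via the prob argument of sample(); the sketch below uses made-up weights (earlier-consumed items weighted higher) purely for illustration of the idea, not the proposed recommenderlab interface:

```r
## Sketch of a non-uniform known/unknown split: items with higher
## weight (e.g., typically consumed early in a user's history) are
## more likely to land in "known". Weights here are illustrative.
items_rated  <- seq_len(20)
early_weight <- rev(seq_along(items_rated))  # earlier items weighted higher

set.seed(7)
known   <- sample(items_rated, size = 5, prob = early_weight)
unknown <- setdiff(items_rated, known)

stopifnot(length(known) == 5,
          length(intersect(known, unknown)) == 0)
```

In the proposal, these per-user weights would come from the supplied realRatingMatrix rather than being hard-coded.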

Does this make sense?

gregreich avatar Sep 04 '21 13:09 gregreich

These things make sense. I will label this issue as an enhancement.

mhahsler avatar Sep 06 '21 13:09 mhahsler