Add MRR to quepid as a communal scorer.

Open david-fisher opened this issue 2 years ago • 10 comments

Description

Add reciprocal rank as a communal Scorer

Motivation and Context

Closes #523. Adds a useful metric for known-item search evaluation.

How Has This Been Tested?

Local install of quepid started with bin/setup_docker followed by bin/docker server

Screenshots or GIFs (if appropriate):

[Screen recording: Screen_Recording_2022-06-03_at_3_20_10_PM_AdobeCreativeCloudExpress]
[Screenshot: Screen Shot 2022-06-03 at 4 02 00 PM]

Types of changes

  • [] Bug fix (non-breaking change which fixes an issue)
  • [x] Improvement (non-breaking change which improves existing functionality)
  • [x] New feature (non-breaking change which adds new functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [x] My code follows the code style of this project.
  • [] My change requires a change to the documentation.
  • [] I have updated the documentation accordingly.
  • [x] I have read the CONTRIBUTING document.
  • [] I have added tests to cover my changes.
  • [x] All new and existing tests passed.

david-fisher avatar Jun 03 '22 20:06 david-fisher

I took the ticket name out of the title, because it gets confusing that the ticket isn't the pr, if that makes sense...

epugh avatar Jun 03 '22 21:06 epugh

Is this MRR or RR? It appears to say RR in the code?

epugh avatar Jun 03 '22 21:06 epugh

Looks like this should either be rr@10.js or mrr@10.js???

rr@10 for a single query, mrr@10 for a set of queries.

Much like AP@10 is actually MAP@10 for a collection of queries.

david-fisher avatar Jun 04 '22 16:06 david-fisher

Looks like this should either be rr@10.js or mrr@10.js???

Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

david-fisher avatar Jun 07 '22 13:06 david-fisher

Looks like this should either be rr@10.js or mrr@10.js???

Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

The title of the issue is "Add MRR", so maybe it should be "Add RR"?

epugh avatar Jun 07 '22 13:06 epugh

Looks like this should either be rr@10.js or mrr@10.js???

Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

The title of the issue is "Add MRR", so maybe it should be "Add RR"?

RR for a single query, MRR for a set of queries. Quepid will display MRR: the JavaScript computes RR for each individual query, and Quepid averages those values together. In general, the aggregate name is used when referring to a metric that is being averaged across queries. Either name is fine in the issue.

david-fisher avatar Jun 07 '22 13:06 david-fisher

Looks like this should either be rr@10.js or mrr@10.js???

Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

The title of the issue is "Add MRR", so maybe it should be "Add RR"?

RR for a single query, MRR for a set of queries. Quepid will display MRR: the JavaScript computes RR for each individual query, and Quepid averages those values together. In general, the aggregate name is used when referring to a metric that is being averaged across queries. Either name is fine in the issue.

I guess I may need to take this on faith. When we refer to DCG, we have a file named DCG.js. I would assume that if we are referring to MRR, we would have a file named MRR.js, and if there was a separate metric called RR, then it would have a scorer named RR.js?

I don't mean to be obtuse here, but the goal of Quepid is to make metrics etc. simple and something I can explain to everyday users. So I feel like if we are adding MRR to Quepid, then the file should be called mrr.js.

epugh avatar Jun 07 '22 13:06 epugh

Okay, now I am super confused. Is this MRR or RR? Or does this NOT follow the pattern that we have of P, AP, DCG, NDCG etc, and is a new naming pattern?

epugh avatar Jun 07 '22 13:06 epugh

I guess I may need to take this on faith. When we refer to DCG, we have a file named DCG.js. I would assume that if we are referring to MRR, we would have a file named MRR.js, and if there was a separate metric called RR, then it would have a scorer named RR.js?

There is only one metric, reciprocal rank. When we average the reciprocal rank scores for multiple queries, the result is called mean reciprocal rank.

From the perspective of what the code is computing, rr@10.js computes the reciprocal rank for a single query. The Quepid app uses that RR value to average across the set of queries in the collection to produce the MRR score for the full set.
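
To make the RR versus MRR distinction concrete, here is a minimal standalone sketch. It is not the scorer from this PR and does not use Quepid's scorer API: the function names `reciprocalRankAtK` and `meanReciprocalRankAtK` are hypothetical, the input is assumed to be a plain array of numeric ratings in result order, and a rating greater than 0 is assumed to mean "relevant".

```javascript
// Hypothetical helpers for illustration only; not Quepid's scorer API.
// ratings: numeric judgments in result order (index 0 is rank 1).
// Assumption: a document is relevant when its rating is greater than 0.
function reciprocalRankAtK(ratings, k = 10) {
  const limit = Math.min(k, ratings.length);
  for (let i = 0; i < limit; i++) {
    if (ratings[i] > 0) {
      return 1 / (i + 1); // reciprocal of the first relevant rank
    }
  }
  return 0; // no relevant document in the top k
}

// ratingsPerQuery: one ratings array per query in the collection.
// MRR is the arithmetic mean of the per-query RR values.
function meanReciprocalRankAtK(ratingsPerQuery, k = 10) {
  if (ratingsPerQuery.length === 0) {
    return 0;
  }
  const total = ratingsPerQuery.reduce(
    (sum, ratings) => sum + reciprocalRankAtK(ratings, k),
    0
  );
  return total / ratingsPerQuery.length;
}
```

The first function plays the role of the per-query scorer; the second stands in for the averaging step that the app performs across the collection.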

The same is true of the computation called AP@10: it computes a single value for a query, and those values are then averaged to produce what should be called MAP@10 in the Quepid display. One metric, averaged across multiple queries.
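
The AP to MAP relationship can be sketched the same way. This is an illustration under stated assumptions, not Quepid's ap@10 scorer: the names `averagePrecisionAtK` and `meanAveragePrecisionAtK` are hypothetical, a rating greater than 0 is assumed to mean "relevant", and AP is normalized here by the number of relevant documents found in the top k. Other definitions divide by the total number of relevant documents for the query, which gives different values.

```javascript
// Hypothetical helpers for illustration only; not Quepid's scorer API.
// ratings: numeric judgments in result order (index 0 is rank 1).
// Assumption: a document is relevant when its rating is greater than 0.
// Assumption: normalize by the relevant documents seen in the top k.
function averagePrecisionAtK(ratings, k = 10) {
  const limit = Math.min(k, ratings.length);
  let relevantSeen = 0;
  let precisionSum = 0;
  for (let i = 0; i < limit; i++) {
    if (ratings[i] > 0) {
      relevantSeen += 1;
      precisionSum += relevantSeen / (i + 1); // precision at this rank
    }
  }
  return relevantSeen === 0 ? 0 : precisionSum / relevantSeen;
}

// MAP is the arithmetic mean of the per-query AP values.
function meanAveragePrecisionAtK(ratingsPerQuery, k = 10) {
  if (ratingsPerQuery.length === 0) {
    return 0;
  }
  const total = ratingsPerQuery.reduce(
    (sum, ratings) => sum + averagePrecisionAtK(ratings, k),
    0
  );
  return total / ratingsPerQuery.length;
}
```

As with RR, only the per-query function corresponds to what a scorer computes; the mean is taken afterwards across the set of queries.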

david-fisher avatar Jun 07 '22 13:06 david-fisher

Okay, now I am super confused. Is this MRR or RR? Or does this NOT follow the pattern that we have of P, AP, DCG, NDCG etc, and is a new naming pattern?

AP should be named MAP, mean average precision. All of the metric names have been established by the evaluation metric community. The names used by trec_eval should be considered canonical. Note, nDCG does not get called MnDCG when averaged. P@k does not get called MP@k when averaged. Only Mean Reciprocal Rank and Mean Average Precision have the M named variant for the aggregate score across a set of queries.

david-fisher avatar Jun 07 '22 14:06 david-fisher