quepid
Add MRR to quepid as a communal scorer.
Description
Add reciprocal rank as a communal Scorer
Motivation and Context
Closes #523. Adds a useful metric for known-item search evaluation.
How Has This Been Tested?
Tested on a local install of Quepid, started with `bin/setup_docker` followed by `bin/docker server`.
Screenshots or GIFs (if appropriate):
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] Improvement (non-breaking change which improves existing functionality)
- [x] New feature (non-breaking change which adds new functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [x] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [x] I have read the CONTRIBUTING document.
- [ ] I have added tests to cover my changes.
- [x] All new and existing tests passed.
I took the ticket name out of the title because it gets confusing when the ticket isn't the PR, if that makes sense...
Is this MRR or RR? It appears to say RR in the code?
Looks like this should either be rr@10.js or mrr@10.js???
rr@10 for a single query, mrr@10 for a set of queries.
Much like AP@10 is actually MAP@10 for a collection of queries.
Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?
The title of the issue is "Add MRR", so maybe it should be "Add RR"?
RR for a single query, MRR for a set of queries. Quepid will display MRR: the JavaScript computes RR for each individual query, and Quepid averages those values. In general, the aggregate name is used when referring to a metric that is averaged across queries. Either name is fine in the issue.
I guess I may need to take this on faith. When we refer to DCG, we have a file named DCG.js. I would assume that if we are referring to MRR, we would have a file named MRR.js, and if there was a separate metric called RR, then it would have a scorer named RR.js?
I don't mean to be obtuse here, but the goal of Quepid is to make metrics, etc. simple and something I can explain to everyday users. So I feel like if we are adding MRR to Quepid, then the file should be called mrr.js.
Okay, now I am super confused. Is this MRR or RR? Or does this NOT follow the pattern that we have of P, AP, DCG, NDCG etc, and is a new naming pattern?
There is only one metric, reciprocal rank. When we average the reciprocal rank scores for multiple queries, the result is called mean reciprocal rank.
From the perspective of what the code is computing, rr@10.js computes the reciprocal rank for a single query. The Quepid app uses that RR value to average across the set of queries in the collection to produce the MRR score for the full set.
The same is true of the computation called AP@10: it computes a single value per query, which is then averaged to produce what should be called MAP@10 in the Quepid display. One metric, averaged across multiple queries.
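To make the RR versus MRR distinction concrete, here is a minimal sketch in plain JavaScript. It is not the scorer file from this PR and does not use Quepid's scorer API: the function names `reciprocalRank` and `meanReciprocalRank`, the array-of-ratings input, and the "rating greater than 0 counts as relevant" rule are all assumptions made for illustration.

```javascript
// Hypothetical helpers for illustration only; not Quepid's scorer API.

// Reciprocal rank for ONE query: 1 / (rank of the first relevant result
// within the top k), or 0 if none of the top k results is relevant.
// `ratings` lists the ratings of the results in ranked order; treating
// a rating > 0 as relevant is an assumption of this sketch.
function reciprocalRank(ratings, k = 10) {
  const limit = Math.min(k, ratings.length);
  for (let i = 0; i < limit; i++) {
    if (ratings[i] > 0) {
      return 1 / (i + 1); // ranks are 1-based
    }
  }
  return 0;
}

// Mean reciprocal rank for a SET of queries: the plain average of the
// per-query RR values. An empty set scores 0 here by convention.
function meanReciprocalRank(ratingsPerQuery, k = 10) {
  if (ratingsPerQuery.length === 0) {
    return 0;
  }
  const total = ratingsPerQuery.reduce(
    (sum, ratings) => sum + reciprocalRank(ratings, k),
    0
  );
  return total / ratingsPerQuery.length;
}

const queries = [
  [0, 1, 0], // first relevant result at rank 2, RR = 0.5
  [1, 0, 0], // first relevant result at rank 1, RR = 1
  [0, 0, 0], // no relevant result, RR = 0
];
console.log(meanReciprocalRank(queries)); // 0.5
```

The per-query function is what a scorer named rr@10 would compute; the averaging step is what turns the displayed number into MRR@10.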
AP should be named MAP, mean average precision. All of the metric names have been established by the evaluation metric community. The names used by trec_eval should be considered canonical. Note, nDCG does not get called MnDCG when averaged. P@k does not get called MP@k when averaged. Only Mean Reciprocal Rank and Mean Average Precision have the M named variant for the aggregate score across a set of queries.