Change how we handle SemMedDB edges in ranking

Open finnagin opened this issue 4 years ago • 2 comments

SemMedDB seems to return many odd edges, and the resulting bad results get pushed higher in the rankings.

Lots of potential options to address this:

[x] Use subject and object confidence scores in ranking
[ ] Use subject and object novelty in ranking
[x] Condense SemMedDB edges into one edge
[ ] SemMedDB antonym handling

Oct 06 '21 20:10 finnagin

Should look into averaging SemMedDB edge publication counts using:

harmonic mean
geometric mean
median
arithmetic mean
L-infinity (i.e., the maximum)
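
For comparing the options above, here is a minimal sketch of how each one would collapse the publication counts from several SemMedDB edges into a single number. The function name `combine_publication_counts` and the `method` strings are hypothetical and are not taken from the codebase. It assumes every count is a positive number, and it treats L-infinity as the largest count.

```python
import statistics


def combine_publication_counts(counts, method="arithmetic"):
    """Collapse per-edge publication counts into one value.

    Hypothetical helper for illustration only; assumes all counts are > 0.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    if any(c <= 0 for c in counts):
        raise ValueError("counts must all be positive")

    if method == "harmonic":
        return statistics.harmonic_mean(counts)
    if method == "geometric":
        return statistics.geometric_mean(counts)
    if method == "median":
        return statistics.median(counts)
    if method == "arithmetic":
        return statistics.fmean(counts)
    if method == "l_infinity":
        # For positive counts, the L-infinity norm is just the maximum.
        return max(counts)
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    example_counts = [1, 4, 16]  # made-up counts, not real SemMedDB data
    for name in ("harmonic", "geometric", "median", "arithmetic", "l_infinity"):
        print(name, combine_publication_counts(example_counts, name))
```

The harmonic and geometric means are pulled toward the smallest count, so one weakly supported edge drags the combined value down, whereas L-infinity only reflects the best-supported edge.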

Nov 01 '21 17:11 finnagin

On branch issue1695. We should test the different averaging methods when combining multiple SemMedDB edges and see which ones we like. Issue #1684 is needed for the other items.

Jun 05 '22 19:06 dkoslicki

Closing, as @mfl15's approach for filtering SemMedDB will likely fix this issue.

Jun 19 '24 16:06 dkoslicki