We suspect that short titles are parsed less accurately. Is there a way to check whether short titles are over-represented among the titles that fail to be matched?

Open aoifespenge opened this issue 5 years ago • 0 comments

@aoifespenge commented on Thu Jul 18 2019

@nsorros commented on Fri Jul 19 2019

To a certain extent that is a feature, as we have decided to filter out any matches on short titles. What we can do is break down accuracy by title length, grouped into meaningful ranges, e.g. 20-25, 26-30, etc.
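A minimal sketch of that breakdown, assuming the evaluation data is available as `(title, matched)` pairs and that title length is measured in characters. The function name `accuracy_by_title_length`, its parameters, and the input format are hypothetical and not taken from the reach codebase. The bins here are of equal width (20-24, 25-29, ...), so they differ slightly from the ranges mentioned above; titles shorter than `start` go into one catch-all bucket.

```python
from collections import defaultdict


def accuracy_by_title_length(records, bin_width=5, start=20):
    """Group (title, matched) pairs by title length and report accuracy per bucket.

    Returns a dict mapping (low, high) character-length ranges to
    (number_of_titles, fraction_matched). All names are illustrative.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [matched, total]
    for title, matched in records:
        length = len(title)
        if length < start:
            # Everything below the first bin goes into one "short" bucket.
            bucket = (0, start - 1)
        else:
            low = start + ((length - start) // bin_width) * bin_width
            bucket = (low, low + bin_width - 1)
        counts[bucket][0] += int(bool(matched))
        counts[bucket][1] += 1
    return {
        bucket: (total, hits / total)
        for bucket, (hits, total) in sorted(counts.items())
    }
```

Calling it on a labelled evaluation set gives one row per length range, which makes it easy to see whether accuracy drops off for the shorter buckets and whether the drop coincides with the short-title filter.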

The question of which criterion to use for bias or accuracy also applies here.

@aoifespenge commented on Thu Feb 13 2020

The question remains whether we think short title length is something worth testing for bias. @nsorros
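If it is judged worth testing, one possible framing is to compare the failure rate of short titles with that of longer titles using a pooled two-proportion z-test. The sketch below is only an illustration under stated assumptions: the `(title, matched)` input format, the function name `short_title_failure_test`, and the 20-character threshold are all hypothetical, and the normal approximation needs reasonably large groups. Since matches on short titles are filtered out by design, a higher failure rate for short titles would be partly expected.

```python
import math


def short_title_failure_test(records, short_threshold=20):
    """Compare match-failure rates for short vs. longer titles.

    records: iterable of (title, matched) pairs (hypothetical format).
    Returns (failure_rate_short, failure_rate_long, z, one_sided_p), where
    the p-value is for the alternative "short titles fail more often".
    """
    n_short = n_long = fail_short = fail_long = 0
    for title, matched in records:
        if len(title) < short_threshold:
            n_short += 1
            fail_short += int(not matched)
        else:
            n_long += 1
            fail_long += int(not matched)
    if n_short == 0 or n_long == 0:
        raise ValueError("need at least one short and one longer title")

    rate_short = fail_short / n_short
    rate_long = fail_long / n_long
    # Pooled failure rate under the null hypothesis of equal rates.
    pooled = (fail_short + fail_long) / (n_short + n_long)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_short + 1 / n_long))
    if std_err == 0:
        # All titles failed or all matched: the test is undefined.
        return rate_short, rate_long, float("nan"), float("nan")

    z = (rate_short - rate_long) / std_err
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for standard normal
    return rate_short, rate_long, z, p_value
```

Comparing failure rates between the two groups tests the same association as asking whether short titles are over-represented among the failures, just viewed from the other direction.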

@nsorros commented on Mon Feb 17 2020

Similarly, this can be consolidated into a next or todo.

@ivyleavedtoadflax commented on Mon Feb 17 2020

It's relatively trivial to add this into the fairness assessment https://github.com/wellcometrust/reach/issues/360, but will also be blocked by https://github.com/wellcometrust/reach/issues/48 in the meantime.

Feb 26 '20 11:02 aoifespenge
