
Option to show all thesauri values in filter even when zero entities correspond

Open llfinch opened this issue 2 years ago • 1 comments

Is your feature request related to a problem? Please describe.

This comes at the suggestion of a partner. Currently in Uwazi, you create a thesaurus with values, which you then hook into a select or multiselect property, which you can then opt to use as a filter. If there are values in the thesaurus that are not applied to any entities in your collection, those values do not show at all in the filters list. This is nice in some circumstances because it allows for a "clean" filters list.

However, there are other research scenarios where it would be better not to hide the "unused" values and instead explicitly show something like "Value: 0 entities". It can be important for the researcher to know all of the possible values in the thesaurus and to see explicitly that there are zero instances of them among the entities of a database, because the absence of entities with that value is important information too.

For example, take a database of refugee resettlement policy from around the world. Imagine we have a thesaurus with four possible topics: Health; Education; Poverty; and Gender. Now imagine that the policies in the database only mention Health and Education; none mention Poverty or Gender. As a researcher visiting the database, I'll only see Health and Education in the filters sidebar. Unless the collection has a specially built Methodology page that explains this, I don't have any idea that the curators of the collection have included Poverty and Gender as potential values too, but that none of the policies mention these values. I might simply think that the curators only focus on Health and Education to the exclusion of everything else. But I'd be missing the real story here: that there's a huge documented gap in refugee resettlement policy, in that the policies are not addressing issues related to poverty or gender explicitly.

If we were to show something like this, we would offer valuable information to that researcher in a way that resolves their doubts:

  • Education: 29 entities
  • Health: 12 entities
  • Poverty: 0 entities
  • Gender: 0 entities

Describe the solution you'd like

Maybe a checkbox in the configuration of a select or multiselect property that asks the admin user whether they want to "Show thesaurus values in the filter list even when zero entities correspond to them".

llfinch, Aug 31 '22 16:08

Just a note for when this feature is being discussed: if we go this route, it really must be optional, as there are scenarios where we could have hundreds or even thousands of values, so hiding values with 0 entities is probably a good default.

Another important thing to consider: I don't think Elasticsearch is aware of "values that could get assigned but aren't there", so this is probably something we need to ADD to the search after the results are returned. That is, ES returns whatever it found, and we would need to go through all the possible thesaurus values and inject back those that are missing, each with a count of 0.
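A rough sketch of what that post-processing step could look like, in TypeScript. This is not Uwazi's actual code: the `ThesaurusValue` and `FilterOption` shapes, the `buildFilterOptions` function, and the `showEmptyValues` flag are all hypothetical names used for illustration. The only thing assumed about Elasticsearch is that aggregation buckets carry a `key` and a `doc_count`.

```typescript
// Hypothetical shapes for illustration; not Uwazi's real types.
interface ThesaurusValue {
  id: string;
  label: string;
}

// Elasticsearch terms-aggregation buckets expose `key` and `doc_count`.
interface Bucket {
  key: string;
  doc_count: number;
}

interface FilterOption {
  id: string;
  label: string;
  count: number;
}

// Merge the full list of thesaurus values with the buckets ES returned.
// Values that ES did not return get a count of 0. They are only kept
// when the (hypothetical) per-property option `showEmptyValues` is on,
// so the current "hide values with 0" behavior stays the default.
function buildFilterOptions(
  thesaurusValues: ThesaurusValue[],
  buckets: Bucket[],
  showEmptyValues: boolean = false
): FilterOption[] {
  const counts = new Map<string, number>();
  for (const bucket of buckets) {
    counts.set(bucket.key, bucket.doc_count);
  }

  return thesaurusValues
    .map((value) => ({
      id: value.id,
      label: value.label,
      count: counts.get(value.id) ?? 0,
    }))
    .filter((option) => showEmptyValues || option.count > 0);
}

// Usage with the example from the issue description.
const topics: ThesaurusValue[] = [
  { id: "education", label: "Education" },
  { id: "health", label: "Health" },
  { id: "poverty", label: "Poverty" },
  { id: "gender", label: "Gender" },
];
const esBuckets: Bucket[] = [
  { key: "education", doc_count: 29 },
  { key: "health", doc_count: 12 },
];

const withEmpty = buildFilterOptions(topics, esBuckets, true);
const withoutEmpty = buildFilterOptions(topics, esBuckets);
```

With the flag on, `withEmpty` contains all four topics, with Poverty and Gender at a count of 0; with the default, `withoutEmpty` contains only Education and Health. Because the merge walks the thesaurus once and looks counts up in a map, the cost grows linearly with the number of values, which matters for the thesauri with thousands of entries mentioned above.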

I do agree there are some scenarios where this information is useful and "tells the story" of the data.

RafaPolit, Aug 31 '22 17:08