
Named Entities' filters: missing items when 'Contextualize' is ticked

Open Soliine opened this issue 2 years ago • 2 comments

Describe the bug When ticking "Contextualize" in the Named Entities filters (People, Organizations, Locations) on one of our projects, no named entity items appear, whereas all named entities should appear.

To Reproduce

  1. Go to a project (not to the demo website)
  2. Open a Named Entity filter (either People, Organizations or Locations)
  3. Click on 'Contextualize'
  4. See error: no items appear in the filter
  5. Note that, strangely, this bug does NOT occur on the demo website (https://datashare-demo.icij.org/#/?index=luxleaks&q=%2a), while it does occur on our server in our projects
  6. The bug does NOT occur on other filters (File types, Languages, Extraction levels, Indexing dates)

Expected behavior All Named Entities should appear when no other filter is applied and when Contextualize is ticked.

Desktop:

  • OS: iOS
  • Browser: Chrome and Firefox
  • Version: Version 100.0.4896.60 (Official Build) (x86_64) (Chrome) and 98.0.2 (64 bits) (Firefox)

Soliine, Apr 21 '22 08:04

Current status

Contextualize works properly on document attributes (languages, file types...) because each document has only one value per attribute. For named entities (and maybe tags) it does not work properly, because a single document can contain many NEs.

Bug 1

Currently

When contextualizing on one NE (e.g. "NE1"), the filter only shows that one NE. Instead, it should display all the NEs found in the documents that mention "NE1".

Desired behavior example

  • We have 10 documents that mention 20 NEs in total.
  • "NE1" is found in 2 documents, "doc1" and "doc2".
  • "doc1" has one NE ("NE1") and "doc2" has 5 NEs (including "NE1"), so when contextualizing we should display 5 NEs, with "NE1" checked.
  • Among these 5 NEs, say we have "NE2" and we check it: the search will then look for documents that mention "NE1" OR "NE2".

https://github.com/ICIJ/datashare-client/blob/master/src/store/filters/FilterNamedEntity.js#L54-L64
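
To make the desired behavior concrete, here is a minimal in-memory sketch in plain JavaScript. It is not the code from FilterNamedEntity.js: the `docs` map and the `contextualizedEntities` and `matchingDocumentIds` helpers are hypothetical names used only for illustration, and the real filter would get this result from an Elasticsearch aggregation rather than from arrays.

```javascript
// Hypothetical data: each document id maps to the named entities (NEs) it mentions.
const docs = {
  doc1: ['NE1'],
  doc2: ['NE1', 'NE2', 'NE3', 'NE4', 'NE5'],
  doc3: ['NE6', 'NE7']
}

// Ids of the documents that mention at least one of the selected NEs (OR semantics).
function matchingDocumentIds (docs, selected) {
  return Object.keys(docs).filter(id =>
    selected.some(ne => docs[id].includes(ne)))
}

// Items the contextualized filter should display: every NE found in the
// matching documents, with the selected ones flagged as checked.
function contextualizedEntities (docs, selected) {
  const names = new Set(
    matchingDocumentIds(docs, selected).flatMap(id => docs[id]))
  return [...names].sort().map(name => ({
    name,
    checked: selected.includes(name)
  }))
}

// Selecting "NE1" should list the 5 NEs of doc1 and doc2, not just "NE1".
console.log(contextualizedEntities(docs, ['NE1']))
// Also checking "NE2" keeps the search on documents mentioning NE1 OR NE2.
console.log(matchingDocumentIds(docs, ['NE1', 'NE2']))
```

With this data, selecting "NE1" yields five items, and "NE6" and "NE7" stay hidden because "doc3" mentions neither selected NE.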

Bug 2

Currently, adding a query string (from the search bar, for example) to the search while a NE is checked in contextualize mode does not work: the ES query string request is applied to the named entities rather than to the documents, which is why the search returns nothing.

https://github.com/ICIJ/datashare-client/blob/master/src/api/elasticsearch.js#L91-L94
https://github.com/ICIJ/datashare-client/blob/master/src/api/elasticsearch.js#L75-L84
https://github.com/ICIJ/datashare-client/blob/master/src/store/filters/FilterNamedEntity.js#L75-L81
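
One possible direction, sketched under assumptions, is to wrap the search bar query in an Elasticsearch `has_parent` query so that it is evaluated against the parent documents while the hits remain named entities. This assumes named entities are indexed as children of documents through a join field whose parent type is called `Document`. The `parentDocumentQuery` helper is a hypothetical name, not a function from elasticsearch.js.

```javascript
// Build a request body that returns named-entity hits whose PARENT document
// matches the search bar query string.
// Assumption: the index uses a join field with a parent type named 'Document'.
function parentDocumentQuery (queryString, parentType = 'Document') {
  return {
    query: {
      has_parent: {
        parent_type: parentType,
        // The query string is applied to the documents, not to the NEs.
        query: {
          query_string: { query: queryString }
        }
      }
    }
  }
}

console.log(JSON.stringify(parentDocumentQuery('tax AND haven'), null, 2))
```

Whether this fits the existing query builder depends on how the filters compose their bool clauses, which this sketch does not cover.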

This issue requires a deep modification of the named entity filters. Putting it on hold for now.

caro3801, May 04 '22 09:05

Awaiting filter refactoring

Soliine, May 18 '22 09:05

Done with issue #818

mvanzalu, Sep 14 '22 09:09