goci

Improve searching on parent EFO terms

Open ljwh2 opened this issue 2 years ago • 0 comments

At the moment parent terms are only available in the search UI if they have been added to the database, i.e. directly annotated to a study. It would be useful for users if we pulled in all parent terms from EFO to enable better search functionality.

When searching by trait, I want to be able to easily view datasets grouped by a high-level trait, regardless of whether that trait has any direct annotations.

At the moment, a user can search for a high-level EFO term only if at least one study or association in the Catalog is annotated with that term. For example, the term “cancer” (EFO_0000311) is searchable because some studies are annotated with cancer, which lets users pull data for all types of cancer. However, other disease/trait groupings are not available. For example, neuromuscular disease (EFO_1001902) has several child terms (e.g. peripheral neuropathy EFO_0003100, vestibular disease EFO_0009691) that are annotated to GWAS data, but the parent term itself is not indexed, so users cannot view data for this group of traits.

The slim Solr should pull in not only terms that are used in the Catalog but also their parent terms, up to a certain level. Two options: go as high as “disease” or “measurement” but no higher; or include any term in EFO that has at least 2 child terms used in the Catalog (so neuromuscular disease EFO_1001902 would qualify via peripheral neuropathy EFO_0003100 and vestibular disease EFO_0009691).

This needs more detailed specification for the different branches of EFO, taking into account the capabilities of the search UI in terms of data loading speed. It should be done after we have improved the performance of the UI.
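To make the proposed rule easier to discuss, here is a rough sketch of how the extra parent terms might be selected. It is not the existing indexing code. The `PARENTS` map is a hand-written toy hierarchy, `DISEASE` and `ROOT` are placeholder IDs rather than real EFO accessions, and the function names are hypothetical. It also assumes "child terms" means any descendant used in the Catalog, not only direct children, which is one of the points that still needs specifying.

```python
from collections import defaultdict

# Toy hierarchy for illustration only: child term -> list of parent terms.
# "DISEASE" and "ROOT" are placeholders, not real EFO IDs.
PARENTS = {
    "EFO_0003100": ["EFO_1001902"],  # peripheral neuropathy
    "EFO_0009691": ["EFO_1001902"],  # vestibular disease
    "EFO_1001902": ["DISEASE"],      # neuromuscular disease
    "EFO_0000311": ["DISEASE"],      # cancer
    "DISEASE": ["ROOT"],
}


def ancestors_up_to(term, parents, ceiling):
    """Collect ancestors of `term`, never climbing above a ceiling term."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in ceiling:
            continue  # ceiling terms are kept, but their parents are not visited
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def parents_to_index(annotated, parents, ceiling, min_used=2):
    """Return unannotated ancestors with at least `min_used` annotated descendants."""
    used_descendants = defaultdict(set)
    for term in annotated:
        for ancestor in ancestors_up_to(term, parents, ceiling):
            used_descendants[ancestor].add(term)
    return {
        ancestor
        for ancestor, used in used_descendants.items()
        if len(used) >= min_used and ancestor not in annotated
    }


# Terms assumed to be directly annotated in the Catalog for this example.
annotated_terms = {"EFO_0003100", "EFO_0009691", "EFO_0000311"}
extra_terms = parents_to_index(annotated_terms, PARENTS, ceiling={"DISEASE"})
print(sorted(extra_terms))
```

With this toy data the sketch selects neuromuscular disease (EFO_1001902) and the placeholder ceiling term, and stops before `ROOT`. Raising `min_used` or moving the ceiling is a cheap way to trade grouping coverage against index size, which matters given the loading-speed concern.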

ljwh2 • Mar 09 '22 18:03