cbioportal-frontend

Filter studies by reference genome

Open BasLee opened this issue 3 years ago • 2 comments

Allow filtering by reference genome when searching for studies on the landing page. The current search filter setup can easily be extended with additional study fields. Implementation of RFC 63. Work in progress.

  • Add a search filter that allows filtering on one or more reference genomes (e.g. reference-genome:hg19,hg38)
  • Add a graphical interface to 'toggle' the reference genomes that are available in the loaded studies:

image

TODO:

  • Add end to end tests

Changes

  • Extend study search with a new filter syntax (i.e. <prefix>:<value>) in which value can be a comma-separated list
  • Create SearchClause and Phrase interfaces to simplify search query modifications
  • Replace external autocomplete package with a minimal bootstrap StudySearch search box component
  • Add dropdown form to StudySearch box that renders filters as defined in QueryParser
  • Add QueryParser that contains an extendable list of search filters (see example below), each defining:
    • search prefix (e.g. reference-genome)
    • relevant study fields (e.g. study.referenceGenome)
    • rendering in drop down form (e.g. as a checkbox)
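
As a rough illustration of the <prefix>:<value> syntax described above, the sketch below splits a query string into free-text and prefixed phrases. It is not the PR's QueryParser: the `ParsedPhrase` type, the `KNOWN_PREFIXES` list and the `parsePhrases` function are hypothetical names, and whitespace-separated tokens are an assumption.

```typescript
// Hypothetical shape; the PR's actual Phrase/SearchClause interfaces may differ.
interface ParsedPhrase {
  prefix: string | null; // e.g. 'reference-genome', or null for free text
  values: string[];      // e.g. ['hg19', 'hg38']
}

// Assumption: only prefixes listed here are treated as filters.
const KNOWN_PREFIXES: string[] = ['reference-genome'];

function parsePhrases(query: string): ParsedPhrase[] {
  return query
    .trim()
    .split(/\s+/)
    .filter(token => token.length > 0)
    .map((token): ParsedPhrase => {
      const colon = token.indexOf(':');
      const prefix = colon > 0 ? token.slice(0, colon) : null;
      if (prefix !== null && KNOWN_PREFIXES.indexOf(prefix) !== -1) {
        // The value part may be a comma-separated list.
        const values = token
          .slice(colon + 1)
          .split(',')
          .filter(v => v.length > 0);
        return { prefix, values };
      }
      // Anything else is kept as a plain free-text phrase.
      return { prefix: null, values: [token] };
    });
}
```

With these assumptions, `parsePhrases('melanoma reference-genome:hg19,hg38')` yields one free-text phrase and one `reference-genome` phrase carrying two values.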

Example search filter:

{
  phrasePrefix: 'reference-genome',
  nodeFields: ['referenceGenome'],
  form: {
    input: FilterCheckbox,
    options: [...referenceGenomes],
    label: 'Reference genome',
  },
}
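
To show how a definition like the one above might be applied, here is a minimal sketch that checks a study against the selected filter values through `nodeFields`. The `SearchFilter` and `StudyLike` types and the `matchesFilter` function are hypothetical, and case-insensitive exact matching is an assumption rather than the PR's confirmed behavior.

```typescript
// Hypothetical, trimmed-down version of the filter definition (form rendering omitted).
interface SearchFilter {
  phrasePrefix: string;
  nodeFields: string[];
}

// Assumption: a study is treated as a flat record of string fields.
type StudyLike = { [field: string]: string | undefined };

// Returns true when any of the filter's fields equals one of the requested values.
// An empty value list is treated as "no restriction".
function matchesFilter(
  study: StudyLike,
  filter: SearchFilter,
  values: string[]
): boolean {
  if (values.length === 0) {
    return true;
  }
  const wanted = values.map(v => v.toLowerCase());
  return filter.nodeFields.some(field => {
    const actual = study[field];
    return actual !== undefined && wanted.indexOf(actual.toLowerCase()) !== -1;
  });
}

const referenceGenomeFilter: SearchFilter = {
  phrasePrefix: 'reference-genome',
  nodeFields: ['referenceGenome'],
};
```

Keeping the matching logic generic over `nodeFields` is what would let further filters be added as data rather than as new code paths.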

Checks

  • Unit tests:
    • Phrase.spec.tsx
    • SearchClause.spec.tsx
    • CheckboxFilterField.spec.tsx
    • QueryParser.spec.ts
    • textQueryUtils.spec.ts
  • End to end tests:
    • TODO
  • Manual:
    • open index page;
    • start typing in search box;
    • or open dropdown filter form

BasLee avatar Jun 21 '22 08:06 BasLee

@BasLee did some manual testing of the feature. Seems good but for a couple things:

  1. In the production version, the example query dropdown opens on focus and I think that's important behavior b/c it informs the user. I know this was a specific request of the original product designers. The behavior is a little nuanced. I think it only opens on focus IF the field is empty.

  2. Escape key closes the menu. Probably not crucial, but nice.

  3. Can you put a colon after the "Example queries" title? "Example queries:"

  4. Can you nix the line beneath the example queries when there are no controls below it? image
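
The open-on-focus and Escape behavior from points 1 and 2 can be sketched as a small state transition function. This is only an illustration of the requested behavior, assuming the menu opens on focus only when the input is empty; `MenuState`, `MenuEvent` and `nextMenuState` are hypothetical names, not code from the PR.

```typescript
interface MenuState {
  open: boolean;
}

type MenuEvent =
  | { type: 'focus'; inputValue: string }
  | { type: 'keydown'; key: string };

function nextMenuState(state: MenuState, event: MenuEvent): MenuState {
  if (event.type === 'focus') {
    // Assumption: open on focus only if the search field is empty.
    return event.inputValue === '' ? { open: true } : state;
  }
  // Escape closes the menu; other keys leave it unchanged.
  return event.key === 'Escape' ? { open: false } : state;
}
```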

alisman avatar Jun 24 '22 13:06 alisman

@BasLee did some manual testing of the feature. Seems good but for a couple things: [..]

@alisman, @pvannierop : all these things were solved in 889494c, 231e9ed and 471a8ab I guess?

BasLee avatar Jul 25 '22 07:07 BasLee

Nice addition @BasLee !

I can't test the actual functionality in the public portal, but as @inodb suggested, the separator should not be present if there are no reference genome options.

In fact, I just agree with everything @inodb said :)

In terms of keeping 'reference-genome:' in the search bar - it's a really good point that it takes up a lot of space that doesn't really exist. @inodb were you proposing to just show 'hg19' rather than 'reference-genome:hg19'? I worry that won't scale well - imagine a filter for 'has-mRNA:True' - just showing 'True' wouldn't be informative. I could go both ways at this point, but maybe it's ok for now, and we just need to think about a more general solution as we expand the feature beyond reference genome.

tmazor avatar Sep 09 '22 21:09 tmazor

@inodb and @tmazor Thank you for reviewing! I think we already addressed the horizontal bar issue: will look into it. The search bar now shows a filter name and value, a 'gmail-like' syntax to allow filtering on specific fields, as described in the RFC and suggested by JJ and Sjoerd. The search bar might be a bit small for larger queries: as the number of filter options increases, we might want to give the search bar a more prominent place on the index page?

BasLee avatar Sep 12 '22 09:09 BasLee

Thanks so much @BasLee! I like the wider search bar!

One more bug I found:

  • [ ] Something broken with example filtering. Production: Screen Shot 2022-09-14 at 10 34 10 AM This PR keeps showing same examples: Screen Shot 2022-09-14 at 10 34 03 AM

@tmazor: In terms of keeping 'reference-genome:' in the search bar - it's a really good point that it takes up a lot of space that doesn't really exist. @inodb were you proposing to just show 'hg19' rather than 'reference-genome:hg19'? I worry that won't scale well - imagine a filter for 'has-mRNA:True' - just showing 'True' wouldn't be informative. I could go both ways at this point, but maybe it's ok for now, and we just need to think about a more general solution as we expand the feature beyond reference genome.

My thought was to remove it entirely (so no hg19 nor reference-genome: in the text field), only show the checkbox status in the dropdown, but I'm happy with the bigger search box as a solution for now. Thanks @BasLee ! I think the study text filter query language should eventually work in harmony with OQL

inodb avatar Sep 14 '22 14:09 inodb

@BasLee one more question: is there a use case to be able to select both hg19 + hg38? If not it might be cleaner to just list in the examples:

reference-genome:hg19
reference-genome:hg38

then it would also filter nicely when typing reference-...

inodb avatar Sep 14 '22 14:09 inodb

@BasLee one more question: is there a use case to be able to select both hg19 + hg38? If not it might be cleaner to just list in the examples:

reference-genome:hg19
reference-genome:hg38

then it would also filter nicely when typing reference-...

@inodb I think you want to show all studies by default? So that means selecting both hg19 and hg38 (when present)
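
One way to picture the checkbox-to-query relationship discussed here is the sketch below, which toggles a value in the selected list and serializes the result back to a search clause. `toggleValue` and `toClause` are hypothetical helpers; whether the PR omits the clause when every genome is selected is not stated in this thread, so this version always writes out the selected values.

```typescript
// Adds the value if it is not selected yet, removes it otherwise.
function toggleValue(selected: string[], value: string): string[] {
  return selected.indexOf(value) !== -1
    ? selected.filter(v => v !== value)
    : selected.concat(value);
}

// Serializes a selection to the <prefix>:<value> syntax; empty selection gives no clause.
function toClause(prefix: string, selected: string[]): string {
  return selected.length > 0 ? `${prefix}:${selected.join(',')}` : '';
}

// Default state in this sketch: both genomes selected, so all studies are shown.
const defaultSelection: string[] = ['hg19', 'hg38'];
const onlyHg19 = toggleValue(defaultSelection, 'hg38');
const clause = toClause('reference-genome', onlyHg19); // 'reference-genome:hg19'
```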

BasLee avatar Sep 16 '22 06:09 BasLee

One more bug I found:

@inodb: fixed in 9db3773

BasLee avatar Sep 16 '22 07:09 BasLee

Thanks @BasLee ! Looks like there's still an issue vs production for queries not part of example queries:

PR: Screen Shot 2022-09-16 at 10 44 03 AM

Production: Screen Shot 2022-09-16 at 10 44 49 AM

I think these should look identical

Note tho: the production behavior is also weird. Like it's kind of odd that it only filters example queries. We can follow up with that later, but I think this PR should not introduce changes to current production behavior beyond adding the reference-genome filtering

inodb avatar Sep 16 '22 14:09 inodb

Looks like there's still an issue vs production for queries not part of example queries [..]

@inodb a bit late, reading this just now... But yes, let's discuss what it should look like, with a follow-up PR!

BasLee avatar Sep 19 '22 10:09 BasLee