richie
Write & maintain comprehensive integration tests for the search engine
Feature Request
Is your feature request related to a problem or unsupported use case? Please describe.
As we improve the search engine and near production releases, avoiding regressions becomes key to making steady progress.
There are currently no integration tests for the "search" app, and in particular none for the main course search endpoint. New "tweaks" could easily reintroduce previously fixed issues without it being obvious to the developers (due to the third-party nature of the Elasticsearch integration).
Describe the solution you'd like
We should write extensive integration tests for the main course search endpoint. This should help us lock in the broad behaviors and special cases we have already handled, and avoid regressions when fixing other issues or refactoring the code.
The following is a work-in-progress list of the necessary tests.
- Organizations filter:
  - [x] filtering by an organization (org) surfaces only courses linked to this org;
  - [x] filtering by two (or more) orgs surfaces only courses linked to any of those;
- Categories filter:
  - [ ] filtering by a category (cat) surfaces only courses linked to this cat, or any of its children;
  - [ ] filtering by two (or more) cats surfaces only courses linked to any of those, or any of their children;
  - [ ] filtering by a child cat does not surface courses that are not linked to it, even if they are linked to its parent or siblings;
- Custom filters:
  - [ ] filtering by a custom filter surfaces only courses that match its filtering query fragment;
  - [ ] filtering by two (or more) custom filters surfaces only courses that match any of their filtering query fragments;
- Full-text search:
  - [ ] searching for some text surfaces only courses that contain it in their description or title;
  - [ ] searching for text with different case gives the same results (surfaces courses that contain it no matter the case);
  - [x] searching for text with or without diacritics gives the same results (surfaces courses that contain either the unaccented version or the version with diacritics);
  - [ ] EXCEPT that searching for a cat or org name in full-text search also surfaces courses linked to that cat/org that do not contain the name in their title/description, but with (generally?) a lower score than courses that contain the name in their title/description, even if those courses are not linked to the cat/org;
- Combined searches:
  - [ ] combining several of those kinds of search modes only surfaces courses that match at least once in each mode (e.g. (cat A || cat B) && (org X) && (full-text));
- Default:
  - [ ] omitting all filters surfaces all courses.
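To make the checklist above easier to turn into assertions, here is a minimal sketch of the expected matching rules as a pure-Python reference function. It is not the search engine's implementation: `Course`, `PARENTS`, `fold` and `expected_matches` are hypothetical names, the category tree is invented for illustration, and the sketch deliberately leaves out custom filters, scoring and the cat/org-name exception for full-text search.

```python
import unicodedata
from dataclasses import dataclass, field


def fold(text):
    """Lowercase and strip diacritics so 'Économie' and 'economie' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


@dataclass
class Course:
    """Hypothetical in-memory course, not the project's model."""

    title: str
    description: str = ""
    organizations: set = field(default_factory=set)
    categories: set = field(default_factory=set)


# Hypothetical category tree, stored as child -> parent.
PARENTS = {"cat-a1": "cat-a", "cat-a2": "cat-a"}


def with_ancestors(category):
    """Return the category together with all of its ancestors."""
    found = {category}
    while category in PARENTS:
        category = PARENTS[category]
        found.add(category)
    return found


def expected_matches(courses, organizations=(), categories=(), query=""):
    """OR within each search mode, AND across modes; no filter returns everything."""
    results = []
    for course in courses:
        if organizations and not course.organizations & set(organizations):
            continue
        if categories:
            # A course matches a category when it is linked to that category
            # or to one of its descendants, never to a parent or sibling only.
            linked = set()
            for cat in course.categories:
                linked |= with_ancestors(cat)
            if not linked & set(categories):
                continue
        if query and fold(query) not in fold(course.title + " " + course.description):
            continue
        results.append(course)
    return results
```

An integration test could build the same fixtures, index them, and compare the endpoint's results against what such a reference function returns for each combination of parameters.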
We also need to ensure facet counts (and soon, filter definitions) produce the expected output for all of the above cases.
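As a rough illustration of what "expected output" could mean for facet counts, the sketch below computes them by hand from a list of result courses. It assumes each facet value simply counts the result courses linked to it, which may not match how the endpoint actually computes facets; `expected_facet_counts` and the dict-based course shape are hypothetical.

```python
from collections import Counter


def expected_facet_counts(results, facet):
    """Count, for one facet, how many result courses are linked to each value."""
    counts = Counter()
    for course in results:
        # set() guards against a course listing the same value twice.
        counts.update(set(course.get(facet, ())))
    return dict(counts)
```

A test could then compare these hand-computed counts with the facet section of the search response for each of the cases listed above.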
Describe alternatives you've considered
N/A
Discovery, Documentation, Adoption, Migration Strategy
This should be a Python test file, to benefit from the elasticsearch-py client. We can take this opportunity to describe the specified behaviors in plain text in the docstrings.
It could also be immensely helpful for others who want to fine-tune our search functions to their own datasets and expected user experiences.
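A possible shape for such a test file is sketched below, with each docstring restating one checklist item in plain text. To keep the example runnable on its own, `search_course_ids` is a hypothetical stand-in that filters an in-memory list; a real test would index the fixtures into Elasticsearch and call the course search endpoint instead.

```python
import unittest

# Hypothetical fixture data; a real test would index these documents first.
COURSES = [
    {"id": "1", "title": "Intro to Law", "organizations": ["org-x"]},
    {"id": "2", "title": "Advanced Law", "organizations": ["org-y"]},
    {"id": "3", "title": "Basic Chemistry", "organizations": ["org-x", "org-z"]},
]


def search_course_ids(organizations=None):
    """Stand-in for a call to the course search endpoint (assumption, not the real API)."""
    wanted = set(organizations or [])
    return [c["id"] for c in COURSES if not wanted or wanted & set(c["organizations"])]


class OrganizationsFilterTestCase(unittest.TestCase):
    """Describe and check the behavior of the organizations filter."""

    def test_filter_by_one_organization(self):
        """Filtering by an organization surfaces only courses linked to this organization."""
        self.assertEqual(search_course_ids(["org-x"]), ["1", "3"])

    def test_filter_by_two_organizations(self):
        """Filtering by two organizations surfaces only courses linked to any of those."""
        self.assertEqual(search_course_ids(["org-y", "org-z"]), ["2", "3"])

    def test_no_filter(self):
        """Omitting all filters surfaces all courses."""
        self.assertEqual(search_course_ids(), ["1", "2", "3"])
```

Because the docstrings mirror the checklist wording, the test file can double as a readable specification for anyone adapting the search behavior to another dataset.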
Do you want to work on it through a Pull Request? 🤷♂️