Search: Fix word exclusion in client side search

Open cglukas opened this issue 5 months ago • 5 comments

Purpose

The built-in search is capable of excluding search terms. That's a great feature that would make the search a lot better! Unfortunately, two things currently block it:

  • splitQuery discards the hyphens that mark excluded terms
  • performTermsSearch aborts the search if any excluded term is matched
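
The two points above can be sketched in isolation. This is only a minimal illustration, not Sphinx's actual searchtools.js code: parseQuery, searchPages, and the shape of the pages object are hypothetical, and splitting on whitespace alone is an assumption. It shows a leading hyphen being kept as an exclusion marker, and an excluded term hiding individual pages instead of ending the whole search.

```javascript
// Hypothetical sketch; names and data shapes are assumptions, not Sphinx's API.

// Split a query on whitespace and treat a leading "-" as "exclude this term".
function parseQuery(query) {
  const included = new Set();
  const excluded = new Set();
  for (const token of query.trim().split(/\s+/)) {
    if (token === "" || token === "-") continue; // nothing to search for
    if (token.startsWith("-")) {
      excluded.add(token.slice(1).toLowerCase());
    } else {
      included.add(token.toLowerCase());
    }
  }
  return { included, excluded };
}

// `pages` maps a page name to the set of words it contains (assumed shape).
function searchPages(pages, query) {
  const { included, excluded } = parseQuery(query);
  const results = [];
  for (const [name, words] of Object.entries(pages)) {
    const hasAllIncluded = [...included].every((term) => words.has(term));
    const hasAnyExcluded = [...excluded].some((term) => words.has(term));
    // Skip only this page when an excluded term matches; keep scanning the rest.
    if (hasAllIncluded && !hasAnyExcluded) results.push(name);
  }
  return results;
}

const pages = {
  intro: new Set(["example", "usage"]),
  testing: new Set(["example", "test"]),
};
console.log(searchPages(pages, "example -test")); // [ 'intro' ]
```

In this sketch the page "testing" is dropped because it contains the excluded word, while "intro" is still returned.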

References

Closes #13892

cglukas avatar Sep 15 '25 09:09 cglukas

Regarding the CI: I don't see any changes in my PR that would trigger the current CI failure. Is this a common issue? TBH, it does not look like it's affecting the master branch 🤔. I'm a little unsure what to do here.

cglukas avatar Sep 15 '25 09:09 cglukas

Regarding the CI: I don't see any changes in my PR that would trigger the current CI failure. Is this a common issue? TBH, it does not look like it's affecting the master branch 🤔. I'm a little unsure what to do here.

That's OK, yep - I believe that is due to bug #13886 (in progress, potentially to be fixed by #13883).

jayaddison avatar Sep 16 '25 08:09 jayaddison

A delayed thought here: adding the exclusion operator to hyphenated query terms could cause unexpected results.

For example, the query example -test-case currently parses to ["example", "-test", "case"], I think.
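
To make the concern concrete, here is a small sketch of a tokenizer that would produce that result. It assumes, hypothetically, that the query is split on whitespace and on hyphens inside a word, while a leading hyphen is kept as the exclusion marker. splitWithExclusion is an illustrative name and not the real splitQuery.

```javascript
// Hypothetical tokenizer; the splitting rules are assumptions for illustration.
function splitWithExclusion(query) {
  const terms = [];
  for (const token of query.trim().split(/\s+/)) {
    const negated = token.startsWith("-");
    const body = negated ? token.slice(1) : token;
    // Inner hyphens act as separators; empty fragments are dropped.
    const parts = body.split("-").filter((part) => part !== "");
    parts.forEach((part, index) => {
      // Only the first fragment inherits the exclusion marker.
      terms.push(negated && index === 0 ? "-" + part : part);
    });
  }
  return terms;
}

console.log(splitWithExclusion("example -test-case"));
// [ 'example', '-test', 'case' ]
```

Under these assumptions the second fragment, case, silently becomes a required term rather than an excluded one.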

jayaddison avatar Oct 12 '25 14:10 jayaddison

A delayed thought here: adding the exclusion operator to hyphenated query terms could cause unexpected results.

For example, the query example -test-case currently parses to ["example", "-test", "case"], I think.

Hi @jayaddison, that's a valid concern. I can imagine two scenarios:

  1. The case word gets excluded as well.

    • ✅ Benefit: This would work without a major change in the data structures.
    • ❌ Downside: A search for pages containing both "example" and "case" would fail, because "case" is excluded even though we never wanted to exclude all "case" occurrences.
  2. The excluded word would be test-case.

    • ✅ Benefit: We can actually exclude "test-case" and the limitation from above is fixed.
    • ❌ Downside: I think we need one of two changes. Either we modify the searchindex generation, because hyphenated words are not stored, which would probably introduce a lot of unwanted changes. Or we add a mechanism to chain excluded words together: right now they are all chained with an OR condition, so to get test and case to work together we would need to add an AND condition and an additional attribute that defines this for each excluded word (could also be a mapping).
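
The difference between the two scenarios can be sketched as follows. This is a simplified illustration under stated assumptions: pages are modelled as plain sets of words, excluded terms are kept as hyphenated strings, and both function names are hypothetical. It does not reflect the actual searchindex layout.

```javascript
// Hypothetical sketch of the two scenarios; not Sphinx's real data structures.

// Scenario 1: every fragment is excluded on its own (OR across fragments).
function isExcludedAnyFragment(pageWords, excludedTerms) {
  return excludedTerms.some((term) =>
    term.split("-").some((fragment) => pageWords.has(fragment))
  );
}

// Scenario 2: fragments of one term are grouped
// (AND within a group, OR across groups).
function isExcludedAllFragments(pageWords, excludedTerms) {
  return excludedTerms.some((term) =>
    term.split("-").every((fragment) => pageWords.has(fragment))
  );
}

const casePage = new Set(["example", "case"]);
const testCasePage = new Set(["example", "test", "case"]);
const excludedTerms = ["test-case"];

console.log(isExcludedAnyFragment(casePage, excludedTerms)); // true: "case" alone hides the page
console.log(isExcludedAllFragments(casePage, excludedTerms)); // false: "test" is missing
console.log(isExcludedAllFragments(testCasePage, excludedTerms)); // true: both fragments are present
```

Note that the grouped variant only approximates excluding "test-case": it also hides pages where test and case appear separately, since word adjacency is not modelled here.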

Altogether, I think that this is a very valid concern. Still, I would not address it in this PR but rather open a separate one, to get the bugfix out quickly 😁

cglukas avatar Nov 02 '25 14:11 cglukas

Hi there,

I think the GitHub Actions runs are quite flaky. The "python3.11 docutils 0.20" run failed twice and the "windows" run failed once. AFAIK I never changed anything related to the failing checks, and I managed to resolve these failures just by committing an empty line to the conf.py of one of the new fixtures.

It's nothing that's related to this PR. It's probably also a known issue, am I right?

cglukas avatar Nov 09 '25 14:11 cglukas