Searching channels for literal strings like "C++20" produces non-useful results
This issue has been migrated from #12609.
I first raised this issue in https://github.com/vector-im/element-web/issues/22010. Apparently, because the room is not encrypted, the search is performed on the server side, so I was advised to reopen the issue here.
Description
Searching channels for literal strings like "C++20" produces non-useful results. This happens at least in the room #please:matrix.org.
Steps to reproduce
Search a channel with lots of history for strings like "C++20" or "C++20 modules". Try all of the following (by entering these in the Element search field):
- C++20
- "C++20"
- C\+\+20
Expected results
At least one of the searches finds messages containing one of the literal strings "c++20" or "C++20".
Actual results
I got a ton of unrelated hits; the logic seems to be something like:
- it matches all messages with words starting with c and that contain "20"
- within those messages, something (I don't know whether it is the server or Element) highlights every whole word that starts with c, just the letter c where it occurs in the middle of other words, and every instance of "20".
Here's an example:

Version information
I do not understand enough about Matrix to be confident in my answers; I merely use it through an Element web interface provided by an organization (https://chat.hacklab.fi). Its Settings->Help and About page says "Homeserver is https://matrix.hacklab.fi".
I managed to get a sensible-looking answer for the Synapse version by running this command:
$ curl https://matrix.hacklab.fi/_synapse/admin/v1/server_version
{"server_version":"1.57.0","python_version":"3.9.2"}
This is all I know about the environment, but if you need to know something more, I can ask around.
This is a limitation of how the tokenizer/indexer works; see https://github.com/bkil/wiki/blob/master/en/dev/matrix-full-text-search.md
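
To illustrate the kind of tokenizer limitation meant here, below is a minimal Python sketch. It is not Synapse's actual code: the names tokenize and matches are hypothetical, and the sketch assumes that the indexer keeps only runs of word characters and that each query term is matched as a prefix of an indexed word. Under those assumptions the "+" characters never reach the index, so all three queries from the report reduce to the same two terms.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens; punctuation such as '+' is dropped.

    Assumption: this approximates a word-based full-text tokenizer.
    """
    return re.findall(r"\w+", text.lower())


def matches(query: str, message: str) -> bool:
    """Return True if every query token is a prefix of some message token.

    Assumption: the search treats each query term as a prefix match.
    """
    message_tokens = tokenize(message)
    return all(
        any(token.startswith(term) for token in message_tokens)
        for term in tokenize(query)
    )


# The three queries from the report all collapse to the same tokens.
for query in ['C++20', '"C++20"', r'C\+\+20']:
    print(query, "->", tokenize(query))

# An unrelated message matches because "come" starts with "c"
# and "20" appears as its own token.
print(matches("C++20", "Come by at 20:00"))
```

In this model, quoting or escaping the query cannot help, because the distinction is already lost at indexing time; finding the literal string would need a substring or phrase search over the original message text instead.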