
Use the Extended DisMax query parser and support all queries.

Open kloor opened this issue 8 years ago • 0 comments

Currently, SolrSearch uses the standard query parser. This parser has strict syntax requirements, so it is very easy to create a query that it can't process. One example is a query containing a single, unmatched double quote, as noted in #137. SolrSearch has tried to work around some of these syntax issues by replacing colons with spaces and removing square brackets from query strings. Those workarounds cause problems of their own, such as preventing users from searching against a specific field.

Solr also provides a query parser known as Extended DisMax or eDisMax. This parser is much more tolerant of non-standard syntax, automatically escaping characters as necessary: https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

This pull request makes all queries use the eDisMax query parser by adding {!edismax} to the start of the query string on line 142. This seems like the simplest way to improve query parsing for plugin users, but it means that any future change of parser requires editing that line of code.
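To illustrate the idea only: the plugin itself is not written in Python, and the function and constant names below are hypothetical. A minimal sketch of selecting the parser through Solr's local-params prefix might look like this:

```python
# Hypothetical sketch; not the plugin's actual code.
EDISMAX_PREFIX = "{!edismax}"  # Solr local-params syntax selecting the eDisMax parser


def with_edismax(query: str) -> str:
    """Prepend the eDisMax prefix unless the query already starts with it."""
    if query.startswith(EDISMAX_PREFIX):
        return query
    return EDISMAX_PREFIX + query


# The raw query is passed through untouched; only the prefix is added.
print(with_edismax('title:"search string"'))
# prints: {!edismax}title:"search string"
```

Because the parser choice lives in a single constant here, swapping parsers is a one-line edit, which mirrors the trade-off described above.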

An alternative would be to change the solrconfig.xml file, as discussed in #139. This would require plugin users to alter or replace the file copied to their Solr installation and then reload the core in Solr. However, it would allow changes to the query parser without having to edit the plugin's code.

Beyond enabling the eDisMax query parser, this pull request also makes changes to how query strings are formed:

  1. No special characters are removed from the query string, leaving that issue to the eDisMax parser. This allows for field matching such as title:"search string".
  2. The user's query string is wrapped in parentheses, so the entire string will be ANDed to the facet and public search terms.
  3. Plus signs are added to the facet and public search terms so that they are required when searching with eDisMax. Without plus signs, the terms would only be boosted in the search results, but non-matches would also be included. The change was handled here for facets so existing bookmarks would not be impacted.
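
The three changes above can be sketched roughly as follows. This is an illustration in Python rather than the plugin's own code; the function name, its parameters, the example facet term, and the exact form of the public term (`public:"true"`) are assumptions made for the example.

```python
# Hypothetical sketch of the query assembly described in the list above.
def build_query(user_query, facet_terms=(), public_only=True):
    """Assemble an eDisMax query string from a user query and required terms."""
    # 1. No characters are stripped from the user's query.
    # 2. The user's query is wrapped in parentheses so it is treated as one unit.
    parts = ["(" + user_query + ")"]
    # 3. A leading plus sign marks each facet term as required.
    parts.extend("+" + term for term in facet_terms)
    if public_only:
        # Assumed field name and value for restricting results to public records.
        parts.append('+public:"true"')
    return "{!edismax}" + " ".join(parts)


print(build_query('title:"search string"', ['collection:"Maps"']))
# prints: {!edismax}(title:"search string") +collection:"Maps" +public:"true"
```

Without the plus signs, a document that misses a facet or public term could still be returned, only ranked lower, which is the behavior the third change is meant to prevent.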

kloor, Apr 03 '17 15:04