Ordering by post_title ASC returns unexpected result order

Open danielbachhuber opened this issue 8 years ago • 5 comments

WordPress has this internal logic we should be following too:

[image not preserved]

danielbachhuber, Feb 17 '17 01:02

The problem with the current behavior is that orderby => score is applied to every use of solr_integrate, including other WP_Query instances.

danielbachhuber, Feb 17 '17 01:02

The problem with the current behavior is that orderby => score is applied to every use of solr_integrate, including other WP_Query instances.

Actually, this is incorrect. Solr Power correctly handles customized orderby parameters.

Taking a step back, the symptom was observed while working on the Solr Demo Redux site. Specifically, the movies are sorted by title on the index views (home, meta queries, etc.). When switching between MySQL and Solr, the order of the movie list changes, which isn't expected.

While debugging this issue, I confirmed that post_title asc is being passed correctly to Solr:

[17-Feb-2017 18:25:26 UTC] array (
  'responseHeader' => 
  array (
    'status' => 0,
    'QTime' => 5,
    'params' => 
    array (
      'q.alt' => '((rating_str:"TV-14"))AND(post_type:movie)',
      'facet.field' => 
      array (
        0 => '{!key=genre_taxonomy}genre_taxonomy',
        1 => '{!key=language_taxonomy}language_taxonomy',
        2 => '{!key=country_taxonomy}country_taxonomy',
        3 => '{!key=sponsor_str}sponsor_str',
      ),
      'json.nl' => 'flat',
      'bf' => 'post_title^25 post_content^50',
      'hl' => 'true',
      'fl' => '*,score',
      'start' => '0',
      'q.op' => 'OR',
      'sort' => 'post_title asc',
      'rows' => '18',
      'hl.simple.pre' => '<b>',
      'q' => '',
      'defType' => 'dismax',
      'hl.simple.post' => '</b>',
      'omitHeader' => 'false',
      'facet.mincount' => '1',
      'hl.fl' => 'post_content',
      'wt' => 'json',
      'facet' => 'true',
      'hl.highlightMultiTerm' => 'true',
    ),
  ),

So for some reason, Solr doesn't actually sort correctly by post_title, even though it receives the right sort parameter.
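
To confirm that the results really do come back out of order, one rough approach is to compare the echoed sort parameter against the order of the returned documents. This is a minimal sketch, assuming the response has been decoded into a dict that also carries the standard Solr `response.docs` list. The function name `sort_diagnostics`, the sample titles, and the case-insensitive comparison are assumptions for illustration, not part of Solr Power. It only checks ascending order.

```python
def sort_diagnostics(result, field):
    """Report the sort Solr was asked for and whether the docs honor it."""
    params = result["responseHeader"]["params"]
    docs = result.get("response", {}).get("docs", [])
    # Assumption: compare case-insensitively; the real collation may differ.
    values = [str(doc.get(field, "")).casefold() for doc in docs]
    # Ascending order means every value is <= the one that follows it.
    in_order = all(a <= b for a, b in zip(values, values[1:]))
    return {"requested_sort": params.get("sort"), "in_order": in_order}


# Hypothetical decoded response: only the 'sort' value comes from the log above.
sample = {
    "responseHeader": {"params": {"sort": "post_title asc"}},
    "response": {"docs": [
        {"post_title": "Movie B"},
        {"post_title": "Movie A"},
        {"post_title": "Movie C"},
    ]},
}

report = sort_diagnostics(sample, "post_title")
print(report)
```

If `requested_sort` is what was expected but `in_order` is false, the problem is on the Solr side rather than in how the query was built.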

I was able to work around this issue by ordering by post_name: https://github.com/danielbachhuber/solr-demo-redux/commit/3bc1cc54d41461e4733d4c71fff07d74c4173e50

danielbachhuber, Feb 17 '17 18:02

According to the docs (via this SO thread):

Sorting can be done on the "score" of the document, or on any multiValued="false" indexed="true" field provided that field is either non-tokenized (ie: has no Analyzer) or uses an Analyzer that only produces a single Term (ie: uses the KeywordTokenizer)

post_title is a text_* field, so it is tokenized:

<field name="post_title" type="text_lws" indexed="true" stored="true"/>
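
The rule quoted above can be turned into a quick schema check. This is a minimal sketch under stated assumptions: only the post_title field line comes from this thread, while the `text_lws` and `string` fieldType definitions, the `post_title_sort` field, and the helper name `is_sortable` are hypothetical. It also ignores other ways a field can become sortable (such as docValues) and assumes Solr's usual defaults of indexed="true" and multiValued="false".

```python
import xml.etree.ElementTree as ET

# Hypothetical schema excerpt. Only the post_title <field> line is from the
# thread; the fieldType definitions and post_title_sort are assumptions.
SCHEMA = """
<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text_lws" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="post_title" type="text_lws" indexed="true" stored="true"/>
    <field name="post_title_sort" type="string" indexed="true" stored="false"/>
  </fields>
</schema>
"""


def is_sortable(schema_xml, field_name):
    """Apply the quoted rule: indexed, single-valued, at most one term."""
    root = ET.fromstring(schema_xml)
    field = next((f for f in root.iter("field")
                  if f.get("name") == field_name), None)
    if field is None:
        raise KeyError(f"unknown field: {field_name}")
    field_type = next((t for t in root.iter("fieldType")
                       if t.get("name") == field.get("type")), None)
    if field_type is None:
        raise KeyError(f"unknown fieldType: {field.get('type')}")

    def flag(name, default):
        # A field-level attribute overrides the fieldType, then the default.
        return field.get(name, field_type.get(name, default)) == "true"

    if not flag("indexed", "true") or flag("multiValued", "false"):
        return False
    # No tokenizer at all, or only KeywordTokenizer, yields a single term.
    tokenizers = [t.get("class", "") for t in field_type.iter("tokenizer")]
    return all(c.endswith("KeywordTokenizerFactory") for c in tokenizers)


print(is_sortable(SCHEMA, "post_title"))
print(is_sortable(SCHEMA, "post_title_sort"))
```

Under these assumptions the tokenized post_title fails the check, while a non-tokenized string field holding the same value passes.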

danielbachhuber, Feb 17 '17 18:02

It's quite possible that our default schema.xml could be improved. I can't think of a good reason why we have both post_name and post_title in there...

joshkoenig, Feb 17 '17 20:02

I can't think of a good reason why we have both post_name and post_title in there...

They're different fields?

danielbachhuber, Feb 17 '17 20:02