solr-power
Ordering by post_title ASC returns unexpected result order
WordPress has this internal logic we should be following too:
The problem with the current behavior is that orderby=>score is applied to all uses of solr_integrate with other WP_Query instances too.
> The problem with the current behavior is that orderby=>score is applied to all uses of solr_integrate with other WP_Query instances too.
Actually, this is incorrect. Solr Power correctly handles customized orderby parameters.
Taking a step back, the symptom was observed while working on the Solr Demo Redux site. Specifically, the movies are sorted by title on the index views (home, meta queries, etc.). When switching between MySQL and Solr, the order of the movie list would change, which isn't expected.
In debugging this issue, I confirmed that post_title asc is being passed correctly to Solr:
[17-Feb-2017 18:25:26 UTC] array (
  'responseHeader' =>
  array (
    'status' => 0,
    'QTime' => 5,
    'params' =>
    array (
      'q.alt' => '((rating_str:"TV-14"))AND(post_type:movie)',
      'facet.field' =>
      array (
        0 => '{!key=genre_taxonomy}genre_taxonomy',
        1 => '{!key=language_taxonomy}language_taxonomy',
        2 => '{!key=country_taxonomy}country_taxonomy',
        3 => '{!key=sponsor_str}sponsor_str',
      ),
      'json.nl' => 'flat',
      'bf' => 'post_title^25 post_content^50',
      'hl' => 'true',
      'fl' => '*,score',
      'start' => '0',
      'q.op' => 'OR',
      'sort' => 'post_title asc',
      'rows' => '18',
      'hl.simple.pre' => '<b>',
      'q' => '',
      'defType' => 'dismax',
      'hl.simple.post' => '</b>',
      'omitHeader' => 'false',
      'facet.mincount' => '1',
      'hl.fl' => 'post_content',
      'wt' => 'json',
      'facet' => 'true',
      'hl.highlightMultiTerm' => 'true',
    ),
  ),
So, for some reason, Solr doesn't actually sort correctly by post_title.
I was able to work around this issue by ordering by post_name instead: https://github.com/danielbachhuber/solr-demo-redux/commit/3bc1cc54d41461e4733d4c71fff07d74c4173e50
According to the docs (via this SO thread):
Sorting can be done on the "score" of the document, or on any multiValued="false" indexed="true" field provided that field is either non-tokenized (ie: has no Analyzer) or uses an Analyzer that only produces a single Term (ie: uses the KeywordTokenizer)
post_title is a text_* field and tokenized:
<field name="post_title" type="text_lws" indexed="true" stored="true"/>
It's quite possible that our default schema.xml could be improved. I can't think of a good reason why we have both post_name and post_title in there...
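One possible schema improvement, sketched here as an assumption rather than a tested change: keep post_title tokenized for searching and copy it into a separate non-tokenized field that is used only for sorting. The field name post_title_sort is hypothetical, and the sketch assumes the schema defines the standard string field type (solr.StrField).

```xml
<!-- Existing tokenized field, kept as-is for full-text search -->
<field name="post_title" type="text_lws" indexed="true" stored="true"/>

<!-- Hypothetical sort-only copy: single-valued, indexed, not tokenized -->
<field name="post_title_sort" type="string" indexed="true" stored="false" multiValued="false"/>

<!-- Populate the sort field from post_title at index time -->
<copyField source="post_title" dest="post_title_sort"/>
```

With something like this in place, the query would sort on post_title_sort asc rather than post_title asc. Existing documents would need to be reindexed for the new field to be populated, and because a string field sorts case-sensitively, a field type built on KeywordTokenizer with a lowercase filter may be a better fit.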
> I can't think of a good reason why we have both post_name and post_title in there...
They're different fields?