OrderBy string is case sensitive - Needs unit tests

Open mahgo opened this issue 8 years ago • 10 comments

We need unit tests written and working for both case-sensitive and case-insensitive searching.

Original description:

Is there a way to order by a string case insensitively?

mahgo avatar Jan 02 '17 08:01 mahgo

I think this would probably be based on the analyzer used. What are you using?

Shazwazza avatar Jan 02 '17 23:01 Shazwazza

I'm using a custom analyzer - I needed to do this to be able to search for words with special characters in them:

public class CustomAnalyzer : StandardAnalyzer
{
    private Util.Version matchVersion;

    public CustomAnalyzer() : base(Util.Version.LUCENE_29)
    {
        matchVersion = Util.Version.LUCENE_29;
    }

    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        // Tokenize, lowercase, drop English stop words, then fold accented characters to ASCII
        TokenStream result = new StandardTokenizer(this.matchVersion, reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(true, result, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        result = new ASCIIFoldingFilter(result);
        return result;
    }
}

mahgo avatar Jan 04 '17 07:01 mahgo

Woops, I didn't mean to close this one.

mahgo avatar Jan 17 '17 12:01 mahgo

I noticed that if you save the _Sort field with ToLower() on line 1278 of LuceneIndexer, the ordering becomes case insensitive:

default:
    field = new Field(x.Key,
                      x.Value,
                      Field.Store.YES,
                      lucenePolicy,
                      Equals(lucenePolicy, Field.Index.NO) ? Field.TermVector.NO : Field.TermVector.YES);
    sortedField = new Field(SortedFieldNamePrefix + x.Key,
                            x.Value.ToLower(),
                            Field.Store.NO, // we don't want to store the field because we're only using it to sort, not return data
                            Field.Index.NOT_ANALYZED,
                            Field.TermVector.NO);
    break;
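
To show the effect in isolation, here is a minimal standalone sketch. It is not Examine or Lucene code: `SortKeyDemo` and its two methods are made-up names, and it assumes the sort field is compared by raw character value, which would explain why uppercase values come first.

```csharp
using System;
using System.Linq;

public static class SortKeyDemo
{
    // Hypothetical helper: orders by raw character value, so 'B' (66) sorts before 'a' (97).
    public static string[] OrderCaseSensitive(string[] values) =>
        values.OrderBy(v => v, StringComparer.Ordinal).ToArray();

    // Hypothetical helper: orders by a lowercased key while returning the original values,
    // which mirrors writing a lowercased copy into a sort-only field.
    public static string[] OrderCaseInsensitive(string[] values) =>
        values.OrderBy(v => v.ToLowerInvariant(), StringComparer.Ordinal).ToArray();

    public static void Main()
    {
        var names = new[] { "apple", "Banana", "cherry" };
        Console.WriteLine(string.Join(", ", OrderCaseSensitive(names)));   // Banana, apple, cherry
        Console.WriteLine(string.Join(", ", OrderCaseInsensitive(names))); // apple, Banana, cherry
    }
}
```

The sketch uses ToLowerInvariant() rather than ToLower() so the sort key does not depend on the current culture.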

Is this the best way to do this? Maybe we could do something like this?

<add Name="nodeName" EnableSorting="true" Type="STRING" CaseInsensitive="true" />

mahgo avatar Jan 26 '17 06:01 mahgo

Where are you seeing this code? This is the code for LuceneIndexer and it doesn't do this: https://github.com/Shazwazza/Examine/blob/master/src/Examine/LuceneEngine/Providers/LuceneIndexer.cs#L1283

Even in the history of this class I cannot see where that ToLower() would be.

Shazwazza avatar Feb 02 '17 03:02 Shazwazza

Sorry, I don't think I was clear - I added ToLower() myself as a temporary solution.

mahgo avatar Feb 02 '17 03:02 mahgo

I still feel like this could be achieved with a custom analyzer/filter for the fields you want to be case insensitive for sorting.

Shazwazza avatar Feb 09 '17 03:02 Shazwazza

Do you have an example of this? Everything I've tried has not worked.

mahgo avatar Feb 12 '17 09:02 mahgo

@mahgo if this topic still interests you, I strongly recommend reading Ismail's last response in this thread: https://our.umbraco.com/forum/extending-umbraco-and-using-the-api/88592-examine-ordering-results In my case, adding a field _Sort_myfield dedicated to sorting, and sorting on myfield in the search criteria, solves the problem.

PiotrKlys avatar Jan 16 '20 15:01 PiotrKlys

I know this is super old, but yes, there should be a unit test created for case-insensitive and case-sensitive searching in the code. There are probably a few ways to achieve that. I will update the task.

Shazwazza avatar Jan 22 '20 00:01 Shazwazza