realm-java
Diacritic-insensitive search
Hi,
I saw that Realm cocoa v2.5.0 added support for diacritic-insensitive search.
Are you planning to make this available in a future release of Realm Android?
I know that a normalised column could be added to implement this feature (see this Realm Android issue and stackoverflow issue) but would this be the most efficient way of doing it?
Thanks
We'll talk this over and see what priority we can give it. Thank you for showing your interest in having it added to the Java binding.
Yes, this is something we want to add, but we don't have a timeline yet.
The API is probably going to be slightly challenging.
Right now we have
realm.where(Person.class).equalTo(String field, String value, Case casing);
That leaves us with a few options
- Add Diacritic enum:
realm.where(Person.class).equalTo("name", "John");
realm.where(Person.class).equalTo("name", "John", Case.INSENSITIVE, Diacritic.SENSITIVE);
// Should we also add only a diacritic option ? The combinatorial explosion makes me think no.
realm.where(Person.class).equalTo("name", "John", Diacritic.SENSITIVE);
- (+) NSPredicate only seems to support Case/Diacritic as well.
- (+) Works without breaking existing API.
- (-) Doesn't scale very well if we add more options. We need to consider if full-text search might do that.
- (-) Diacritic searches must also specify the Case setting
- Replace enum with bit flags:
public class QueryOption {
    public final static int CASE_SENSITIVE = 0x01;
    public final static int CASE_INSENSITIVE = 0x02;
    public final static int DIACRITIC_SENSITIVE = 0x04;
    public final static int DIACRITIC_INSENSITIVE = 0x08;
}
realm.where(Person.class).equalTo("name", "John");
realm.where(Person.class).equalTo("name", "John", QueryOption.CASE_INSENSITIVE | QueryOption.DIACRITIC_INSENSITIVE);
realm.where(Person.class).equalTo("name", "John", QueryOption.DIACRITIC_INSENSITIVE);
- (+) Much more flexible, also in terms of adding new features
- (-) We replaced the original case boolean with an enum because of readability and auto-complete issues. Using bit flags will put us back into that position.
- (-) Breaking change, so must either wait for 4.0 or we need to duplicate a lot of methods.
- Other options
Not sure what those could be?
I probably lean towards 1) with the acceptance that for diacritic insensitive searches you also need to supply the case parameter. Thoughts @realm/java ?
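To make option 1 concrete, here is a minimal, self-contained sketch of how the overloads could delegate to one another. It does not use Realm at all: the class name DiacriticOptionSketch, the Diacritic enum, and the in-memory comparison are all assumptions made for illustration, and the defaults (case-sensitive, diacritic-sensitive) are assumed as well.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-in for the query API; not Realm code.
public class DiacriticOptionSketch {
    public enum Case { SENSITIVE, INSENSITIVE }
    public enum Diacritic { SENSITIVE, INSENSITIVE } // assumed new enum

    // Existing two-argument form: both options default to SENSITIVE (assumption).
    public static boolean equalTo(String actual, String expected) {
        return equalTo(actual, expected, Case.SENSITIVE, Diacritic.SENSITIVE);
    }

    // Existing three-argument form keeps working unchanged.
    public static boolean equalTo(String actual, String expected, Case casing) {
        return equalTo(actual, expected, casing, Diacritic.SENSITIVE);
    }

    // New form: a diacritic setting can only be given together with a Case.
    public static boolean equalTo(String actual, String expected, Case casing, Diacritic diacritic) {
        return fold(actual, casing, diacritic).equals(fold(expected, casing, diacritic));
    }

    // Folds a string according to the requested options before comparing.
    private static String fold(String s, Case casing, Diacritic diacritic) {
        String out = s;
        if (diacritic == Diacritic.INSENSITIVE) {
            // Decompose accented letters, then drop the combining marks.
            out = Normalizer.normalize(out, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        }
        if (casing == Case.INSENSITIVE) {
            out = out.toLowerCase(Locale.ROOT);
        }
        return out;
    }

    public static void main(String[] args) {
        // "Jos\u00e9" is "José" written with a Unicode escape.
        System.out.println(equalTo("Jos\u00e9", "Jose"));
        System.out.println(equalTo("Jos\u00e9", "jose", Case.INSENSITIVE, Diacritic.INSENSITIVE));
    }
}
```

The two older signatures stay source-compatible, which is the main (+) listed for this option; the cost is the extra Case argument on every diacritic-insensitive call.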
What about this:
public class SortParam {
    Case caseParam;
    public SortParam(Case caseParam) {
        this.caseParam = caseParam;
    }
}
public enum Case {
    CASE_SENSITIVE(true),
    CASE_INSENSITIVE(false);
    private final boolean value;
    Case(boolean value) {
        this.value = value;
    }
    // Old call sites such as Case.INSENSITIVE now resolve to a SortParam
    public static final SortParam SENSITIVE = new SortParam(CASE_SENSITIVE);
    public static final SortParam INSENSITIVE = new SortParam(CASE_INSENSITIVE);
}
RealmQuery equalTo(String, String, SortParam);
// It will still compile with old code although the API signature changed
realm.where(Person.class).equalTo("name", "John", Case.INSENSITIVE);
// It also has some flexibility to support more sort params:
realm.where(Person.class).equalTo("name", "John", new SortParam(Case.CASE_INSENSITIVE, Diacritic.DIACRITIC_SENSITIVE));
I guess it could work, and even though it is API breaking, most will probably not have to make any changes. I would be a bit concerned about how complicated it becomes to combine options (even more than both options I outlined).
@cmelchior how about:
public class StringFilter {
    // "case" is a reserved word in Java, so the field is named "casing"
    final Case casing;
    final Diacritic diacritic;
    public static class Builder {
        private Case casing = Case.SENSITIVE;
        private Diacritic diacritic = Diacritic.INSENSITIVE;
        public Builder withCase(Case casing) {
            checkNotNull(casing);
            this.casing = casing;
            return this;
        }
        public Builder withDiacritic(Diacritic diacritic) {
            checkNotNull(diacritic);
            this.diacritic = diacritic;
            return this;
        }
        public StringFilter build() {
            return new StringFilter(casing, diacritic);
        }
    }
    private StringFilter(Case casing, Diacritic diacritic) {
        this.casing = casing;
        this.diacritic = diacritic;
    }
}
Although it's still a breaking change, adding Diacritic to Case wouldn't really make sense
@Zhuinden Have you tried using that?
realm.where(Person.class).equalTo("name", "John", new StringFilter.Builder().withCase(Case.SENSITIVE).withDiacritic(Diacritic.INSENSITIVE).build());
Looks extremely long and complicated to me :)
.....ah, then I think you should probably just add the Diacritic enum for now
(i didn't think of the outrageous length XD)
actually, that makes the bit-wise operators much easier to reason about.
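For comparison, a minimal sketch of how the bit-flag variant could be decoded and validated. The flag values follow the QueryOption proposal earlier in the thread; the class name QueryOptionSketch, the helper methods, and the rule that an unspecified group defaults to "sensitive" are assumptions for illustration, not anything Realm ships.

```java
// Hypothetical helper around the proposed QueryOption flags; not Realm code.
public final class QueryOptionSketch {
    public static final int CASE_SENSITIVE = 0x01;
    public static final int CASE_INSENSITIVE = 0x02;
    public static final int DIACRITIC_SENSITIVE = 0x04;
    public static final int DIACRITIC_INSENSITIVE = 0x08;

    private QueryOptionSketch() {}

    // True if the given flag bit is set in the options mask.
    public static boolean has(int options, int flag) {
        return (options & flag) != 0;
    }

    // Rejects contradictory flags and fills in the assumed defaults
    // (sensitive) for any group the caller left unspecified.
    public static int validate(int options) {
        if (has(options, CASE_SENSITIVE) && has(options, CASE_INSENSITIVE)) {
            throw new IllegalArgumentException("Both case flags set");
        }
        if (has(options, DIACRITIC_SENSITIVE) && has(options, DIACRITIC_INSENSITIVE)) {
            throw new IllegalArgumentException("Both diacritic flags set");
        }
        if (!has(options, CASE_INSENSITIVE)) {
            options |= CASE_SENSITIVE;
        }
        if (!has(options, DIACRITIC_INSENSITIVE)) {
            options |= DIACRITIC_SENSITIVE;
        }
        return options;
    }

    public static void main(String[] args) {
        int options = validate(DIACRITIC_INSENSITIVE);
        System.out.println(has(options, CASE_SENSITIVE));        // case stays sensitive
        System.out.println(has(options, DIACRITIC_INSENSITIVE)); // diacritics ignored
    }
}
```

A plain int accepts combinations such as CASE_SENSITIVE | CASE_INSENSITIVE at compile time, so a runtime check like validate is needed; that is part of the readability concern raised against this option.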
This is blocked by: https://github.com/realm/realm-core/issues/1082
Hey guys! Any update on this? Do you have a timeline for this feature ? Thanks!
@RobinCaroff Sorry. There is no timeframe yet. The Unicode libraries on our supported platforms differ slightly, so supporting this in core will need some discussion and time. https://github.com/realm/realm-core/issues/1082
Hello, after almost one year, any update on this?
Almost a year and a half now. Has this been considered for a future roadmap?
Eh. I personally do the same approach as with SQLite: create a new field that stores the normalized (accent-less) value, and query against that.
(i am not a realm member)
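A minimal sketch of that workaround, using only the JDK's java.text.Normalizer. The class name SearchKeys and the field name nameNormalized mentioned below are made up for illustration; lowercasing is included on the assumption that the search should ignore case as well.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for building an accent-less, lowercase search key.
public final class SearchKeys {
    private SearchKeys() {}

    public static String normalize(String input) {
        // NFD splits accented letters into base letter + combining mark.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks (Unicode category M).
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Lowercase with a fixed locale so the result is device-independent.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Cr\u00e8me Br\u00fbl\u00e9e" is "Crème Brûlée" written with Unicode escapes.
        System.out.println(normalize("Cr\u00e8me Br\u00fbl\u00e9e")); // prints "creme brulee"
    }
}
```

The idea would be to store the result in an extra String field (say nameNormalized) whenever name is written, then run the user's input through the same function and query with equalTo("nameNormalized", key). Letters with no canonical decomposition, such as "ø", pass through unchanged, so this covers accents but not every kind of character folding.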
Any news about this one?
Not to forget... Any news on this 5-year-old request?