xbmc
xbmc copied to clipboard
Use ICU to Compare Strings on Android
Description
Use ICU library to compare strings on Android, thus locale-aware sorting was supported.
Motivation and Context
Locale-aware sorting was not supported on Android (#15125).
How Has This Been Tested?
Tested on my phone (Mi 9 Pro with Android 9), with locales set to English (US), Chinese (Simplified), and Chinese (Traditional). See screenshots for results.
Screenshots (if appropriate):
English (US) locale
Accented letter "Ö" ordered before "O":
Chinese characters ordered by Unicode value:
Chinese (Simplified) locale
Accented letter "Ö" ordered before "O":
Chinese characters ordered by pinyin (phonetic transcriptions of Chinese characters):
Numeric ordering works:
Chinese (Traditional) locale
Chinese characters ordered by number of strokes:
Types of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] Clean up (non-breaking change which removes non-working, unmaintained functionality)
- [x] Improvement (non-breaking change which improves existing functionality)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that will cause existing functionality to change)
- [ ] Cosmetic change (non-breaking change that doesn't touch code)
- [ ] None of the above (please explain below)
Checklist:
- [x] My code follows the Code Guidelines of this project
- [x] My change requires a change to the documentation, either Doxygen or wiki
- [x] I have updated the documentation accordingly
- [x] I have read the Contributing document
- [ ] I have added tests to cover my change
- [ ] All new and existing tests passed
CI fails saying ERROR: $GIT_COMMIT (6fa2c47919cca4c27bc0778178b2f62c0dde67d3) doesn't match currently checked out revision (c41d2f0ce0f329e5b59554c8f5c0667a8e73be7f) Build step 'Execute shell' marked build as failure
. Why does this happen?
If you do a force push while Jenkins is running, it'll usually stuff it up for that PR. So any subsequent pushes will generally not run. Best to try not force push if your PR Jenkins run hasn't completed fully. Annoying, but that's what we have to deal with at times unfortunately.
Jenkins build this please
The latest change should have solved the build error, but Jenkins is not working again.
I've resumed the build on jenkins and it's green now. Just a note: before merging, we can to upload libicundk to our mirrors.
I am looking forward to this feature. Are there any further progress?
@peak3d Did you have time to look at this yet? It's a very long-standing problem unfortunately.
@taoyouh thanks you for this contribution. I can not advise you on the Android specific aspects of this PR, but I have been working to make it possible to use consistent collated sorting both in memory (itemlists) and in the databases directly. Some of what you have done needs to be modified with that in mind. I will try to explain as clearly as I can, but it may not be easy. See description in https://github.com/xbmc/xbmc/pull/17838 which should help.
Historically all sorting was done in memory, which is inefficient in some scenarios e.g. fetching entire tables from the database when only the top 10 records in a certain order are wanted. The long term aim is to do as much record sorting as possible in SQL on the db side, but we are only part the way there. Initially the approach is being prototyped for the music database, with video to follow.
Sorting in memory uses StringUtils::AlphaNumericCompare
to provide case insensitive, natural number order, (sometimes) language collated sorting of text. To get the same sorting results in SQLite a callback collation function is used, implemented in AlphaNumericCollation
https://github.com/xbmc/xbmc/blob/ebaca1f1a190e05aaab821481a2233f2c6f4d95c/xbmc/utils/StringUtils.cpp#L1194
There was no efficient way to combine these into one (I can explain in detail later if you want), hence changes to collation implementation will need to apply to both functions.
Then there are those on client/server using MySQL/MariaDB for databases. This does not support collation callbacks, but a comparable natural number sorting etc. can be implemented via database functions. Hence for consistency between memory and db sorting, AlphaNumericCompare
is adjusted to match MySQL collation when using that RDBMS. The same approach is also applied for those platforms that do not support the C++ locale facet for collation e.g Android and RPi. This is the purpose of g_langInfo.UseLocaleCollation()
, negation meaning use MySQL comparible collation because either a) using MySQL, b) locale facet is unsupported or c) user choice.
You want to ICU compare to be used on Android instead of C++ locale facet, but not as you have implemented it replacing MySQL style collation. Hence UseLocaleCollation()
will need to be set differently (maybe even renamed?) to include ICU on Android as positive "locale supported", and the Android specific code IFDEF be replacing what happens when UseLocaleCollation()
is true.
Phew! I'm sure that all sounds worse than it really is. Ask if you need me to clarify.
@DaveTBlake Finally I have time to understand this. Do you mean that I need to
a) rewrite the SQLite collation callback to use ICU too,
b) and make UseLocaleCollation()
be true when Android ICU is used for collation?
@taoyouh sorry for delay, I lost sight of this.
a) rewrite the SQLite collation callback to use ICU too,
Yes, StringUtils::AlphaNumericCollation
must result in the same order as StringUtils::AlphaNumericCompare
. Take care to do so without slowing the collation callback function.
b) and make
UseLocaleCollation()
be true when Android ICU is used for collation?
When UseLocaleCollation()
is true then use locale facet or ICU for collation on Android depending on platform.
Ensure UseLocaleCollation()
is set false (so MySQL comparible collation is perfromed) when any one of a) using a MySQL client/server setup, b) neither locale facet nor Android ICU is supported or c) user choice. This means a change to CLangInfo::UseLocaleCollation()
to check that locale collation facet or ICU is implemented on the platform. It could be sensible to rename this method.
This feature is very welcome @taoyouh but bumping it to v20 as v19 entering Beta phase with focus on testing and bug fixes, and this improvement needs more extensive work.
I guess this is the only sane option to fix #19883 properly on all platforms, not just Android.
@taoyouh sorry for delay, I lost sight of this.
a) rewrite the SQLite collation callback to use ICU too,
Yes,
StringUtils::AlphaNumericCollation
must result in the same order asStringUtils::AlphaNumericCompare
. Take care to do so without slowing the collation callback function.b) and make
UseLocaleCollation()
be true when Android ICU is used for collation?When
UseLocaleCollation()
is true then use locale facet or ICU for collation on Android depending on platform.Ensure
UseLocaleCollation()
is set false (so MySQL comparible collation is perfromed) when any one of a) using a MySQL client/server setup, b) neither locale facet nor Android ICU is supported or c) user choice. This means a change toCLangInfo::UseLocaleCollation()
to check that locale collation facet or ICU is implemented on the platform. It could be sensible to rename this method.
I think I finally finished these works after a long time. I wish my change still works with the latest master codes.
@DaveTBlake Hi, do you have some time to check this out? I just rebased it onto the latest master branch.
I'm looking forward to this feature
@DaveTBlake I implemented the AlphaNumericCollation
method but didn't find a way to test it. Could you please point out some situation under which in-database sorting will be used?
I've made some formatting changes to meet the current code style. The diffs are available in the following links:
- PR17833-0001-3929ffd861-Use-ICU-to-Compare-Strings-on-Andr.diff
- PR17833-0010-399cc4596c-Make-LangInfo-UseLocaleCollation-t.diff
For more information please see our current code style guidelines.
@DaveTBlake I implemented the
AlphaNumericCollation
method but didn't find a way to test it. Could you please point out some situation under which in-database sorting will be used?
Sorting in database is invoked by JSON API requests to fetch sorted data from the music library
AlphaNumericCollation
is used as a callback function for sorting in SQLite, so care has to be taken over designing function for speed as well as that the end results match that which in memory sorting for GUI display produces.
MySQL/MariaDB databases also need to be tested and again ensure that the results produced from in memory sorting for GUI display match.
@taoyouh this needs a rebase