wordcamp.org
Ignore locale when sorting for shortcode "camptix_attendees"
Describe the bug
Displaying attendees with the camptix_attendees shortcode ignores locale-aware sorting.
To reproduce
Steps to reproduce the behavior:
- Go to https://krakow.wordcamp.org/2024/czym-w-ogole-jest-wordcamp/uczestnicy/
- Scroll down to the last attendees.
- See error: at the end you can see "Łukasz Jasiński" and "Łukasz Mastalski". They should appear after the letter "L" and before the letter "M".
Expected behavior
Proper sorting by name according to the site locale.
I think this is actually due to the data storage type, as we don't do anything special for locale-aware sorting; we just sort on post_title, which in this case is the attendee name (https://github.com/WordPress/wordcamp.org/blob/f79c65ea40e3ba7ae54c07bdd932d9ff72796a15/public_html/wp-content/plugins/camptix/addons/shortcodes.php#L207-L219).
Unsure how we've done this before, but I think we'd have to change that in the database (which might require some assistance to do so) and there is a chance of data loss.
@dd32 have you known us to do this before?
Actually, if we sort by the Polish collation (utf8mb4_polish_ci), the results look correct, but I'm unsure whether we can push this through WP_Query.
Ah, this isn't possible through WP Query, due to the sanitization of the orderby parameter: https://github.com/WordPress/wordpress-develop/blob/6.5/src/wp-includes/class-wp-query.php#L1663-L1685
I think this is actually due to the data storage type
Yup, because it's stored as utf8mb4_unicode_ci, it's going to be sorted by that collation, which apparently uses an older set of weight keys.
I think we'd have to change that in the database
My initial response is "noooo". However, it turns out that core does use the utf8mb4_unicode_520_ci collation when possible:
https://github.com/WordPress/wordpress-develop/blob/473e2554db8e547a07a16f73080ca49c0c30b89f/src/wp-includes/class-wpdb.php#L894-L897
(utf8mb4_unicode_ci sorts using Unicode Collation Algorithm 4.0.0 weight keys; utf8mb4_unicode_520_ci sorts using Unicode Collation Algorithm 5.2.0 weight keys.)
AFAIK we don't use that, because of a combination of HyperDB and how utf8mb4 is forced on.
Upon looking into it, $wpdb->has_cap( 'utf8mb4_520' ) returns false on WordCamp. HyperDB appears to support it, but it turns out not to, because $wpdb->db_version() returns 5.5.5, as https://github.com/WordPress/wordpress-develop/commit/fed98bd9ef9a232d102c41e74944d3c21cd6183e is not applied to HyperDB.
The above won't "fix" this though; AFAIK it'll just let new tables be created using 520. We could adjust existing tables, though.
isn't possible through WP Query
This would be possible via the posts_request SQL filter. For example, the following query sorts as you'd expect:
SELECT post_title FROM wc_posts WHERE post_type = 'tix_attendee' ORDER BY post_title COLLATE 'utf8mb4_unicode_520_ci' ASC
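As a rough illustration of the posts_request idea, here is a minimal sketch, not the actual CampTix change. The helper name camptix_attendees_collate_orderby() is hypothetical, and the regular expression assumes the generated SQL orders by post_title, either bare or prefixed with a table name such as wc_posts. The collation name is left unquoted, which MySQL also accepts.

```php
<?php
/**
 * Hypothetical helper (not part of CampTix): add a COLLATE clause to the
 * first "ORDER BY ... post_title" found in a SQL string.
 *
 * Assumption: the query orders by post_title, optionally table-qualified.
 */
function camptix_attendees_collate_orderby( $request ) {
	return preg_replace(
		// Skip the rewrite if a COLLATE clause is already present.
		'/(ORDER BY\s+(?:\w+\.)?post_title)(?!\s+COLLATE)\b/i',
		'$1 COLLATE utf8mb4_unicode_520_ci',
		$request,
		1
	);
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	add_filter(
		'posts_request',
		function ( $request, $query ) {
			// Assumption: only attendee queries should be touched.
			if ( 'tix_attendee' !== $query->get( 'post_type' ) ) {
				return $request;
			}
			return camptix_attendees_collate_orderby( $request );
		},
		10,
		2
	);
}
```

A narrower hook such as posts_orderby might avoid rewriting the full SQL string, but the sketch sticks to the filter named in this thread.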
Thanks @dd32 - I think I prefer the non-update option via the posts_request filter.
We'd need to adjust the query here: https://github.com/WordPress/wordcamp.org/blob/f79c65ea40e3ba7ae54c07bdd932d9ff72796a15/public_html/wp-content/plugins/camptix/addons/shortcodes.php#L275-L292 to set suppress_filters to false, since the filter wouldn't run otherwise, but apart from that this could work.
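A minimal sketch of that adjustment, assuming the attendee query goes through get_posts(). The argument values below are illustrative and not copied from shortcodes.php, and the helper name camptix_attendees_enable_filters() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: make sure query filters such as posts_request
 * run for a get_posts() call by turning off suppress_filters.
 */
function camptix_attendees_enable_filters( array $args ) {
	$args['suppress_filters'] = false;
	return $args;
}

// Illustrative arguments only; the real shortcode builds its own.
$args = camptix_attendees_enable_filters(
	array(
		'post_type'      => 'tix_attendee',
		'orderby'        => 'title',
		'order'          => 'ASC',
		'posts_per_page' => 50,
	)
);

// Only run the query when WordPress is loaded.
if ( function_exists( 'get_posts' ) ) {
	$attendees = get_posts( $args );
}
```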
to include a suppress_filters to false
I forgot get_posts() sets that by default. As cray-cray as it sounds, if that introduces problems, I would then say use the query filter.