natsort

6 locale tests fail to properly sort ä

Open nieder opened this issue 1 year ago • 6 comments

Describe the bug 6 tests fail; all of them seem to misplace ä (lowercase 'a' with umlaut). The basic assertion looks something like this in every case:

E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

See the full log at the bottom.

Expected behavior Tests should pass.

Environment (please complete the following information):

  • Python Version: 3.10.4
  • OS: macOS 10.14.6
  • If the bug involves LOCALE or humansorted:
    • Is PyICU installed? It is not installed
    • Have tried LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 and other variations, as well as unsetting it. All lead to the same problem.

To Reproduce See the full log below.
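A smaller reproduction than the full suite may help isolate the failure. The sketch below reuses the input list from test_natsorted_can_sort_locale_specific_numbers_en in the log. The helper name `reproduce` is hypothetical, and setting `LC_ALL` here is only an assumed stand-in for the `with_locale_en_us` fixture, not natsort's own test code.

```python
import locale


def reproduce(locale_name="en_US.UTF-8"):
    """Sort the list from the failing 'en' test under the given locale.

    Returns None if natsort or the requested locale is unavailable.
    """
    try:
        from natsort import natsorted, ns
    except ImportError:
        return None

    previous = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        # Assumption: this approximates what the with_locale_en_us fixture does.
        locale.setlocale(locale.LC_ALL, locale_name)
    except locale.Error:
        return None
    try:
        given = ["c", "a5,467.86", "ä", "b", "a5367.86", "a5,6", "a5,50"]
        return natsorted(given, alg=ns.LOCALE | ns.F)
    finally:
        locale.setlocale(locale.LC_ALL, previous)  # restore the old locale


if __name__ == "__main__":
    print(reproduce())
```

Judging by the log, the affected system would place 'ä' after 'c', whereas the test expects it between 'a5,467.86' and 'b'.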

Full test output

LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 python3.10 -m pytest -p no:relaxed -vv
=========================================================== test session starts ===========================================================
platform darwin -- Python 3.10.4, pytest-7.4.4, pluggy-1.4.0 -- /sw/bin/python3.10
cachedir: .pytest_cache
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/sw/build.build/natsort-py310-8.4.0-1/natsort-8.4.0/.hypothesis/examples')
Using --randomly-seed=1000599648
rootdir: /sw/build.build/natsort-py310-8.4.0-1/natsort-8.4.0
plugins: benchmark-3.4.1, hypothesis-6.42.1, randomly-3.15.0, datadir-1.5.0, asyncio-0.21.1, flaky-3.8.1, mock-3.12.0, xdist-3.5.0, cov-4.1.0, requests-mock-1.12.1
asyncio: mode=strict
collected 333 items

tests/test_natsorted.py::test_natsorted_sorts_an_odd_collection_of_strings[ns.NUMAFTER-expected1] PASSED [ 0%]
tests/test_natsorted.py::test_natsorted_locale_bug_regression_test_140 SKIPPED (requires a functioning locale library to run) [ 0%]
tests/test_natsorted.py::test_natsorted_consistent_ordering_with_nan_and_friends[ns.DEFAULT-expected0] PASSED [ 0%]
tests/test_natsorted.py::test_natsorted_returns_list_in_reversed_order_with_reverse_option PASSED [ 1%]
tests/test_natsorted.py::test_natsort_sorts_consistently_with_presort PASSED [ 1%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.GROUPLETTERS-expected3] PASSED [ 1%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[520-expected6] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_can_sort_with_or_without_accounting_for_sign[ns.SIGNED-expected1] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.UNGROUPLETTERS-expected2] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_and_ignore_exponents[51] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.LOWERCASEFIRST-expected2] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.LOWERCASEFIRST-expected2] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.DEFAULT-expected0] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_handles_numbers_and_filesystem_paths_simultaneously PASSED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] FAILED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4104-expected5] FAILED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] FAILED [ 5%]
tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_en FAILED [ 5%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[640-expected3] PASSED [ 5%]
tests/test_natsorted.py::test_natsorted_consistent_ordering_with_nan_and_friends[ns.NANLAST-expected1] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_as_version_numbers PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_ints_which_is_default[ns.DEFAULT1] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.DEFAULT-expected0] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4608-expected3] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_ints_which_is_default[ns.DEFAULT0] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types[ns.DEFAULT-expected0] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_can_sorts_paths_same_as_strings PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_sorts_an_odd_collection_of_strings[ns.DEFAULT-expected0] PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_sorts_mixed_ascii_and_non_ascii_numbers PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types[ns.NUMAFTER-expected1] PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_locale_bug_regression_test_109 PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_raises_type_error_for_non_iterable_input PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.LOWERCASEFIRST-expected1] PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_de FAILED [ 10%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.IGNORECASE-expected1] PASSED [ 10%]
tests/test_natsorted.py::test_natsorted_handles_filesystem_paths PASSED [ 10%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.IGNORECASE-expected2] PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_recurses_into_nested_lists PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.DEFAULT-expected0] PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_can_sort_with_or_without_accounting_for_sign[ns.DEFAULT-expected0] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_path_extensions_heuristic PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.UNGROUPLETTERS-expected1] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_and_ignore_exponents[50] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4616-expected7] PASSED [ 13%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] FAILED [ 13%]
tests/test_natsorted.py::test_natsorted_applies_key_to_each_list_element_before_sorting_list PASSED [ 13%]
tests/test_natsorted.py::test_natsorted_with_mixed_bytes_and_str_input_raises_type_error PASSED [ 14%]
\
tests/test_natsorted_convenience.py::test_index_natsorted_can_presort PASSED [ 98%]
tests/test_natsorted_convenience.py::test_order_by_index_sorts_list_according_to_order_of_integer_list PASSED [ 98%]
tests/test_natsorted_convenience.py::test_index_realsorted_is_identical_to_index_natsorted_with_real_alg PASSED [ 99%]
tests/test_natsorted_convenience.py::test_index_natsorted_applies_key_function_before_sorting PASSED [ 99%]
tests/test_natsorted_convenience.py::test_index_natsorted_returns_integer_list_of_sort_order_for_input_list PASSED [ 99%]
tests/test_natsorted_convenience.py::test_as_ascii_converts_bytes_to_ascii PASSED [100%]

================================================================ FAILURES =================================================================
__________________________________ test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] ___________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['0', 1.5, '2', 3, 'ä', 'Ä', ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         + ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä']

tests/test_natsorted.py:318: AssertionError
_____________________________________ test_natsorted_handles_mixed_types_with_locale[4104-expected5] ______________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = 4104, expected = ['ä', 'Ä', 'b', 'Z', '0', 1.5, ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

tests/test_natsorted.py:318: AssertionError
____________________________________ test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] ____________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['0', 1.5, '2', 3, 'ä', 'Ä', ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         + ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä']

tests/test_natsorted.py:318: AssertionError
___________________________________________ test_natsorted_can_sort_locale_specific_numbers_en ____________________________________________

    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_can_sort_locale_specific_numbers_en() -> None:
        given = ["c", "a5,467.86", "ä", "b", "a5367.86", "a5,6", "a5,50"]
        expected = ["a5,6", "a5,50", "a5367.86", "a5,467.86", "ä", "b", "c"]
>       assert natsorted(given, alg=ns.LOCALE | ns.F) == expected
E       AssertionError: assert ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä'] == ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b', 'c']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b', 'c']
E         ?                                            -----
E         + ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä']
E         ?                                                    +++++

tests/test_natsorted.py:272: AssertionError
___________________________________________ test_natsorted_can_sort_locale_specific_numbers_de ____________________________________________

    @pytest.mark.usefixtures("with_locale_de_de")
    def test_natsorted_can_sort_locale_specific_numbers_de() -> None:
        given = ["c", "a5.467,86", "ä", "b", "a5367.86", "a5,6", "a5,50"]
        expected = ["a5,50", "a5,6", "a5367.86", "a5.467,86", "ä", "b", "c"]
>       assert natsorted(given, alg=ns.LOCALE | ns.F) == expected
E       AssertionError: assert ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä'] == ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b', 'c']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b', 'c']
E         ?                                            -----
E         + ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä']
E         ?                                                    +++++

tests/test_natsorted.py:279: AssertionError
__________________________________ test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] __________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['ä', 'Ä', 'b', 'Z', '0', 1.5, ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

tests/test_natsorted.py:318: AssertionError
========================================================= short test summary info =========================================================
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] - AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4104-expected5] - AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] - AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
FAILED tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_en - AssertionError: assert ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä'] == ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b...
FAILED tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_de - AssertionError: assert ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä'] == ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b...
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] - AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
================================================ 6 failed, 326 passed, 1 skipped in 13.20s ================================================

nieder avatar Sep 07 '24 14:09 nieder

It seems that installing pyicu fixed the problem:

====================================================================== 333 passed in 12.98s ======================================================================

I will try a few other configurations, but this is most likely fixed. Perhaps pyicu should be a mandatory dependency (not just an extra) on macOS?
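Because the outcome here depends on whether PyICU is present, it can help to confirm what a given interpreter actually sees before comparing runs. A minimal check follows; PyICU is imported as `icu`, and the helper name `pyicu_available` is made up for illustration.

```python
import importlib.util


def pyicu_available():
    """Return True if the `icu` module (provided by PyICU) can be found."""
    # find_spec only locates the module; it does not import it.
    return importlib.util.find_spec("icu") is not None


if __name__ == "__main__":
    print("PyICU importable:", pyicu_available())
```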

nieder avatar Sep 07 '24 22:09 nieder

I cannot reproduce the problem. I am running on macOS 14.6.1 without pyicu installed.

You say you are on macOS 10.14.6 - that sounds very old, are you sure that is the correct version?

SethMMorton avatar Sep 08 '24 04:09 SethMMorton

You say you are on macOS 10.14.6 - that sounds very old, are you sure that is the correct version?

Ah, silly me, it's just that sometimes the leading 10 is omitted, so we are likely on the same version.

SethMMorton avatar Sep 08 '24 04:09 SethMMorton

Nope. Definitely on 10.14.6. [image]

I have access to a macOS 13 system and will try it there later +/- pyicu.

nieder avatar Sep 08 '24 09:09 nieder

Tested on a 13.6.3 machine. All tests pass there without pyicu installed. So the native locale library on older macOS seems to behave differently from the one on more recent releases? I saw several other closed issues referencing FreeBSD, and since some of macOS's underlying components are BSD-derived, perhaps that is a common thread that has since diverged or improved?
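One way to compare the native collation on the two macOS versions without involving natsort at all is to sort a few characters from the failing tests with the standard library's `locale.strxfrm`. This is only a sketch: the helper name `collation_probe` is hypothetical, and it rests on the assumption that natsort's non-PyICU path ultimately depends on the same C library collation.

```python
import locale


def collation_probe(words, locale_name="en_US.UTF-8"):
    """Sort words using the C library's collation for locale_name.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key that compares in locale order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore


if __name__ == "__main__":
    print(collation_probe(["b", "Z", "ä", "Ä"]))
```

If the 10.14.6 machine prints 'ä' and 'Ä' after 'b' and 'Z' while the 13.6.3 machine interleaves them, that would point at the system locale data rather than at natsort.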

nieder avatar Sep 08 '24 13:09 nieder

I did some digging. The special handling that attempts to fix sorting on macOS was implemented by 2015. It looks like Mojave (10.14) was released in 2018. natsort unit tests worked fine all through that time, so I'm not confident that the OS version is responsible.

SethMMorton avatar Sep 09 '24 00:09 SethMMorton

I'm going to close this as "won't fix" because I do not plan to support very old operating systems.

SethMMorton avatar Aug 01 '25 15:08 SethMMorton