Add String.equalsIgnoreCase

Open loic-sharma opened this issue 1 year ago • 2 comments

Today, case-insensitive string comparisons are unnecessarily expensive:

a.toLowerCase().compareTo(b.toLowerCase()).

Adding String.equalsIgnoreCase might be more feasible now that Dart depends on ICU.

See previous requests https://github.com/dart-lang/sdk/issues/4805 and https://github.com/dart-lang/sdk/issues/13324.

loic-sharma avatar Dec 20 '23 05:12 loic-sharma

Dart, the language, doesn't depend on ICU. The native implementation might use it, but any library feature also needs to work on the web and in Wasm.

If those also provide access to case-agnostic equality, then adding it is an option. It's probably going to be equalsIgnoreCase, rather than compare..., since that only requires case canonicalization, not defining an order on the canonicalized characters.

(But if all platforms have such an ordering, we can use it too.)

lrhn avatar Dec 20 '23 07:12 lrhn

As we already have core functions toLowerCase and toUpperCase, why shouldn't we have core functions equalsIgnoreCase and compareIgnoreCase that behave exactly as if their arguments had been converted to upper or lower case, but are simply more efficient? The quiver package implementations of equalsIgnoreCase and compareIgnoreCase simply convert both arguments to lower case, so they are not efficient. My point is that the complexities of Unicode and internationalisation wouldn't be fully solved by such a function, but nor are they by using the core functions toLowerCase and toUpperCase.

martin-east avatar Aug 23 '24 16:08 martin-east

It doesn't look like JavaScript has an optimized equalsIgnoreCase.

It could always be implemented in JS, though. If the desired behavior is

  • equivalent to a.toLowerCase().compareTo(b.toLowerCase()),
  • just more efficient,

then Dart could have a String.equalsIgnoreCase that the VM implements in an optimized way and that for web and wasm just does a.toLowerCase().compareTo(b.toLowerCase()). Then that would never be worse than what people are doing now (writing a.toLowerCase().compareTo(b.toLowerCase()) themselves), but on the VM it would get optimized.
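
A minimal sketch of that idea, assuming a hypothetical extension method named `equalsIgnoreCaseSketch` (not part of dart:core, and not how the VM would necessarily do it). It compares ASCII code units directly and falls back to lower-case-then-compare as soon as it sees a non-ASCII code unit, so the result is intended to match `a.toLowerCase() == b.toLowerCase()`:

```dart
/// Hypothetical helper for illustration only; not a dart:core API.
extension EqualsIgnoreCaseSketch on String {
  bool equalsIgnoreCaseSketch(String other) {
    if (identical(this, other)) return true;
    if (length == other.length) {
      var i = 0;
      for (; i < length; i++) {
        var a = codeUnitAt(i);
        var b = other.codeUnitAt(i);
        // Any non-ASCII code unit: stop and use the general fallback below.
        if (a >= 0x80 || b >= 0x80) break;
        // Lower-case ASCII 'A'..'Z' by adding 0x20.
        if (a >= 0x41 && a <= 0x5A) a += 0x20;
        if (b >= 0x41 && b <= 0x5A) b += 0x20;
        // Everything before index i was ASCII, so a mismatch here is final.
        if (a != b) return false;
      }
      if (i == length) return true; // Both strings were entirely ASCII.
    }
    // General case: same behavior as what people write by hand today.
    return toLowerCase() == other.toLowerCase();
  }
}

void main() {
  print('Hello'.equalsIgnoreCaseSketch('hELLO')); // true
  print('Hello'.equalsIgnoreCaseSketch('Help!')); // false
  print('ı'.equalsIgnoreCaseSketch('i')); // false (fallback path)
}
```

The fast path avoids allocating two lower-cased copies for ASCII-only input; everything else costs the same as the hand-written version.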


I'm not sure that's exactly the desired semantics, though. Here's what Java does for String#equalsIgnoreCase, apparently:

Two strings are considered equal ignoring case if they are of the same length and corresponding Unicode code points in the two strings are equal ignoring case. Two Unicode code points are considered the same ignoring case if at least one of the following is true:

  • The two Unicode code points are the same (as compared by the == operator)
  • Calling Character.toLowerCase(Character.toUpperCase(int)) on each Unicode code point produces the same result

So for example dotless "ı" U+0131 and the familiar dotted "i" are equal under those Java semantics:

  print('ı'.toUpperCase().toLowerCase() == 'i'); // true

even though they're unequal under the lower-case-then-compare version that people are accustomed to writing by hand:

  print('ı'.toLowerCase() == 'i'.toLowerCase()); // false

… Well, at least in that example those semantics don't seem like an improvement over a.toLowerCase().compareTo(b.toLowerCase()). But if JS ever does grow a String.prototype.equalsIgnoreCase, we'd certainly want Dart to use that, and it'd be awkward if Dart had already committed to different semantics. I'm not sure how to predict what semantics TC39 would go for if they eventually add such a method.
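
For comparison, the Java rule quoted above can be approximated in Dart roughly as follows. This is only a sketch: `javaStyleEqualsIgnoreCase` is a hypothetical name, it works on UTF-16 code units rather than code points, and Dart's `String.toUpperCase` is not identical to Java's `Character.toUpperCase` (for instance, Dart can expand one character into several), so edge cases may differ from real Java behavior.

```dart
/// Hypothetical illustration of the quoted Java rule; not a real API.
bool javaStyleEqualsIgnoreCase(String a, String b) {
  // Java requires the two strings to have the same length.
  if (a.length != b.length) return false;
  for (var i = 0; i < a.length; i++) {
    final x = a[i]; // one UTF-16 code unit, as a String
    final y = b[i];
    if (x == y) continue;
    // Upper-case first, then lower-case, and compare the results.
    if (x.toUpperCase().toLowerCase() != y.toUpperCase().toLowerCase()) {
      return false;
    }
  }
  return true;
}

void main() {
  print(javaStyleEqualsIgnoreCase('ı', 'i')); // true
  print(javaStyleEqualsIgnoreCase('a', 'b')); // false
}
```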

gnprice avatar Dec 23 '24 22:12 gnprice

Basing equalsIgnoreCase on .toUpperCase() or .toLowerCase() would be a bad Unicode choice since Unicode already defines a case folding for precisely that purpose. (They define more than one, and it can still be locale dependent too.)

For ordering, the Unicode term is "collation". There is a general Default Unicode Collation Element Table, but in most cases you probably want a locale dependent ordering.

lrhn avatar Dec 25 '24 09:12 lrhn