mojo
mojo copied to clipboard
[Feature Request] Add larger than and larger than or equal operator support to String
Review Mojo's priorities
- [X] I have read the roadmap and priorities and I believe this request falls within the priorities.
What is your request?
Add larger than (lt) and larger than or equal (le) operator support to String.
What is your motivation for this change?
Currently, it is not possible to compare Strings easily. It is very useful when sorting Strings.
Any other details?
No response
maybe I can work on this one if upstream think this work is worth?
@zhoujingya, I have already made a snippet (needed it for a project to compare strings). Do you want me to share it with you, or would you prefer to do it yourself? My snippet is very simple. You might want to see if it is possible to use SIMDs when comparing.
maybe I can wok on this one if upstream think this work is worth?
I think @JoeLoser already marked it as a good first issue which means you are free to take it.
@VMois wow, very appreciate for your snippets.thanks
Here is what I used as a quick fix. I would double-check that :) And definitely look at SIMDs.
@always_inline
fn __lt__(self, other: String) -> Bool:
var p1 = self._buffer
var p2 = other._buffer
var min_len = len(self)
if len(other) < min_len:
min_len = len(other)
for i in range(min_len):
var ai = p1[i].cast[DType.uint8]()
var bi = p2[i].cast[DType.uint8]()
if ai > bi:
return False
if ai < bi:
return True
return len(self) <= len(other)
@always_inline
fn __le__(self, other: String) -> Bool:
return self < other or self == other
@VMois ,so maybe SIMDs is another issue?
@VMois ,so maybe SIMDs is another issue?
Up to you. You can do a simple version or look at SIMD. I am just a random guy :)
Why not just use memcmp
? String's __eq__
already does and I believe that's what Python does too.
I was thinking something like:
struct String(Sized, Stringable, IntableRaising, KeyElement, Boolable):
"""Represents a mutable string."""
alias _buffer_type = List[Int8]
var _buffer: Self._buffer_type
"""The underlying storage for the string."""
...
@always_inline
fn __lt__(self, rhs: String) -> Bool:
"""Compare this String to the RHS using LT comparison.
Args:
rhs: The other String to compare against.
Returns:
True if this String is strictly less than the RHS String and False otherwise.
"""
var len1 = len(self)
var len2 = len(rhs)
var cmp = memcmp(self._as_ptr(), rhs._as_ptr(), min(len1, len2))
return cmp < 0 or cmp == 0 and len1 < len2
@always_inline
fn __le__(self, rhs: String) -> Bool:
"""Compare this String to the RHS using LE comparison.
Args:
rhs: The other String to compare against.
Returns:
True if this String is less than or equal to the RHS String and False otherwise.
"""
return not (rhs < self)
@always_inline
fn __gt__(self, rhs: String) -> Bool:
"""Compare this String to the RHS using GT comparison.
Args:
rhs: The other String to compare against.
Returns:
True if this String is strictly greater than the RHS String and False otherwise.
"""
return rhs < self
@always_inline
fn __ge__(self, rhs: String) -> Bool:
"""Compare this String to the RHS using GE comparison.
Args:
rhs: The other String to compare against.
Returns:
True if this String is greater than or equal to the RHS String and False otherwise.
"""
return not (self < rhs)
...
I haven't testing this myself, it's just an idea :) Also, this might need to change when Mojo properly handles unicode. If we want to add this to String then something similar can be done with StringLiteral too