mojo
mojo copied to clipboard
[stdlib] Implement `__contains__` for `Tuple`, `List`, `ListLiteral` (almost)
Now that we have ComparableCollectionElement
, we can try to implement __contains__
for some common collection types using a workaround similar to what was employed in https://github.com/modularml/mojo/pull/2190. It's possible that the compiler would be able to correctly pick up a __contains__
overload with @staticmethod
, e.g. something like this might just work:
struct Bucket[T: CollectionElement]:
@staticmethod
fn __contains__[S: ComparableCollectionElement](self: Bucket[S], elem: s) -> Bool: ...
var b: Bool = MyStuff() in Bucket[MyStuff]()
Even if in
doesn't work with this, the implementation shouldn't have to change much once we roll out conditional conformance.
Tried something like this:
struct Bucket[T: CollectionElement]:
var elements: List[T]
fn __init__(inout self, elements: List[T]):
self.elements = elements
@staticmethod
fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
for i in range(self.elements.size):
if elem.__eq__(self.elements[i]):
return True
return False
However, i'm getting:
error: special method may not be a static method
fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
^
mojo: error: failed to parse the provided Mojo source module
If I remove @staticmethod
:
error: 'self' argument must have type 'Bucket[T]', but actually has type 'Bucket[C]'
fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
^ ~~~~~~~~~
mojo: error: failed to parse the provided Mojo source module
Is there something similar to c++ enable_if
that i'm missing? https://en.cppreference.com/w/cpp/types/enable_if
@laszlokindrat , @ChristopherLR , if self can be a reference? this parametrization could do the job:
struct Bucket[T: CollectionElement]:
var elements: List[T]
fn __init__(inout self):
self.elements = List[T]()
fn __contains__[C: ComparableCollectionElement](
self: Reference[Bucket[C]], elem: C
) -> Bool:
for i in range(self[].elements.size):
if elem.__eq__(self[].elements[i]):
return True
return False
fn main():
var x = Bucket[Int]()
x.elements.append(1)
if x.__contains__(1):
print("ok")
#if 1 in x: #not yet
Or if we don't use a special method:
struct Bucket[T: CollectionElement]:
var elements: List[T]
fn __init__(inout self, elements: List[T]):
self.elements = elements
@staticmethod
fn contains[C: ComparableCollectionElement](self: Bucket[C], value: C) -> Bool:
for i in range(self.elements.size):
if value.__eq__(self.elements[i]):
return True
return False
fn main():
var b = Bucket[Int](List(1, 2, 3))
if __type_of(b).contains(b, 2):
print("2 is in the bucket")
else:
print("2 is not in the bucket")
@laszlokindrat , @ChristopherLR , if self can be a reference? this parametrization could do the job:
struct Bucket[T: CollectionElement]: var elements: List[T] fn __init__(inout self): self.elements = List[T]() fn __contains__[C: ComparableCollectionElement]( self: Reference[Bucket[C]], elem: C ) -> Bool: for i in range(self[].elements.size): if elem.__eq__(self[].elements[i]): return True return False fn main(): var x = Bucket[Int]() x.elements.append(1) if x.__contains__(1): print("ok") #if 1 in x: #not yet
@rd4com Very cool, and I am actually surprised that parameter inference fails for the 1 in x
sugar. I will look into this.
@rd4com I used your trick on List.index
and it worked like a charm: https://github.com/modularml/mojo/commit/ca10f356ec6f7efff110de36632a8630d6ac6935. This basically means that we have conditional conformance in these methods, as long as you are willing to deal with the explicit dereferencing (i.e. self[]
) in the method body, which I think is totally okay. This is pretty amazing.
Would you be interested in going ahead and implementing __contains__
for the structs above (and maybe others)? Please ping me on the PRs, if so.
Regarding the 1 in x
not desugaring correctly, this seems like a parser bug. Not sure when the fix will be ready, but explicit calls to __contains__
are acceptable for the time being.
@laszlokindrat Nice, List.__contains__
is there: https://github.com/modularml/mojo/pull/2667 ,
1 in x
works on that one (rebind
, _type_is_eq
and constained
) :partying_face:
I'll try to implement Tuple.__contains__
, just don't know which way is prefered ?
Of course, you're welcome to make theses PR's @ChristopherLR aswel (and any other person too)!
Hey @rd4com and @laszlokindrat, not quite Tuple
or ListLiteral
, I've picked InlineList
just as a starting point for me to get used to the codebase :) If available would you mind checking out https://github.com/modularml/mojo/pull/2703
Should we even provide __contains__
for heterogeneous collections like Tuple
?
@rd4com I discovered that this test fails for List
when testing InlineList
:
def test_list_contains_str():
var x = List[String]("a", "b", "c")
assert_false("d" in x)
assert_true(x.__contains__("a"))
assert_false(x.__contains__("e"))
constraint failed: value type is not self.T
However this passes:
def test_list_contains_str():
var x = List[String]("a", "b", "c")
assert_true(x.__contains__("a"))
assert_false(x.__contains__("e"))
Note that i've delete assert_false("d" in x)
@rd4com I discovered that this test fails for
List
when testingInlineList
:def test_list_contains_str(): var x = List[String]("a", "b", "c") assert_false("d" in x) assert_true(x.__contains__("a")) assert_false(x.__contains__("e"))
constraint failed: value type is not self.T
However this passes:
def test_list_contains_str(): var x = List[String]("a", "b", "c") assert_true(x.__contains__("a")) assert_false(x.__contains__("e"))
Note that i've delete
assert_false("d" in x)
Yeah, this is the parser bug I mentioned before. Don't worry about it, direct call to __contains__
is fine for now. Thanks for the patch, I will take a look!
Should we even provide
__contains__
for heterogeneous collections likeTuple
?
It would be nice if we could do if "foo" in ("foo", "bar", 1): ...
.
@laszlokindrat I agree that it's rather convenient. But shouldn't we write ["foo", "bar", ...]
instead?
From the typing point of view, I find it difficult to justify having a __contains__
for a struct (Tuple
is a case of this) which iterate over all the compatible fields to test for equity. That feels highly ad-hoc, and feels like a meta programming feature.
@laszlokindrat I agree that it's rather convenient. But shouldn't we write
["foo", "bar", ...]
instead? From the typing point of view, I find it difficult to justify having a__contains__
for a struct (Tuple
is a case of this) which iterate over all the compatible fields to test for equity. That feels highly ad-hoc, and feels like a meta programming feature.
Can you give an example on how this could be misused/abused? I feel pretty strongly that, from a "python superset" point of view, "foo" in ("foo", "bar", 1)
has to work, at least for the simple cases.
The concern here isn't misuse, but rather whether we need to mimic Python in this particular case. Python's tuple
serves multiple functions, effectively encapsulating three distinct types:
-
Tuple
: heterogeneous and of fixed length. -
InlineArray
: homogeneous and of fixed length. -
List
: homogeneous and of variable length.
In a typed language like Mojo, it's logical for the latter two types to implement this interface. Moreover, a Python tuple
can be mapped to either InlineArray[PyObject]
or List[PyObject]
. Given that the default implementation for __contains__
is any(val == x for x in self)
, it seems unnecessary to introduce this ad-hoc method for non-sequence types. Similarly, we might want to reconsider the utility of using for v in (a, b, c):
in Mojo.
Syntactically, for v in (a, b, c):
et al have to be valid. Considering how basic this feature is, and how quickly someone coming from python would try to do this (anecdotally, this was actually the first line of mojo I ever tried to write), we just cannot not have it.
Moreover, a Python tuple can be mapped to either
InlineArray[PyObject]
orList[PyObject]
.
Now this is a good point; maybe the problem is that (1, 2, 3)
maps to Tuple
while it should map to InlineArray
? At the same time, Python's Tuple type is also heterogeneous, so our hands are a bit tied (admittedly, though, type checkers probably give up when they see someone iterate through a tuple). So my philosophy is this: if we can't see obvious ways to misuse/abuse, we should push for syntactic compatibility. If we discover that methods like this do pose unforeseen problems, we can 1) document sharp edges, 2) improve/fix them, 3) deprecate and then remove them.
Syntactically, for v in (a, b, c): et al have to be valid.
Totally, if we can enforce the same type requirement at compile time.
Now this is a good point; maybe the problem is that (1, 2, 3) maps to
Tuple
while it should map toInlineArray
?
Again, agreed.
Question: should we allow True in (1, 2, "a")
? Or equivalently, do we require T in Ts
for where val: T
and t: Tuple[*Ts]
. It's tricky because it makes sense to not enforce it for heterogeneous tuple literal, but not so much for homogeneous ones.
Question: should we allow
True in (1, 2, "a")
?
I don't see why not. A perfect implementation of __contains__
might not be possible today (we might need parametric traits or something to correctly model comparisons between arbitrary types), but I don't think that should stop us from having something.