ENH: .isin() method should use __contains__ rather than __iter__ for user-defined classes to determine presence.
Feature Type
- [X] Adding new functionality to pandas
- [X] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
Currently, if you define a user class:
class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]

    def __contains__(self, item):
        return item in self.collection

    def __iter__(self):
        yield from self.another_collection
and then initialize a pandas DataFrame like this:
import pandas as pd

example_dataframe = pd.DataFrame(
    {
        'column_name': [3, 1, 4, 6, 13],
        'another_column_name': ['tolly', 'trolly', 'telly', 'belly', 'nelly']
    }
)
and then call the .isin() method like this:
class_instance = MyClass()
example_dataframe['column_name'].isin(class_instance)
you actually get this output:
False
False
True
True
False
This is the result you get when the values are checked against self.another_collection (yielded by __iter__) rather than self.collection (consulted by __contains__). I realize this might stem from compatibility with other libraries, but it seems counter-intuitive.
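The difference can be reproduced without pandas at all. The following is a minimal sketch that assumes .isin() consumes its argument as an iterable (which is what the output above suggests; I have not traced this through the pandas source). It compares the `in` operator, which dispatches to __contains__, with materializing the object through list(), which dispatches to __iter__:

```python
class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]

    def __contains__(self, item):
        return item in self.collection

    def __iter__(self):
        yield from self.another_collection


class_instance = MyClass()
values = [3, 1, 4, 6, 13]

# `in` dispatches to __contains__, so it consults self.collection.
via_contains = [value in class_instance for value in values]

# Materializing the object dispatches to __iter__, so the resulting
# list holds the items of self.another_collection.
materialized = list(class_instance)
via_iter = [value in materialized for value in values]

print(via_contains)  # [True, True, False, False, False]
print(via_iter)      # [False, False, True, True, False]
```

The second list matches the .isin() output shown above, while the first is what I would have expected.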
Feature Description
I suggest either changing the behavior (which I believe might break some people's code) or adding a flag (which I guess would add complexity).
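To illustrate the flag option, here is a rough sketch written as a standalone helper. The name `isin_with_flag` and the parameter `use_contains` are hypothetical and not part of pandas; the helper only relies on the existing Series.isin and Series.map methods:

```python
import pandas as pd


class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]

    def __contains__(self, item):
        return item in self.collection

    def __iter__(self):
        yield from self.another_collection


def isin_with_flag(series, values, use_contains=False):
    """Hypothetical helper (not a pandas API) sketching the proposed flag."""
    if use_contains:
        # Element-wise `in` test, which honors values.__contains__.
        return series.map(lambda item: item in values).astype(bool)
    # Default: keep the current .isin() behavior unchanged.
    return series.isin(values)


series = pd.Series([3, 1, 4, 6, 13])
container = MyClass()

current = isin_with_flag(series, container).tolist()
proposed = isin_with_flag(series, container, use_contains=True).tolist()

print(current)   # [False, False, True, True, False]
print(proposed)  # [True, True, False, False, False]
```

Keeping the default at the current behavior would avoid breaking existing code. The element-wise path calls Python code once per row, so it would likely be slower than the vectorized .isin() on large Series.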
Alternative Solutions
See above.
Additional Context
No response