pandas-sets
pandas-sets copied to clipboard
Set-oriented Operations in Pandas
Pandas Sets: Set-oriented Operations in Pandas
If you store standard Python sets or frozensets in your Series or DataFrame objects, you'll find this useful.
The pandas_sets package adds a .set accessor to any pandas Series object;
it's like .dt for datetime or .str for string, but for set.
It exposes all public methods available in the standard set.
Installation
pip install pandas-sets
Just import the pandas_sets package and it will register a .set accessor to any Series object.
import pandas_sets
Examples
import pandas_sets
import pandas as pd
df = pd.DataFrame({'post': [1, 2, 3, 4],
'tags': [{'python', 'pandas'}, {'philosophy', 'strategy'}, {'scikit-learn'}, {'pandas'}]
})
pandas_posts = df[df.tags.set.contains('pandas')]
pandas_posts.tags.set.add('data')
pandas_posts.tags.set.update({'data', 'analysis'})
pandas_posts.tags.set.len()
Notes
- The implementation is primitive for now. It's based heavily on the pandas' core
StringMethodsimplementation. - The public API has been tested for most expected scenarios.
- The API will need to be extended to handle
NAvalues appropriately.