pygraphistry
[FEA] Edge reductions
Building off node reductions (https://github.com/graphistry/pygraphistry/issues/193), edge reductions are also useful.
The most typical case is multiedge handling:
Ex: collapsing directed edges today:
g2 = g.edges(g._edges.drop_duplicates(subset=[g._src, g._dst]))
g2 = g.edges(g._edges.groupby([g._src, g._dst]).agg({'col1': 'sum'}).reset_index())
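To make the two one-liners above concrete, here is a minimal sketch on a toy pandas DataFrame, without Graphistry itself. The column names `src`, `dst`, `col1` and the data are made up for illustration; with a real plotter, `g._src` and `g._dst` would supply the column names.

```python
import pandas as pd

# Toy multigraph: two parallel a->b edges plus one b->c edge.
edges = pd.DataFrame({
    'src': ['a', 'a', 'b'],
    'dst': ['b', 'b', 'c'],
    'col1': [1, 2, 5],
})

# Option 1: keep only the first edge of each (src, dst) pair.
# Attribute values of the dropped parallel edges are lost.
deduped = edges.drop_duplicates(subset=['src', 'dst'])

# Option 2: merge parallel edges and aggregate an attribute across them.
# Only the grouping keys and the aggregated columns survive.
summed = edges.groupby(['src', 'dst']).agg({'col1': 'sum'}).reset_index()
```

Note that the first option silently keeps whichever row comes first, while the second drops every column not named in `agg`, which is part of why the current approach is easy to get wrong.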
However, that:
- Is non-obvious
- Does not directly cover common cases:
- Unified handling of symmetric edges
- Partial bundling: groupby([src, dst, type]), only bundling where type=x, ...
- Invert: turn a bundle into a node
- Even more work to include aggregates
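As a rough illustration of how much manual work two of the cases above take today, the following sketch does symmetric-edge handling and partial bundling by hand in pandas. The column names (`src`, `dst`, `type`, `col1`), the toy data, and the choice of `'x'` as the bundled type are assumptions for the example, not anything defined by pygraphistry.

```python
import pandas as pd

edges = pd.DataFrame({
    'src':  ['a', 'a', 'b', 'a', 'a'],
    'dst':  ['b', 'b', 'a', 'b', 'b'],
    'type': ['x', 'x', 'x', 'y', 'y'],
    'col1': [1, 2, 4, 8, 16],
})

# Symmetric edges: order each endpoint pair canonically so that
# a->b and b->a fall into the same group, then aggregate.
swap = edges['src'] > edges['dst']
undirected = edges.assign(
    src=edges['src'].where(~swap, edges['dst']),
    dst=edges['dst'].where(~swap, edges['src']),
)
symmetric = undirected.groupby(['src', 'dst'], as_index=False).agg({'col1': 'sum'})

# Partial bundling: bundle only the edges of type 'x' (still directed),
# and pass all other edges through unchanged.
mask = edges['type'] == 'x'
bundled = edges[mask].groupby(['src', 'dst', 'type'], as_index=False).agg({'col1': 'sum'})
partial = pd.concat([bundled, edges[~mask]], ignore_index=True)
```

Each case needs its own masking, key normalization, and re-concatenation, which is the boilerplate a dedicated edge-reduction method could absorb.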
Potential designs
Direct:
g.reduce_multiedges(lazy=True, where=index, splits=index, symmetric=True, agg={
'e_col_1': 'sum',
'e_col_2': lambda edges: edges.drop_duplicates(['src','dst']) # or triples?
})
Lazy: Defer reductions so that nodes/edges can still be rebound; results are not concretized until .force() / .plot(), at which point the compute graph is wiped out.
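One possible reading of the lazy design is sketched below. `LazyEdges` is a hypothetical stand-in for a plotter, not a pygraphistry class, and its `reduce_multiedges` takes only `agg`; the `where`, `splits`, and `symmetric` options from the proposal are left out to keep the sketch short. The point is only to show reductions being recorded, surviving a rebind, and being applied and cleared on `.force()`.

```python
import pandas as pd

class LazyEdges:
    """Hypothetical illustration of deferred edge reductions."""

    def __init__(self, edges, src, dst, pending=()):
        self._edges, self._src, self._dst = edges, src, dst
        self._pending = list(pending)  # deferred reduction steps

    def edges(self, edges):
        # Rebinding the edge table keeps the pending reductions.
        return LazyEdges(edges, self._src, self._dst, self._pending)

    def reduce_multiedges(self, agg):
        # Record the reduction; nothing is computed yet.
        src, dst = self._src, self._dst
        step = lambda df: df.groupby([src, dst], as_index=False).agg(agg)
        return LazyEdges(self._edges, src, dst, self._pending + [step])

    def force(self):
        # Apply every pending step, then return an object with no pending work.
        df = self._edges
        for step in self._pending:
            df = step(df)
        return LazyEdges(df, self._src, self._dst)

df1 = pd.DataFrame({'src': ['a', 'a'], 'dst': ['b', 'b'], 'w': [1, 2]})
df2 = pd.DataFrame({'src': ['a', 'a', 'b'], 'dst': ['b', 'b', 'c'], 'w': [10, 20, 5]})

g = LazyEdges(df1, 'src', 'dst').reduce_multiedges({'w': 'sum'})
g2 = g.edges(df2)   # rebind after the reduction was declared
out = g2.force()    # the reduction runs against df2, not df1
```

Returning a new object from each call mirrors the chaining style of `g.edges(...)` in the examples above; whether `.plot()` should implicitly call `.force()` is left open here.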