[FEA] Edge reductions

Open lmeyerov opened this issue 4 years ago • 0 comments

Building off node reductions ( https://github.com/graphistry/pygraphistry/issues/193 ), edge reductions are also useful.

The most typical case is multiedge handling:

Ex: collapsing directed edges today:

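# Keep one edge per (src, dst) pair, discarding the other rows
g2 = g.edges(g._edges.drop_duplicates(subset=[g._src, g._dst]))
# Or merge parallel edges and aggregate an attribute
g2 = g.edges(g._edges.groupby([g._src, g._dst]).agg({'col1': 'sum'}).reset_index())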
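
To make the difference between the two concrete, here is a minimal pandas-only sketch on a made-up edge table. The column names (src, dst, col1) and the data are assumptions for illustration, and no graphistry calls are involved:

```python
import pandas as pd

# Hypothetical multigraph: two parallel a->b edges and one b->a edge
edges = pd.DataFrame({
    'src': ['a', 'a', 'b'],
    'dst': ['b', 'b', 'a'],
    'col1': [1, 2, 5],
})

# Keeps the first edge of each (src, dst) pair; the other rows' attributes are lost
deduped = edges.drop_duplicates(subset=['src', 'dst'])

# Merges parallel edges and sums col1 across each (src, dst) group
summed = edges.groupby(['src', 'dst']).agg({'col1': 'sum'}).reset_index()
```

Note that both treat a->b and b->a as distinct edges, and the first silently discards attribute values from the dropped rows.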

However, that approach:

  • Is non-obvious
  • Does not cover common cases:
    • Unified handling of symmetric edges
    • Partial bundling: groupby([src,dst,type]), only bundling where type=x, ...
    • Invert: turn a bundle into a node
  • Takes even more work to include aggregates
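
For a sense of what the symmetric and partial-bundling cases cost by hand today, here is a rough pandas sketch. The helper names reduce_symmetric and reduce_where are hypothetical (they are not part of pygraphistry), and the column names and data are made up:

```python
import pandas as pd

edges = pd.DataFrame({
    'src':  ['a', 'a', 'b', 'a', 'a'],
    'dst':  ['b', 'b', 'a', 'b', 'b'],
    'type': ['x', 'x', 'x', 'y', 'y'],
    'col1': [1, 2, 4, 8, 16],
})

def reduce_symmetric(df, agg, src='src', dst='dst'):
    """Hypothetical helper: treat a->b and b->a as one edge, then aggregate."""
    out = df.copy()
    swap = df[src] > df[dst]  # rows whose endpoints are out of order
    out[src] = df[src].where(~swap, df[dst])
    out[dst] = df[dst].where(~swap, df[src])
    return out.groupby([src, dst]).agg(agg).reset_index()

def reduce_where(df, mask, keys, agg):
    """Hypothetical helper: bundle only rows selected by mask, keep the rest as-is."""
    bundled = df[mask].groupby(keys).agg(agg).reset_index()
    return pd.concat([bundled, df[~mask]], ignore_index=True)

# All five edges connect a and b, so they collapse into a single row
sym = reduce_symmetric(edges, agg={'col1': 'sum'})

# Bundle only type == 'x' edges per (src, dst, type); 'y' edges pass through
partial = reduce_where(edges, edges['type'] == 'x',
                       keys=['src', 'dst', 'type'], agg={'col1': 'sum'})
```

The "invert" case (turning a bundle into a node) is not covered by this sketch.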

Potential designs

Direct:

g.reduce_multiedges(lazy=True, where=index, splits=index, symmetric=True, agg={
   'e_col_1': 'sum',
   'e_col_2': lambda edges: edges.drop_duplicates(['src','dst']) # or triples?
})

Lazy: Defer reductions so nodes/edges can still be rebound; results are not materialized until .force() / .plot(), at which point the pending compute graph is cleared.
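
One possible reading of the lazy mode, as a toy pure-Python sketch. The LazyEdges class, its method names, and the tuple-based edge format are all assumptions for illustration, not an existing or proposed pygraphistry API:

```python
from collections import defaultdict

class LazyEdges:
    """Hypothetical holder of edges plus a queue of deferred reductions."""

    def __init__(self, edges, pending=()):
        self._edges = list(edges)
        self._pending = tuple(pending)

    def reduce(self, fn):
        # Record the reduction without running it
        return LazyEdges(self._edges, self._pending + (fn,))

    def edges(self, new_edges):
        # Rebind the data; already-recorded reductions are kept
        return LazyEdges(new_edges, self._pending)

    def force(self):
        # Run the deferred reductions in order, then drop them
        out = self._edges
        for fn in self._pending:
            out = fn(out)
        return LazyEdges(out)

def sum_multiedges(edges):
    """Sum weights of parallel (src, dst, weight) edges."""
    totals = defaultdict(int)
    for s, d, w in edges:
        totals[(s, d)] += w
    return [(s, d, w) for (s, d), w in totals.items()]

g = LazyEdges([('a', 'b', 1), ('a', 'b', 2)]).reduce(sum_multiedges)
# Rebinding after the reduction was declared still works: nothing has run yet
g2 = g.edges([('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 5)])
result = g2.force()
```

Here the reduction is applied to the rebound edges only when force() is called, and the returned object carries no pending work.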

lmeyerov • Jan 08 '21 20:01