pygraphistry
[BUG] - igraph interface for pagerank changed, removing personalized option
Describe the bug
igraph pagerank code (using the personalized param) that worked in Graphistry 2.40.46 no longer works in v2.40.55. It appears that we need to pin the version of igraph that the code is compatible with.
To Reproduce
The following pagerank interface is no longer supported in igraph v0.10.4; there is a new API for igraph personalized_pagerank:
e.g.
`g2 = g2.compute_igraph('pagerank',params={'personalization': personalization})`
def personal_pagerank_on_goal(g, goal_col='goal'):
    g2 = g.edges(g._edges).nodes(g._nodes)
    personalization = pd.DataFrame({'vertex': g2._nodes[g2._nodes[goal_col]][g._node]}).assign(values=1.0)
    g2 = g2.compute_igraph('pagerank', params={'personalization': personalization})
    low_nodes = g2._nodes[g2._nodes.pagerank < g2._nodes.pagerank.median()]
    g3 = g2.drop_nodes(low_nodes[g._node])
    g3 = g3.compute_igraph('louvain', directed=False, out_col='journey_community')
    merged_nodes = g2._nodes.merge(g3._nodes[[g._node, 'journey_community']], how='left', on=g._node)
    merged_nodes['journey_community'] = merged_nodes['journey_community'].fillna(-1)
    g4 = g2.nodes(merged_nodes)
    g4 = g4.nodes(g4._nodes.assign(pagerank=g4._nodes.pagerank.fillna(0.0)))
    return g4
@DataBoyTX Can we do a version sniff? We don't get to control what version of igraph regular pygraphistry users are on
if algorithm == 'ppr':
    if igraph.__version__ < xyz:
        ...
    else:
        ...
I don't think this was a major igraph version bump, and sounds recentish, so maintaining compatibility seems worth it
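One caveat with the sniff above: `igraph.__version__` is a string, so a plain `<` compares it character by character, and `'0.10.4' < '0.9.11'` comes out true. A minimal sketch of a numeric comparison using only the standard library follows. The helper names `version_tuple` and `at_least` are hypothetical, and the version numbers in the usage note are illustrative only, not the actual cutoff. Where the `packaging` library is available, `packaging.version.Version` is likely the more robust choice.

```python
import re


def version_tuple(version):
    """Parse the leading numeric components of a dotted version string."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(installed, minimum):
    """Return True if `installed` is numerically >= `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.10' and '0.10.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b
```

For example, `at_least('0.10.4', '0.9.11')` returns `True`, while the string comparison `'0.10.4' >= '0.9.11'` returns `False`.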
Edit: ignore this in favor of immediately following comment https://github.com/graphistry/pygraphistry/issues/554#issuecomment-2207181192
After internal discussion:
- this is a good time to add ppr and any other new igraph bindings
- we can give a cleaner error message if they do not exist in the user's currently installed igraph version: catch and rethrow the exn
- for legacy igraph users, whether on old igraph or using our old ppr form, we can reroute to the new ppr with a deprecation warning; if on old igraph, stay in the old form but still note the deprecation
Reviewing a bit more, I think we just need to:
- expose `personalized_pagerank` as part of the `compute_igraph` options: https://github.com/graphistry/pygraphistry/blame/53448d4ef153fd262466087a951bc28a44c8fadf/graphistry/plugins/igraph.py#L267
- add to the examples for consistency w/ compute_cugraph examples
- update graphistry + gak repos to latest pyg
Research:
- igraph never supported `pagerank(personalization=...)` where `personalization` is a vertex weights df
  - see blame: https://github.com/igraph/python-igraph/blame/9dd5a58e52617c6b727b7b4c1c26642752241e32/src/igraph/structural.py#L41
  - cugraph does: https://docs.rapids.ai/api/cugraph/stable/api_docs/api/cugraph/cugraph.pagerank/
  - ... which matches networkx: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.link_analysis.pagerank_alg.pagerank.html#pagerank
- igraph does support `personalized_pagerank(reset=...)`, which does what we want
  - sidenote: internally, igraph implements `pagerank(...)` as `personalized_pagerank(..., reset=None)`
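To make the gap between the two interfaces concrete: the cugraph/networkx style passes a vertex-to-weight mapping, while igraph's `reset` takes one weight per vertex in vertex order. A minimal sketch of that conversion is below. The helper name `personalization_to_reset` is hypothetical, and it assumes `node_ids` is listed in the same order as the igraph vertices. Starting from a DataFrame with `vertex` and `values` columns, the mapping could be built with `dict(zip(df['vertex'], df['values']))`.

```python
def personalization_to_reset(node_ids, personalization):
    """Expand a sparse {vertex: weight} mapping into a dense per-vertex list.

    Vertices missing from the mapping get weight 0.0.
    """
    unknown = set(personalization) - set(node_ids)
    if unknown:
        raise KeyError(f"personalization refers to unknown vertices: {sorted(unknown)}")
    return [float(personalization.get(node, 0.0)) for node in node_ids]
```

For example, `personalization_to_reset(['a', 'b', 'c'], {'b': 1.0})` returns `[0.0, 1.0, 0.0]`.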
Note: meanwhile, as a workaround, users may be able to do:
import pandas as pd
import graphistry

# graph with nodes and edges
df = pd.DataFrame({
    's': ['a', 'b', 'c', 'd', 'd'],
    'd': ['b', 'c', 'd', 'a', 'e']
})
g1 = graphistry.edges(df, 's', 'd').materialize_nodes()

# new graph where nodes have added column 'ppr'
g2 = g1.nodes(
    g1._nodes.assign(
        ppr=g1.to_igraph().personalized_pagerank(reset_vertices=['b']))
)

# ex
g2._nodes
#  id       ppr
#0  a  0.096360
#1  b  0.313812
#2  c  0.266740
#3  d  0.226729
#4  e  0.096360
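For intuition about what the reset vertices do, here is a pure-Python power iteration on the same five edges. This is a sketch, not igraph's implementation. It assumes a damping factor of 0.85 and that both the random restarts and the rank held by dangling nodes (here `e`) are redistributed according to the reset weights. Under those assumptions it reproduces the `ppr` column above to about five decimal places.

```python
def personalized_pagerank(edges, reset, damping=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration personalized PageRank over a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out = {node: [] for node in nodes}
    for src, dst in edges:
        out[src].append(dst)

    # Normalize the reset weights into a probability distribution.
    total = sum(reset.get(node, 0.0) for node in nodes)
    restart = {node: reset.get(node, 0.0) / total for node in nodes}

    rank = dict(restart)
    for _ in range(max_iter):
        # Rank sitting on nodes with no out-edges is sent back via the reset weights.
        dangling = sum(rank[node] for node in nodes if not out[node])
        nxt = {node: (1.0 - damping + damping * dangling) * restart[node]
               for node in nodes}
        # Every other node splits its damped rank evenly across its out-edges.
        for src in nodes:
            if out[src]:
                share = damping * rank[src] / len(out[src])
                for dst in out[src]:
                    nxt[dst] += share
        if sum(abs(nxt[node] - rank[node]) for node in nodes) < tol:
            return nxt
        rank = nxt
    return rank


edges = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a'), ('d', 'e')]
scores = personalized_pagerank(edges, reset={'b': 1.0})
```

Most of the rank stays on `b` and the nodes just downstream of it, while `a` and `e` tie because each receives half of `d`'s outgoing rank.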