pygraphistry
Dev/dendrogram
adds get_dendrogram_edges and writes tests for plugins/compute/cluster
Doesn't cuml/cugraph have algos like agglomerative & hierarchical clustering? I had users before where we used that and I think we used cugraph to hit their scale
So I guess more specifically:
- For bigger graphs, can we have a GPU mode?
- inputs: I think in that case we wanted to control community count or iterations, I'm sure the algos show other typical ones
- for output, can we include something like per-level community labels?
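To make the "per-level community labels" ask concrete, here is a minimal CPU-only sketch of the output shape I have in mind. It is not the cuml/cugraph API: the function name `labels_per_level` and the merge format (pairs of leaf ids, applied in order) are assumptions made up for illustration.

```python
def labels_per_level(n_leaves, merges, levels):
    """Return {level: labels}, where labels[i] is leaf i's community
    after the first `level` merges have been applied.

    `merges` is a hypothetical format: (a, b) means "merge the cluster
    containing leaf a with the cluster containing leaf b".
    """
    parent = list(range(n_leaves))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    wanted = set(levels)
    out = {}
    if 0 in wanted:
        out[0] = [find(i) for i in range(n_leaves)]
    for step, (a, b) in enumerate(merges, start=1):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Use the smaller root id as the community label.
            parent[max(ra, rb)] = min(ra, rb)
        if step in wanted:
            out[step] = [find(i) for i in range(n_leaves)]
    return out


merges = [(0, 1), (2, 3), (0, 2), (4, 0)]
labels = labels_per_level(5, merges, levels=[0, 2, 4])
```

Each entry of `labels` could then be attached to the nodes table as one column per level, which is the kind of enrichment downstream calls can consume.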
I can see a GPU version -- but thinking about scale -- these types of graphs are likely on the order of 10^6 or smaller (i.e., they divide big graphs into lobes of smaller ones).
- For output I'd like more: for example, the Dendrogram function (not used) outputs colors etc., but its representation is harder to make a g plot from.
- It would be useful to fold in some data about the previous graph (nodes).
- Likewise, is there a way to include edge info from the previous graph?
I recall 10M graphs being slow with other methods; if you're finding it fast, then we can pass. It's just that, given we generally support cudf + cugraph out-of-the-box and they have hierarchical methods, it's strange to skip them here.
My comments wrt output are more about what's minimally useful for consumers. As long as we expose hierarchy as enriching attribs, then downstream calls can do what you wrote, afaict.
The past use was basically "Do a hierarchy to get divisions of X size / Y deep", which gave an "infinitely" zoomable graph. Think tiered network-of-network, where you can click into different tiers. By enriching each node/edge with hierarchy info, we were able to compute the network-of-network for different tier levels.
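A rough sketch of that network-of-network step, assuming each node already carries one community label per tier. The helper `collapse_to_tier`, the `level_1`/`level_2` keys, and the toy data are all hypothetical, and this is plain Python rather than anything in pygraphistry:

```python
from collections import Counter


def collapse_to_tier(edges, node_labels, level):
    """Collapse node-level edges into community-level edges at `level`.

    Returns {(src_community, dst_community): count of original edges}.
    Edges whose endpoints share a community are dropped.
    """
    weights = Counter()
    for src, dst in edges:
        cs = node_labels[src][level]
        cd = node_labels[dst][level]
        if cs != cd:
            weights[(cs, cd)] += 1
    return dict(weights)


# Hypothetical per-node hierarchy attributes (one label per tier).
node_labels = {
    "a": {"level_1": "c0", "level_2": "g0"},
    "b": {"level_1": "c0", "level_2": "g0"},
    "c": {"level_1": "c1", "level_2": "g0"},
    "d": {"level_1": "c1", "level_2": "g0"},
    "e": {"level_1": "c2", "level_2": "g1"},
}
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e")]

tier_1 = collapse_to_tier(edges, node_labels, "level_1")
tier_2 = collapse_to_tier(edges, node_labels, "level_2")
```

With dataframes the same idea would be a join of the labels onto the edge endpoints followed by a group-by on the label pair, so the only thing the clustering call needs to expose is the per-level labels.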