rapids-single-cell-examples: Leiden clustering takes too long

Open siddharthamantrala opened this issue 6 months ago • 0 comments

Hello @rilango @cjnolet @mlgill @raydouglass @pentschev,

I am trying to run Leiden clustering on ~40 M cells. During the run, the GPU appears idle in terms of power usage, and the Leiden clustering seems to take forever. Most of the time is spent in the code below. How can I resolve this?
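For a sense of scale, the size of the edge list that gets built for the graph can be estimated with simple arithmetic. The sketch below is only a rough estimate under stated assumptions: it assumes a k-nearest-neighbor graph with 15 neighbors per cell (the neighbor setting is not stated in this issue), 4-byte indices, and 4-byte weights. The function name `estimate_edge_list_bytes` is hypothetical, and the real numbers depend on the neighbor graph actually stored in the AnnData object.

```python
def estimate_edge_list_bytes(n_cells, n_neighbors=15, index_bytes=4, weight_bytes=4):
    """Rough size of a (source, destination, weight) edge list in bytes.

    Assumes one edge per (cell, neighbor) pair; a symmetrized graph
    may hold up to about twice as many edges.
    """
    n_edges = n_cells * n_neighbors
    bytes_per_edge = 2 * index_bytes + weight_bytes  # source + destination + weight
    return n_edges * bytes_per_edge


n_cells = 40_000_000  # ~40 M cells, as in this issue
size_bytes = estimate_edge_list_bytes(n_cells)
print(f"approx. edge list size: {size_bytes / 1e9:.1f} GB")
```

With 8-byte indices or weights the estimate grows accordingly, for example by passing `index_bytes=8`.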

The call chain is `rsc.tl.leiden(adatafilt)` -> `leiden(args)` -> `g = _create_graph(adjacency, use_weights)`, where `_create_graph` is defined as:

```python
import warnings

import cudf
import numpy as np


def _create_graph(adjacency, use_weights=True):
    from cugraph import Graph

    # Extract the edge list (row/column indices of non-zero entries).
    sources, targets = adjacency.nonzero()
    # Look up the weight of every edge in the sparse matrix.
    weights = adjacency[sources, targets]
    if isinstance(weights, np.matrix):
        weights = weights.A1  # flatten np.matrix to a 1-D array
    df = cudf.DataFrame({"source": sources, "destination": targets, "weights": weights})
    g = Graph()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        if use_weights:
            g.from_cudf_edgelist(
                df, source="source", destination="destination", weight="weights"
            )
        else:
            g.from_cudf_edgelist(df, source="source", destination="destination")
    return g
```

The following line takes forever to execute: `g.from_cudf_edgelist(df, source="source", destination="destination", weight="weights")`
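To confirm which step is actually slow, each stage of `_create_graph` can be timed separately. The snippet below is a minimal sketch using only the Python standard library. The `timed` helper and the `timings` dictionary are hypothetical names introduced here for illustration, and the `sum(range(...))` call is just a stand-in so the snippet runs without a GPU.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> elapsed seconds


@contextmanager
def timed(label, store=timings):
    """Record and print the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[label] = time.perf_counter() - start
        print(f"{label}: {store[label]:.3f} s")


# Stand-in workload; replace with one real step per `with` block.
with timed("stand-in step"):
    sum(range(100_000))
```

In the real function, one would wrap `adjacency.nonzero()`, `adjacency[sources, targets]`, the `cudf.DataFrame(...)` construction, and `g.from_cudf_edgelist(...)` each in its own `with timed(...)` block. GPU libraries may run some work asynchronously, so the numbers should be treated as approximate.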

Aug 22 '24 16:08 siddharthamantrala
