rapids-single-cell-examples
Leiden clustering takes too long
Hello @rilango @cjnolet @mlgill @raydouglass @pentschev,
I am trying to run Leiden clustering on ~40 million cells. During the run, the GPU appears idle (judging by its power usage) and the clustering takes forever to complete. The time is spent in the call chain below. How can I resolve this?
rsc.tl.leiden(adatafilt)
->
def leiden (args):
->
g = _create_graph(adjacency, use_weights)
->
`def _create_graph(adjacency, use_weights=True):
    from cugraph import Graph

    sources, targets = adjacency.nonzero()
    weights = adjacency[sources, targets]
    if isinstance(weights, np.matrix):
        weights = weights.A1
    df = cudf.DataFrame({"source": sources, "destination": targets, "weights": weights})
    g = Graph()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        if use_weights:
            g.from_cudf_edgelist(
                df, source="source", destination="destination", weight="weights"
            )
        else:
            g.from_cudf_edgelist(df, source="source", destination="destination")
    return g`
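For a sense of scale, the size of the edge list that `_create_graph` builds can be estimated from the cell count. This is only a rough sketch: the helper `estimate_edge_list` is hypothetical, the neighbor count of 15 is an assumption (I have not stated which value was used for the neighbors step), and 12 bytes per edge assumes 32-bit source, destination, and weight columns.

```python
def estimate_edge_list(n_cells, n_neighbors=15, bytes_per_edge=12, symmetric=True):
    """Rough upper bound on the size of a kNN edge list.

    All defaults are assumptions for illustration:
    - n_neighbors: neighbors kept per cell
    - bytes_per_edge: 4-byte source + 4-byte destination + 4-byte weight
    - symmetric: count every edge in both directions (upper bound)
    """
    edges = n_cells * n_neighbors
    if symmetric:
        edges *= 2
    gib = edges * bytes_per_edge / 2**30
    return edges, gib


edges, gib = estimate_edge_list(40_000_000)
print(f"{edges:,} edges, about {gib:.1f} GiB")
```

Under these assumptions the edge list has on the order of a billion rows, so any per-edge work done on the CPU before the graph reaches the GPU could plausibly dominate the runtime.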
The line below is the one that takes forever to execute:
g.from_cudf_edgelist(df, source="source", destination="destination", weight="weights")
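To confirm which stage is actually slow, each step of `_create_graph` could be timed separately. Below is a minimal sketch using only the standard library; the `timed` helper and the stage names are hypothetical, and the two workloads are stand-ins, not the real `adjacency.nonzero()` or `from_cudf_edgelist` calls.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> elapsed seconds


@contextmanager
def timed(stage):
    """Record the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Stand-in for extracting (source, destination) pairs from the adjacency matrix.
with timed("extract_edges"):
    edges = [(i, i + 1) for i in range(1000)]

# Stand-in for building the graph from the edge list.
with timed("build_graph"):
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.6f} s")
```

Wrapping the real calls the same way (index extraction, weight lookup, DataFrame construction, graph construction) would show whether the time is spent on the CPU side before the GPU is ever used, which would match the idle GPU I am seeing.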