[BUG]: cugraph.force_atlas2's pos_list for initial vertex positions

Open JOHW85 opened this issue 1 year ago • 0 comments

How does pos_list work for Force Atlas 2?

I'm trying to specify the initial positions of the nodes for Force Atlas 2 based on their community. I've already run a community detection algorithm to get the cluster of each node. However, passing the resulting positions as pos_list raises an error.

I've replicated the error message with the following code. Here, three clusters are randomly assigned to the 21 nodes (0 to 20) of a toy graph. The nodes of each cluster are laid out on a small grid, and each cluster's grid is shifted along both x and y by the cluster id times the cluster size.

# Create toy graph
import cudf
import cugraph
import networkx as nx
import pandas as pd

# Build the edge list as a pandas DataFrame
gdf = pd.DataFrame()

# Add some edges
gdf['src'] = [0, 0, 0, 1, 1, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]
gdf['dst'] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
# Series.append was removed in pandas 2.0, so concatenate the two columns instead
nodes = pd.DataFrame(pd.concat([gdf['src'], gdf['dst']]).unique(), columns=['vertex'])

# Create a NetworkX directed graph from the edge list
G = nx.DiGraph()
G.add_edges_from(gdf.values)

print("G has", G.number_of_nodes(), "nodes and", G.number_of_edges(), "edges")  

# Create pos_list dataframe with the x and y coordinates of the nodes that are clustered together based on the Cluster column value
pos_list = pd.DataFrame(columns=['x', 'y'], index=nodes.index)
# Make three clusters. Randomly assign the nodes to the clusters
import random
nodes['cluster'] = pd.Series([random.randint(0, 2) for i in range(len(nodes))] , index=nodes.index)

# Iterate over the clusters and assign the initial positions to the nodes
import math
from itertools import chain
import numpy as np

for cluster in sorted(nodes['cluster'].unique()):
    # Get the nodes in the cluster (copy so the assignments below don't write to a slice of `nodes`)
    cluster_nodes = nodes[nodes['cluster'] == cluster].copy()
    # Calculate the number of nodes in the cluster
    num_nodes = len(cluster_nodes)
    # Calculate the number of rows and columns for the grid
    num_rows = int(math.sqrt(num_nodes))
    num_cols = math.ceil(num_nodes / num_rows)
    # Create a grid of positions for the nodes
    grid = np.array(list(chain.from_iterable([(i, j) for j in range(num_cols)] for i in range(num_rows))))
    # Assign the grid positions to the nodes
    cluster_nodes.loc[:, 'x'] = grid[:num_nodes, 0] + cluster*len(cluster_nodes)
    cluster_nodes.loc[:, 'y'] = grid[:num_nodes, 1] + cluster*len(cluster_nodes)

    # Update the pos_list with the initial positions of the nodes
    for i in cluster_nodes.index:
        pos_list.loc[cluster_nodes['vertex'][i],'x'] = float(cluster_nodes.loc[i, 'x'])
        pos_list.loc[cluster_nodes['vertex'][i],'y'] = float(cluster_nodes.loc[i, 'y'])

# Add a 'vertex' column from the index of the pos_list dataframe. The documentation doesn't mention that 'vertex' is needed, but without it the call fails with KeyError: 'vertex'
pos_list['vertex'] = pos_list.index

# Convert gdf to cudf
gdf = cudf.DataFrame(gdf)

# Build a cugraph graph from the cudf edge list
cugraph_graph = cugraph.Graph()
cugraph_graph.from_cudf_edgelist(
    gdf,
    source="src",
    destination="dst",
)

ITERATIONS=500
THETA=1.0
OPTIMIZE=True
pos_gdf = cugraph.layout.force_atlas2(cugraph_graph,
                                      max_iter=ITERATIONS,
                                      pos_list=pos_list,
                                      outbound_attraction_distribution=True,
                                      lin_log_mode=True,
                                      edge_weight_influence=1.0,
                                      jitter_tolerance=1.0,
                                      barnes_hut_optimize=OPTIMIZE,
                                      barnes_hut_theta=THETA,
                                      scaling_ratio=2.0,
                                      strong_gravity_mode=True,
                                      prevent_overlapping=False,
                                      gravity=1.0,
                                      verbose=False,
                                      callback=None)

I'm getting the error AttributeError: 'Series' object has no attribute '__cuda_array_interface__' at force_atlas2_wrapper.pyx:97. Setting pos_list=None (like all the examples out there) works as expected.
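
One possible reading of the error, not confirmed anywhere in this thread: the reproduction passes a pandas DataFrame as pos_list, and a pandas Series lives in host memory, so it does not expose `__cuda_array_interface__`. The sketch below rebuilds the same per-cluster grid layout in plain pandas and checks for that attribute. The helper name `grid_positions` and the deterministic cluster assignment are illustrative assumptions, not part of cuGraph or of the original code.

```python
import math

import numpy as np
import pandas as pd


def grid_positions(nodes: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return 'vertex', 'x', 'y' with one row per vertex.

    `nodes` needs 'vertex' and 'cluster' columns. Each cluster is laid out on
    its own row-major grid, shifted by cluster id times cluster size.
    """
    frames = []
    for cluster, members in nodes.groupby("cluster", sort=True):
        n = len(members)
        num_rows = max(1, int(math.sqrt(n)))
        num_cols = math.ceil(n / num_rows)
        # Row-major grid coordinates for the first n cells.
        rows, cols = np.divmod(np.arange(n), num_cols)
        offset = cluster * n
        frames.append(pd.DataFrame({
            "vertex": members["vertex"].to_numpy(),
            "x": (rows + offset).astype(np.float32),
            "y": (cols + offset).astype(np.float32),
        }))
    positions = pd.concat(frames, ignore_index=True)
    return positions.sort_values("vertex", ignore_index=True)


# Deterministic stand-in for the random cluster assignment in the reproduction.
nodes = pd.DataFrame({"vertex": range(21), "cluster": [v % 3 for v in range(21)]})
pos_pdf = grid_positions(nodes)

# Host-side pandas data has no CUDA array interface.
has_cuda_interface = hasattr(pos_pdf["x"], "__cuda_array_interface__")
print(pos_pdf.head())
print("exposes __cuda_array_interface__:", has_cuda_interface)
```

If that reading is right, converting the frame with `cudf.from_pandas(pos_pdf)` before passing it as pos_list would be the thing to try. Whether force_atlas2 then accepts it, and whether the 'vertex' column is required in every case, is an assumption that has not been verified here.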

Code of Conduct

  • [X] I agree to follow cuGraph's Code of Conduct
  • [X] I have searched the open issues and have found no duplicates for this question

JOHW85, Feb 08 '24 07:02