algorithms.leiden() breaks when the input networkx graph's nodes are neither strings nor ints

Open • shashank025 opened this issue 1 year ago • 1 comment

Describe the bug

Similar to issue #241, and with the same root cause: the existing code only handles int and string nodes, but networkx nodes can be any hashable Python object.

To Reproduce

Steps to reproduce the behavior:

  • CDlib version: 0.4.0
  • Operating System: macOS Sonoma 14.5 (23F79)
  • Python version: 3.12
  • Version(s) of CDlib required libraries: NA

Script (test.py) to repro the issue:

import networkx as nx
import igraph as ig

from cdlib import algorithms

class Node:
  def __init__(self, id, type):
    self.id = id
    self.type = type

  def __hash__(self):
    return hash(self.id)

  def __eq__(self, other):
    return self.id == other.id

john = Node('John Travolta', 'actor')
nick = Node('Nick Cage', 'actor')
face_off = Node('Face Off', 'movie')

G = nx.Graph()
G.add_node(john)
G.add_node(nick)
G.add_edge(john, face_off, label='ACTED_IN')
G.add_edge(nick, face_off, label='ACTED_IN')

clusters = algorithms.leiden(G)

This fails with:

Traceback (most recent call last):
  File "/Users/shashankr/projects/cdlib/test.py", line 28, in <module>
    clusters = algorithms.leiden(G)
               ^^^^^^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/algorithms/crisp_partition.py", line 632, in leiden
    return NodeClustering(
           ^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/classes/node_clustering.py", line 31, in __init__
    super().__init__(communities, graph, method_name, method_parameters, overlap)
  File "/Users/shashankr/projects/cdlib/cdlib/classes/clustering.py", line 42, in __init__
    communities = self.__convert_back_to_original_nodes_names_if_needed(communities)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/classes/clustering.py", line 21, in __convert_back_to_original_nodes_names_if_needed
    to_return.append([int(x[1:]) for x in com])
                      ^^^^^^^^^^
ValueError: invalid literal for int() with base 10: '<__main__.Node object at 0x100cdf5c0>'
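
Reading only the last frame of the traceback, the conversion step appears to drop the first character of each stored node name and pass the rest to int(). The sketch below reproduces that failure mode in isolation. The "_" prefix and the restore_name helper are hypothetical stand-ins, not cdlib's actual code; only the int(x[1:]) expression is taken from the traceback.

```python
class Node:
    """Minimal hashable node, like the one in the repro script."""
    def __init__(self, id):
        self.id = id

    def __hash__(self):
        return hash(self.id)


def restore_name(prefixed):
    # Mirrors the expression in the traceback: int(x[1:]).
    return int(prefixed[1:])


# Hypothetical one-character prefix; the real prefix is not shown in the issue.
prefixed_int = "_" + str(7)
prefixed_obj = "_" + str(Node("John Travolta"))

# An int node name survives the round trip through str().
restored = restore_name(prefixed_int)

# A custom object stringifies to its default repr, which int() cannot parse.
try:
    restore_name(prefixed_obj)
    failed = False
except ValueError:
    failed = True
```

With an int node the round trip succeeds; with a custom object the default repr ("<__main__.Node object at 0x...>") is what reaches int(), which matches the ValueError above.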

Expected behavior

Running the script above (python test.py) should succeed and return the clusters instead of raising an exception.

Screenshots

NA

Additional context

I'm hoping the fix for this is similar to the fix for #241.
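
Until a fix lands, one possible workaround is to relabel the nodes to ints before clustering and map the result back afterwards. This is a minimal sketch, not a cdlib feature: build_label_maps and restore_communities are hypothetical helper names, and the commented usage assumes the returned clustering exposes its communities as a list of lists of the int labels.

```python
def build_label_maps(nodes):
    """Assign each distinct hashable node an int label; return both directions."""
    to_int = {node: i for i, node in enumerate(nodes)}
    to_node = {i: node for node, i in to_int.items()}
    return to_int, to_node


def restore_communities(communities, to_node):
    """Translate communities of int labels back to the original node objects."""
    return [[to_node[i] for i in com] for com in communities]


# Intended usage with networkx and cdlib (not executed in this sketch):
#   to_int, to_node = build_label_maps(G.nodes())
#   H = nx.relabel_nodes(G, to_int)      # returns a relabeled copy of G
#   clusters = algorithms.leiden(H)      # int nodes are a handled case
#   original = restore_communities(clusters.communities, to_node)

# Stand-in demo with plain tuples in place of Node objects and a
# made-up partition in place of a real leiden() result.
nodes = [("John Travolta", "actor"), ("Nick Cage", "actor"), ("Face Off", "movie")]
to_int, to_node = build_label_maps(nodes)
fake_communities = [[0, 2], [1]]
restored = restore_communities(fake_communities, to_node)
```

The mapping lives outside the graph, so the original Node objects are never converted to strings and come back unchanged.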

shashank025 • Jun 01 '24 03:06

See answer to #241

GiulioRossetti • Jun 01 '24 11:06