markov_clustering
ValueError: shape mismatch in assignment.
I just tried to run the example code given in the GitHub README:
import markov_clustering as mc
import networkx as nx
import random
# number of nodes to use
numnodes = 200
# generate random positions as a dictionary where the key is the node id and the value
# is a tuple containing 2D coordinates
positions = {i:(random.random() * 2 - 1, random.random() * 2 - 1) for i in range(numnodes)}
# use networkx to generate the graph
network = nx.random_geometric_graph(numnodes, 0.3, pos=positions)
# then get the adjacency matrix (in sparse form)
matrix = nx.to_scipy_sparse_matrix(network)
result = mc.run_mcl(matrix)
Gives the following error:
Traceback (most recent call last):
  File "/home/baqir/code/email-sentiment-analysis/algorithms/markov_clustering_visual.py", line 22, in <module>
    result = mc.run_mcl(matrix)
  File "/home/baqir/code/email-sentiment-analysis/env/lib/python3.6/site-packages/markov_clustering/mcl.py", line 233, in run_mcl
    matrix = prune(matrix, pruning_threshold)
  File "/home/baqir/code/email-sentiment-analysis/env/lib/python3.6/site-packages/markov_clustering/mcl.py", line 93, in prune
    pruned[matrix >= threshold] = matrix[matrix >= threshold]
  File "/home/baqir/code/email-sentiment-analysis/env/lib/python3.6/site-packages/scipy/sparse/_index.py", line 109, in __setitem__
    raise ValueError("shape mismatch in assignment")
ValueError: shape mismatch in assignment
I do not understand whether this is a scipy error or a markov-clustering error, since the arguments being passed appear to be valid.
Getting the same error here. From inspecting the code, it seems to be related to scipy. At pruning time, both matrices involved (matrix and pruned) are correctly identified as scipy sparse matrices (isspmatrix returns True). But once execution reaches the assignment line and enters scipy, I guess the matrix on the right-hand side (called x in the __setitem__ method) is no longer recognized as a sparse matrix. From there on it gets treated as a numpy array, and the shape mismatch happens.
That is what I understand at least.
I only started getting this error after updating scipy to version 1.3 (1.2 was the version available at the time of the latest markov-clustering release), so downgrading to 1.2 seems to have done the trick for me (conda install scipy=1.2).
I haven't tested it much yet, but after a couple of runs the error has not come up again.
Hope it helps.
Creating a parallel DOK matrix fixes the issue, but makes the algorithm slower. In mcl.py, replace prune with:
`def prune(matrix, threshold):
    """
    Prune the matrix so that very small edges are removed.
    The maximum value in each column is never pruned.

    :param matrix: The matrix to be pruned
    :param threshold: The value below which edges will be removed
    :returns: The pruned matrix
    """
    dok_m = matrix.todok(copy=False)  # INTRODUCED BY ME TO FIX BUG WITH SCIPY>=1.3 -- DOK ALLOWS ASSIGNMENT
    if isspmatrix(matrix):
        pruned = dok_matrix(matrix.shape)
        pruned[matrix >= threshold] = dok_m[dok_m >= threshold]  # DOK ALLOWS ASSIGNMENT
        pruned = pruned.tocsc()
    else:
        pruned = matrix.copy()
        pruned[pruned < threshold] = 0

    # keep max value in each column. same behaviour for dense/sparse
    num_cols = matrix.shape[1]
    row_indices = matrix.argmax(axis=0).reshape((num_cols,))  # NEED CSC OR CSR FOR ARGMAX
    col_indices = np.arange(num_cols)
    pruned[row_indices, col_indices] = dok_m[row_indices, col_indices]  # DOK ALLOWS ASSIGNMENT

    return pruned`
Unfortunately argmax() isn't implemented for dok matrices, so both copies need to be kept.
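If keeping two copies is a concern, another option might be to avoid boolean-mask assignment on sparse matrices altogether and filter the stored entries directly. The following is only a rough sketch and not the library's implementation: the function name `prune_without_mask_assignment` is made up for illustration, it only covers the sparse case, it assumes non-negative entries (as in a Markov matrix), and unlike the original it keeps every entry tied for the column maximum rather than just the first one. It has not been tested inside markov_clustering itself.

```python
import numpy as np
from scipy.sparse import csc_matrix


def prune_without_mask_assignment(matrix, threshold):
    """Hypothetical sparse-only pruning that never assigns through a mask.

    Keeps every stored entry that is >= threshold, plus the largest
    entry (or entries, if tied) of each column.
    """
    # Work on a COO copy so the caller's matrix is left untouched.
    coo = matrix.tocoo(copy=True)
    coo.sum_duplicates()

    # Per-column maximum of the stored values (assumes non-negative data).
    col_max = np.zeros(matrix.shape[1])
    np.maximum.at(col_max, coo.col, coo.data)

    # An entry survives if it is large enough or is its column's maximum.
    keep = (coo.data >= threshold) | (coo.data == col_max[coo.col])

    pruned = csc_matrix(
        (coo.data[keep], (coo.row[keep], coo.col[keep])),
        shape=matrix.shape,
    )
    pruned.eliminate_zeros()  # drop any explicitly stored zeros
    return pruned


# Small usage example: column 1 has no entry above the threshold,
# so only its maximum survives.
example = csc_matrix(np.array([[0.5, 0.0005], [0.0001, 0.0002]]))
pruned_example = prune_without_mask_assignment(example, 0.001)
print(pruned_example.toarray())
```

Because the result is built in one go from the filtered COO triplets, no DOK copy and no item assignment are needed, which may also sidestep the slowdown mentioned above.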
For anyone still having trouble with this issue: upgrading scipy beyond 1.3.x seems to have fixed it for me. It works on 1.4.1 and 1.5.1, whereas it failed on 1.3.x.
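To check quickly whether an environment is on the affected series, something like the sketch below could be run before calling run_mcl. The helper names are hypothetical, and the "1.3.x fails, 1.2 and 1.4+ work" rule is based only on the reports in this thread, not on anything confirmed by the scipy or markov_clustering maintainers.

```python
import scipy


def major_minor(version_string):
    """Return the (major, minor) part of a version string such as '1.3.0'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def reported_as_failing(version_string):
    """True for the scipy 1.3.x series, which this thread reports as failing."""
    return major_minor(version_string) == (1, 3)


if reported_as_failing(scipy.__version__):
    print("scipy", scipy.__version__, "is in the 1.3.x series reported as failing here")
else:
    print("scipy", scipy.__version__, "is not in the 1.3.x series")
```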