document_cluster
document_cluster copied to clipboard
I have clustered crime articles based on crime category. But now I want to see which article belongs to which category ?
Please give some hints or direction to perform that . I have followed your code. here I am attaching my data sheet. classification.xlsx
Could you just try this as demonstrated in the notebook? Did you run into errors?
from __future__ import print_function
print("Top terms per cluster:")
print()
#sort cluster centers by proximity to centroid
order_centroids = km.cluster_centers_.argsort()[:, ::-1]
for i in range(num_clusters):
print("Cluster %d words:" % i, end='')
for ind in order_centroids[i, :6]: #replace 6 with n words per cluster
print(' %s' % vocab_frame.ix[terms[ind].split(' ')].values.tolist()[0][0].encode('utf-8', 'ignore'), end=',')
print() #add whitespace
print() #add whitespace
print("Cluster %d titles:" % i, end='')
for title in frame.ix[i]['title'].values.tolist():
print(' %s,' % title, end='')
print() #add whitespace
print() #add whitespace
print()
print()