Select clusters containing more than a certain number of segments

Open gabrielparriaux opened this issue 1 year ago • 4 comments

Hi @juba,

After running a Rainette clustering, I often run a correspondence analysis on the lexicon and clusters.

In that case, very small clusters tend to pull the plot to the extremes, making it difficult to read.

So I’m looking for a way to select and isolate the clusters that contain a very small number of segments.

In other words, I need to build a vector with the names of the clusters that contain fewer than a given number of segments.

I have looked at the available documentation but have no idea how to do it.

Can you help and point me in the right direction?

Thanks a lot for your help!

Gabriel

gabrielparriaux, Feb 03 '24 09:02

This is not directly related to rainette. You have to compute the size of the clusters and filter out the smaller ones. Something like:

tab <- table(clusters)
names(tab)[tab > min_size]

juba, Feb 04 '24 15:02
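
A minimal sketch of the suggestion above on toy data, assuming `clusters` is a vector with one cluster assignment per segment and that unassigned segments are `NA`. The data, the threshold `min_size`, and the names `large_clusters` and `small_clusters` are hypothetical and only serve as illustration.

```r
# Toy cluster assignments, one entry per segment (hypothetical data)
clusters <- c(1, 1, 1, 1, 2, 2, 2, 3, 4, 4, 4, 4, 4, NA)
min_size <- 2  # hypothetical threshold

# Count segments per cluster; table() ignores NA by default
tab <- table(clusters)

# Names of clusters with more than min_size segments (the ones to keep)
large_clusters <- names(tab)[tab > min_size]

# Names of clusters with at most min_size segments (the ones to isolate)
small_clusters <- names(tab)[tab <= min_size]
```

Flipping the comparison gives the vector of small cluster names asked for in the original question, rather than the clusters that remain after filtering.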

Thanks a lot for your help and sorry to ask a question not directly related to rainette… 😰

Just a question: is there an object “clusters” that I can use to compute the size of each cluster? I can’t find it in the docs.

My idea was to compute the size of each cluster with something like this:

clusters <- clusters_by_doc_table(dtm_for_analysis, clust_var = "Cluster")
sum(clusters$clust_1)
…

But then I would need a loop to do it for each cluster in the clustering… and I think there may be something simpler?

Sorry if it’s an obvious question…

gabrielparriaux, Feb 06 '24 11:02
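
If the per-document table is used anyway, the loop can probably be avoided by summing all cluster columns at once. This is a sketch only: `by_doc` is a hypothetical toy data frame that mimics the `clust_1`, `clust_2`, ... column layout implied by the snippet above, not real output from `clusters_by_doc_table()`.

```r
# Hypothetical stand-in for a per-document cluster count table
by_doc <- data.frame(
  doc_id  = c("doc1", "doc2", "doc3"),
  clust_1 = c(3, 0, 2),
  clust_2 = c(1, 4, 0),
  clust_3 = c(0, 1, 0)
)

# Select the cluster columns by name (assumes a "clust_" prefix)
clust_cols <- grep("^clust_", names(by_doc), value = TRUE)

# Sum each cluster column over all documents in one call
sizes <- colSums(by_doc[, clust_cols, drop = FALSE])
```

`sizes` is a named numeric vector, so it can be filtered by size in the same way as the result of `table()`.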

If you're looking for the size of each cluster in terms of number of segments, then doing the following should be enough:

clusters <- cutree(res, k = 5)
table(clusters)

Or in your example:

table(dtm_for_analysis$Cluster)

juba, Feb 06 '24 14:02
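
To connect this back to the original goal, here is a hedged sketch of dropping the segments of small clusters before a further analysis. It uses a plain data frame `segments` with a `Cluster` column as a hypothetical stand-in for the document variables of `dtm_for_analysis`; the threshold `min_size` is also an assumption.

```r
# Hypothetical stand-in for document variables holding a Cluster column
segments <- data.frame(
  segment_id = 1:10,
  Cluster    = c(1, 1, 1, 2, 2, 2, 2, 3, NA, 1)
)
min_size <- 3  # hypothetical threshold

# Size of each cluster, then the names of the clusters below the threshold
sizes <- table(segments$Cluster)
small <- names(sizes)[sizes < min_size]

# Keep only segments that have a cluster and whose cluster is not small
kept <- segments[!is.na(segments$Cluster) & !(segments$Cluster %in% small), ]
table(kept$Cluster)
```

With a quanteda dfm, the same logical condition could presumably be passed to `dfm_subset()` instead of indexing a data frame.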

Much easier like this 😬. Thanks a lot for helping, this is exactly what I needed!

gabrielparriaux, Feb 07 '24 09:02