Getting non-overlapping sets under the same cluster

Open ale0xb opened this issue 4 years ago • 0 comments

Hello,

I've been playing around with HierarchicalSets and some toy data (see attachment).

The data is loaded from a JSON file and fed into format_sets() like this:

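library(rjson)             # fromJSON(file = ...) matches the rjson signature (assumed)
library(hierarchicalSets)  # format_sets(), create_hierarchy()

data <- fromJSON(file = "data.json")

# All distinct elements across every set
universe <- unique(unlist(data))

# One logical membership vector per set
x <- lapply(data, function(set) {
  universe %in% set
})

setNames <- names(x)

# Assumed missing step: `sets` was never defined in the original snippet.
# Bind the membership vectors into an element-by-set logical matrix.
sets <- do.call(cbind, x)

colnames(sets) <- setNames
rownames(sets) <- universe

formatted_set <- format_sets(sets)
dataSet <- create_hierarchy(formatted_set)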

Then I check the cluster results with: cluster_members(keySet)

Looking at the output, I see that the algorithm clusters together "el-1272" and "el-2132". However, they do not have any elements in common.

Is this expected behavior? If I understood correctly, this should not happen as the homogeneity between these two sets will always be 0.
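To double-check the claim that the two sets are disjoint, a quick pairwise overlap count can be computed in base R. This is only a sketch: the toy `data` list and the set names `set-a`, `set-b`, `set-c` below are made up for illustration and are not the contents of data.json.

```r
# Hypothetical toy data standing in for data.json (real contents not shown here)
data <- list(
  "set-a" = c("x1", "x2", "x3"),
  "set-b" = c("x4", "x5"),
  "set-c" = c("x2", "x4")
)

# Build the same element-by-set membership matrix as in the snippet above
universe <- unique(unlist(data))
sets <- sapply(data, function(set) universe %in% set)
rownames(sets) <- universe

# Convert logical to integer, then count shared elements for every pair of sets:
# entry [i, j] is the size of the intersection of set i and set j
overlap <- crossprod(sets * 1L)

print(overlap)
```

With the real matrix, `overlap["el-1272", "el-2132"]` should come out as 0 if the two sets really share no elements, which would confirm that the grouping is not explained by a direct intersection.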

Can anyone help me find out what's going on? Thanks

data.json.zip

ale0xb, Jun 17 '20 17:06