hierarchicalSets
Getting non-overlapping sets under the same cluster
Hello,
I've been playing around with HierarchicalSets and some toy data (see attachment).
The data is loaded from a JSON file and passed to format_sets() like this:
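library(rjson)             # assumed: rjson, whose fromJSON() takes a file argument
library(hierarchicalSets)
data <- fromJSON(file = "data.json")
universe <- unique(unlist(data))
# One logical membership vector per set, in universe order
x <- lapply(data, function(set) {
  universe %in% set
})
# Bind the vectors into an element-by-set logical matrix
sets <- do.call(cbind, x)
colnames(sets) <- names(x)
rownames(sets) <- universe
formatted_set <- format_sets(sets)
dataSet <- create_hierarchy(formatted_set)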
Then I check the cluster results with:
cluster_members(keySet)
Looking at the output, I see that the algorithm clusters together "el-1272" and "el-2132". However, they do not have any elements in common.
Is this expected behavior? If I understood correctly, this should not happen as the homogeneity between these two sets will always be 0.
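For reference, this is roughly how the overlap can be checked independently of the package. It is only a sketch in base R: the element values below are hypothetical stand-ins for the attached data.json, and only the two set names come from my actual data.

```r
# Hypothetical toy sets standing in for the attached data.json
data <- list(
  "el-1272" = c("a", "b", "c"),
  "el-2132" = c("d", "e"),
  "el-0001" = c("a", "d")
)

# Elements shared by the two sets that end up in the same cluster
shared <- intersect(data[["el-1272"]], data[["el-2132"]])
length(shared)  # a length of 0 means the two sets are disjoint

# Pairwise intersection sizes for every pair of sets
overlap <- sapply(data, function(a) {
  sapply(data, function(b) length(intersect(a, b)))
})
overlap
```

A zero in the off-diagonal cell for the two sets confirms they share no elements.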
Can anyone help me figure out what's going on? Thanks.