monocle-release

Why do clusterCells results differ depending on the number of PCs used in tSNE?

Open yanwengong opened this issue 4 years ago • 0 comments

On the same data, I ran tSNE with 10 PCs and with 6 PCs, each followed by clusterCells, and found that the clustering results differ. Is this because clusterCells uses the number of PCs that was used to generate the tSNE, or because clusterCells uses the tSNE itself as part of its input? I had initially assumed that dimension reduction (generating the tSNE) is independent of clustering.

I used monocle 2 and the code is below:

# use 6 PCs
data <- reduceDimension(data, max_components = 2, num_dim = 6, reduction_method = 'tSNE', verbose = TRUE)
data <- clusterCells(data, num_clusters = 3)
cluster_info_dim6 <- pData(data)$Cluster

# use 10 PCs
data <- reduceDimension(data, max_components = 2, num_dim = 10, reduction_method = 'tSNE', verbose = TRUE)
data <- clusterCells(data, num_clusters = 3)
cluster_info_dim10 <- pData(data)$Cluster

identical(cluster_info_dim6, cluster_info_dim10) # this returns FALSE
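
One caveat about the comparison itself: identical() compares the labels literally, so it returns FALSE even when the two runs group the cells in exactly the same way but number the clusters differently. A minimal sketch of a comparison that ignores the numbering is below. The label vectors are made-up stand-ins for pData(data)$Cluster from the two runs, and same_partition is a hypothetical helper, not a Monocle function.

```r
# Hypothetical label vectors standing in for pData(data)$Cluster from the two runs.
cluster_info_dim6  <- factor(c(1, 1, 2, 2, 3, 3, 1, 2))
cluster_info_dim10 <- factor(c(2, 2, 3, 3, 1, 1, 2, 3))

# identical() compares labels literally, so a pure renumbering counts as a difference.
labels_identical <- identical(cluster_info_dim6, cluster_info_dim10)

# Cross-tabulate the two assignments to see how the clusters map onto each other.
tab <- table(dim6 = cluster_info_dim6, dim10 = cluster_info_dim10)
print(tab)

# Hypothetical helper: TRUE when the two vectors describe the same grouping of
# cells up to cluster numbering, i.e. every row and every column of the
# cross-tabulation has exactly one non-zero cell.
same_partition <- function(a, b) {
  tab <- table(droplevels(factor(a)), droplevels(factor(b)))
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

partition_matches <- same_partition(cluster_info_dim6, cluster_info_dim10)
```

If the cross-tabulation of the real labels has several non-zero cells in a row or column, the two runs really do group the cells differently, and the question above stands. Because tSNE starts from a random initialization, it may also be worth repeating one of the runs with the same num_dim to see how much the labels vary on their own.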

yanwengong, Apr 15 '20 21:04