
How can I use other algorithms (e.g., UMAP, PHATE) to do the dimensional reduction?

Open zyll123 opened this issue 1 year ago • 0 comments

Hi! I am wondering which variable represents the distance matrix that can be used with other dimensionality reduction algorithms. I checked the code and tried to use the affinity matrix (kernel), but that doesn't seem to be right.

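Calculation of the kernel:

import numpy as np
import scipy.sparse
import sklearn.neighbors

# Flatten each cell's COVET_SQRT matrix into a single feature vector
data=np.reshape(st_data.obsm['COVET_SQRT'], [st_data.obsm['COVET_SQRT'].shape[0], -1])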
k=30
nbrs = sklearn.neighbors.NearestNeighbors(n_neighbors=int(k), metric='euclidean', n_jobs=5).fit(data)
kNN = nbrs.kneighbors_graph(data, mode='distance')
# Adaptive k 
adaptive_k = int(np.floor(k / 3))
nbrs = sklearn.neighbors.NearestNeighbors(n_neighbors=int(adaptive_k),
                        metric='euclidean', n_jobs=5).fit(data)
adaptive_std = nbrs.kneighbors_graph(data, mode='distance').max(axis=1)
adaptive_std = np.ravel(adaptive_std.todense())
# Kernel
x, y, dists = scipy.sparse.find(kNN)
# X, y specific stds 
dists = dists / adaptive_std[x]
N = data.shape[0]
W = scipy.sparse.csr_matrix((np.exp(-dists), (x, y)), shape=[N, N])
# Diffusion components
kernel = W + W.T 
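
For context, the kernel above holds similarities rather than distances: each stored entry is exp(-d / sigma), so close pairs get values near 1 and pairs outside the kNN graph are stored as 0. The distances themselves are the Euclidean distances between the flattened COVET_SQRT matrices. A minimal sketch of that relationship follows, using random stand-in data; the array `covet_sqrt`, its shape, and the plain `exp(-d)` affinity without the adaptive scaling are assumptions for illustration, not ENVI output.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for st_data.obsm['COVET_SQRT']: 50 cells, 4x4 matrices
covet_sqrt = rng.normal(size=(50, 4, 4))
data = covet_sqrt.reshape(covet_sqrt.shape[0], -1)

# Pairwise Euclidean distances between the flattened matrices
sq_norms = (data ** 2).sum(axis=1)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * data @ data.T
dist = np.sqrt(np.clip(sq_dists, 0.0, None))  # clip tiny negative round-off
np.fill_diagonal(dist, 0.0)

# Simplified affinity (no adaptive bandwidth): larger value means closer pair
affinity = np.exp(-dist)
```

In this sketch `dist` is what a `metric='precomputed'` argument would normally be expected to receive, while `affinity` orders neighbours in the opposite direction.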

Calculation of UMAP:

import scanpy as sc

st_data.obsp['ME_D_mat'] = kernel
# number of neighbors is set to construct ME graph
n_neighbors=200
sc.pp.neighbors(st_data, n_neighbors=n_neighbors)
kernel = kernel.toarray()

knn_indices, knn_dists, forest = sc.neighbors.compute_neighbors_umap(kernel, n_neighbors=n_neighbors, metric='precomputed' )
st_data.obsp['distances'], st_data.obsp['connectivities'] = sc.neighbors._compute_connectivities_umap(
    knn_indices,
    knn_dists,
    st_data.shape[0],
    n_neighbors, # change to neighbors you plan to use
)
# set the ME graph's associated information (connectivity matrix, distance matrix) to neighbors_COVET
st_data.uns['neighbors_COVET'] = st_data.uns['neighbors'].copy()
sc.tl.umap(st_data,neighbors_key='neighbors_COVET')
st_data.obsm['X_umap_COVET'] = st_data.obsm['X_umap']
sc.tl.leiden(st_data,neighbors_key='neighbors_COVET',key_added='leiden_COVET')
sc.pl.embedding(st_data,basis='X_umap_COVET',color='leiden_COVET')

The resulting UMAP graph is very different from the one produced by the FDL algorithm.
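
One possible reason for the difference, sketched here under stated assumptions: if a dense affinity matrix is treated as a distance matrix, the smallest entries are the zeros of pairs that were never neighbours, so the selected "nearest" neighbours are the wrong cells. The example uses random stand-in features (`points`), a fixed `k`, and a simplified `exp(-d)` affinity; all three are hypothetical and only illustrate the ordering problem.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(40, 5))  # hypothetical stand-in features
k = 5

# Dense pairwise Euclidean distances
diff = points[:, None, :] - points[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# True k nearest neighbours (column 0 is the cell itself, so skip it)
true_knn = np.argsort(dist, axis=1)[:, 1:k + 1]

# kNN-restricted affinity: exp(-d) for neighbours, 0 everywhere else
affinity = np.zeros_like(dist)
rows = np.repeat(np.arange(len(points)), k)
cols = true_knn.ravel()
affinity[rows, cols] = np.exp(-dist[rows, cols])
affinity = affinity + affinity.T  # symmetrise, as in kernel = W + W.T

# Treating the affinity as distances picks the smallest entries (the zeros)
wrong_knn = np.argsort(affinity, axis=1)[:, :k]

# Average number of true neighbours recovered per cell
overlap = np.mean([len(set(a) & set(b)) for a, b in zip(true_knn, wrong_knn)])
```

A low `overlap` here means the affinity-as-distance neighbours share few or no cells with the true neighbours, which would be enough to change the resulting embedding.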

zyll123 · Nov 07 '24 10:11