Tangram
speedup + plotting functions
Hej @gaddamshreya,
this PR contains a few suggestions of minor changes:
faster cosine similarity computation
I'd suggest a faster way to compute the cosine similarities in `map_cells_to_space`. Currently the code is:
cos_sims = []
for v1, v2 in zip(G.T, G_predicted.T):
    norm_sq = np.linalg.norm(v1) * np.linalg.norm(v2)
    cos_sims.append((v1 @ v2) / norm_sq)
My proposal is to replace this with a function `mat_cosine_similarity` (I placed it in the `utils` module). This function uses broadcasting and `njit` from `numba`, which gives a fairly big increase in speed. This is not a pivotal part of the code, but if the function is run many times (like in LOOV) it's kinda nice. The function `mat_cosine_similarity` is defined as:
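import numpy as np
from numba import njit

@njit
def mat_cosine_similarity(V1, V2, axis=0):
    # Norms of V1 and V2 along `axis`
    n_1 = np.sum(V1 * V1, axis=axis) ** 0.5
    n_2 = np.sum(V2 * V2, axis=axis) ** 0.5
    norms_sq = n_1 * n_2
    # Dot products along `axis`, computed elementwise
    ewise = V1 * V2
    dot_unorm = np.sum(ewise, axis=axis)
    cs = dot_unorm / norms_sq
    return cs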
See the attached image for a timing comparison and an assertion that the two implementations produce the same results.
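For anyone who wants to reproduce the equivalence check without the image, here is a minimal sketch. It is not the exact benchmark from the screenshot: the matrices are random stand-ins for `G` and `G_predicted`, the names `loop_cosine_similarity` and `mat_cosine_similarity_np` are made up for this example, and the vectorized version is written without the `njit` decorator so it runs with NumPy alone.

```python
import numpy as np


def loop_cosine_similarity(G, G_predicted):
    # Current approach: one cosine similarity per column pair, in a Python loop.
    cos_sims = []
    for v1, v2 in zip(G.T, G_predicted.T):
        norm_sq = np.linalg.norm(v1) * np.linalg.norm(v2)
        cos_sims.append((v1 @ v2) / norm_sq)
    return np.array(cos_sims)


def mat_cosine_similarity_np(V1, V2, axis=0):
    # Same arithmetic as the proposed function, minus the numba decorator
    # (assumption: dropped here only to keep the example dependency-free).
    n_1 = np.sum(V1 * V1, axis=axis) ** 0.5
    n_2 = np.sum(V2 * V2, axis=axis) ** 0.5
    return np.sum(V1 * V2, axis=axis) / (n_1 * n_2)


# Random stand-ins for G and G_predicted (rows and columns are arbitrary sizes).
rng = np.random.default_rng(0)
G = rng.random((50, 20))
G_predicted = rng.random((50, 20))

loop_result = loop_cosine_similarity(G, G_predicted)
vec_result = mat_cosine_similarity_np(G, G_predicted, axis=0)

# Both approaches should agree up to floating point error.
assert np.allclose(loop_result, vec_result)
```

With `axis=0` the reduction runs over rows, so the result has one value per column, matching the loop over `G.T`.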
plotting utilities
- In `plot_genes_sc`, if the `genes` argument is a single gene (a string), this no longer throws an error: you can provide either a list of multiple genes or a single gene as a string, which makes plotting more convenient.
- In `plot_genes_sc`, I added the option to lowercase the genes provided via `genes` (to make them match the indices, which are lowercased).
- Support for `spatial_key` in some spatial plot functions, similar to the standard that `scanpy`/`squidpy` are using.
Image: