
speedup + plotting functions

Open almaan opened this issue 1 year ago • 0 comments

Hej @gaddamshreya,

this PR contains suggestions for a few minor changes:

faster cosine similarity computation

I'd suggest using a faster way to compute the cosine similarities in map_cells_to_space. Currently the code is:

cos_sims = []
for v1, v2 in zip(G.T, G_predicted.T):
    norm_sq = np.linalg.norm(v1) * np.linalg.norm(v2)
    cos_sims.append((v1 @ v2) / norm_sq)

My proposal is to replace this with a function mat_cosine_similarity (I placed it in the utils module). Instead of looping over columns, it uses vectorized array operations and njit from numba, which gives a fairly big increase in speed. This is not a pivotal part of the code, but it is nice to have when the function is run many times (as in LOOV). The function mat_cosine_similarity is defined as:

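import numpy as np
from numba import njit

@njit
def mat_cosine_similarity(V1, V2, axis=0):
    # Norms of V1 and V2 along `axis`
    n_1 = np.sum(V1 * V1, axis=axis) ** 0.5
    n_2 = np.sum(V2 * V2, axis=axis) ** 0.5
    # Product of the two norms (the denominator of the cosine similarity)
    norms_sq = n_1 * n_2
    # Dot products along `axis`: element-wise product followed by a sum
    ewise = V1 * V2
    dot_unorm = np.sum(ewise, axis=axis)
    cs = dot_unorm / norms_sq
    return cs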

See the attached image for a timing comparison and an assertion that the two implementations produce the same results.
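For anyone who wants to rerun the equivalence check locally, here is a minimal sketch. It is not the exact notebook behind the image: the helper names loop_cosine_similarity and vectorized_cosine_similarity, the matrix shapes, and the random data are assumptions made for illustration, and the vectorized version is plain NumPy (no njit) so it runs without numba installed.

```python
import numpy as np


def loop_cosine_similarity(G, G_predicted):
    """Column-wise cosine similarity using the current per-column loop."""
    cos_sims = []
    for v1, v2 in zip(G.T, G_predicted.T):
        norm_prod = np.linalg.norm(v1) * np.linalg.norm(v2)
        cos_sims.append((v1 @ v2) / norm_prod)
    return np.array(cos_sims)


def vectorized_cosine_similarity(V1, V2, axis=0):
    """Same quantity computed on the whole matrices at once (plain NumPy)."""
    n_1 = np.sqrt(np.sum(V1 * V1, axis=axis))
    n_2 = np.sqrt(np.sum(V2 * V2, axis=axis))
    return np.sum(V1 * V2, axis=axis) / (n_1 * n_2)


# Hypothetical test data: 50 rows, 20 columns (one similarity per column).
rng = np.random.default_rng(0)
G = rng.random((50, 20))
G_predicted = rng.random((50, 20))

loop_result = loop_cosine_similarity(G, G_predicted)
vec_result = vectorized_cosine_similarity(G, G_predicted)

# The two implementations should agree up to floating-point tolerance.
assert np.allclose(loop_result, vec_result)
```

Timing can be compared on your own data with timeit; note that an njit-compiled function pays a one-time compilation cost on its first call, so it is worth excluding that call from the measurement.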

plotting utilities

  1. In plot_genes_sc - passing a single gene as a string to the genes argument no longer throws an error (i.e., you can provide either a list of genes or a single gene as a string), which makes plotting more convenient.
  2. In plot_genes_sc - I added the option to lowercase the genes provided via genes (so that they match the indices, which are lowercased).
  3. Support for spatial_key in some spatial plot functions, following the convention that scanpy/squidpy use.
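The argument handling in points 1 and 2 could look roughly like the sketch below. This is only an illustration of the idea, not the code in the PR: the helper name normalize_genes and the lowercase parameter name are hypothetical.

```python
def normalize_genes(genes, lowercase=False):
    """Return the genes as a list, accepting a single string or an iterable.

    Hypothetical helper: the name and the `lowercase` flag are assumptions.
    """
    # A bare string is treated as one gene, not as an iterable of characters.
    if isinstance(genes, str):
        genes = [genes]
    genes = list(genes)
    # Optionally lowercase the names so they match lowercased indices.
    if lowercase:
        genes = [g.lower() for g in genes]
    return genes


single = normalize_genes("Gad1", lowercase=True)
several = normalize_genes(["Gad1", "Sst"])
```

Checking for str before converting to a list matters, since list("Gad1") would otherwise split the name into single characters.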

Image: image

almaan, Aug 24 '23 22:08