
omnipath geneID

Open wangjiawen2013 opened this issue 3 years ago • 2 comments

Hi, each time I use sq.gr.ligrec() to perform receptor-ligand analysis, squidpy re-downloads the interaction database. Sometimes the download takes a long time because of a poor internet connection. So I downloaded the omnipath database using omnipath.interactions.import_intercell_network() and saved it on my computer, then read it with pandas and passed it to sq.gr.ligrec. However, the gene IDs in omnipath are uppercase, while the gene IDs in my squidpy adata are lowercase, so an error occurred: ValueError: After filtering by genes, no interactions remain.

So, how can I save the omnipath database with the proper gene IDs? Here are the gene IDs in the omnipath database:

   source  ...  plasma_membrane_peripheral_intercell_target
0  P0DP25  ...  False
1  P0DP25  ...  False
2  P0DP25  ...  False
3  P0DP25  ...  False
4  P0DP25  ...  False

And here are the IDs in adata.var_names:

Index(['Abcc4', 'Acp5', 'Acvr1', 'Acvr2a', 'Adora2b', 'Afp', 'Ahnak', 'Akr1c19', 'Alas2', 'Aldh1a2', ... 'Wnt2b', 'Wnt3', 'Wnt3a', 'Wnt5a', 'Wnt5b', 'Wnt8a', 'Xist', 'Zfp444', 'Zfp57', 'Zic3'], dtype='object', length=351)

wangjiawen2013 avatar Mar 26 '22 03:03 wangjiawen2013

Hi @wangjiawen2013 ,

squidpy will re-download the interaction database. sometimes it takes a long time to download because of the poor internet. So I downloaded omnipath database using omnipath.interactions.import_intercell_network() and save

This shouldn't really happen, as we defer the download to omnipath, which by default caches the results (depending on passed arguments) to disk. That is:

omnipath.interactions.import_intercell_network()
omnipath.interactions.import_intercell_network()  # will re-use the cached results
omnipath.interactions.import_intercell_network(some_param=...)  # will re-download, since the parameter can change the result
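
If the built-in caching does not help in a given setup, the manual "download once, save locally" approach from the original post can be sketched as below. This is a hand-rolled helper, not omnipath's own caching mechanism: `load_or_fetch` and the file name in the usage note are hypothetical names introduced for illustration, and the sketch assumes the fetched object can be pickled.

```python
import pickle
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return the object pickled at cache_path, calling fetch() only on a cache miss.

    Hypothetical helper for illustration; not part of squidpy or omnipath.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: read the stored object and skip the download entirely.
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    # Cache miss: fetch once, then store the result for later runs.
    result = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

With omnipath installed, a call such as `load_or_fetch("omnipath_intercell.pkl", op.interactions.import_intercell_network)` would then download the network on the first run only. The cache file has to be deleted by hand to pick up a newer database version.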

sq.gr.ligrc, but the geneID is uppercase in omnipath, but geneID in squidpy adata is lowercase. so an error occurred:

We use the genesymbol_intercell_source and genesymbol_intercell_target columns of the dataframe returned by omnipath, which must be stored in the source and target columns, respectively. Here is a snippet showing how to do that:

import squidpy as sq
import omnipath as op

adata = sq.datasets.visium_fluo_adata_crop()
cluster_key = 'cluster'

df = op.interactions.import_intercell_network(
    transmitter_params={"categories": "ligand"},
    receiver_params={"categories": "receptor"}
)
# copy the gene-symbol columns into the source/target columns that ligrec reads
df['source'] = df['genesymbol_intercell_source']
df['target'] = df['genesymbol_intercell_target']

sq.gr.ligrec(adata, cluster_key=cluster_key, interactions=df)
res = adata.uns[f"{cluster_key}_ligrec"]
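
Before calling sq.gr.ligrec on a locally saved table, it can help to check how many interaction pairs overlap with adata.var_names at all. In the original post the source column appears to hold UniProt accessions such as P0DP25 rather than gene symbols, which would leave nothing to match. The sketch below uses plain Python and toy data; `pairs_present` is a hypothetical helper, the pairs are made up for illustration, and the case-insensitive comparison is an assumption of this sketch rather than a statement about squidpy's internals.

```python
def pairs_present(pairs, var_names):
    """Keep (source, target) pairs whose members both occur in var_names, ignoring case."""
    known = {name.upper() for name in var_names}
    return [(s, t) for s, t in pairs if s.upper() in known and t.upper() in known]


# A few var_names copied from the issue (mouse-style capitalisation).
var_names = ["Wnt3a", "Wnt5a", "Acvr1", "Acvr2a"]

# Toy pairs for illustration only; they are not real curated interactions.
uniprot_pairs = [("P0DP25", "P0DP25")]
symbol_pairs = [("WNT3A", "ACVR1"), ("WNT5A", "NOT_IN_ADATA")]

print(pairs_present(uniprot_pairs, var_names))  # no pair survives
print(pairs_present(symbol_pairs, var_names))   # only the first pair survives
```

An empty result for the table about to be passed as interactions suggests that the source and target columns hold the wrong identifier type.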

@giovp, it might be a good idea to add a small example to https://github.com/theislab/squidpy_notebooks

michalk8 avatar Mar 26 '22 12:03 michalk8

Hi, what are "source" and "target"?
Are "protein A (source) interacts with protein B (target)" and "protein B (source) interacts with protein A (target)" the same? Does "source" correspond to "ligand" and "target" to "receptor"?

wangjiawen2013 avatar May 27 '22 02:05 wangjiawen2013

Hi @wangjiawen2013,

Indeed, I believe you are correct with respect to source and target. I will close this, but please feel free to reopen.

giovp avatar Oct 18 '22 12:10 giovp