pypath
Translating identifiers
Describe the question: Is there a way to list all the identifier types that can be used as a source or target ID? For example, can we translate drug IDs to their alternative identifiers?
Can the mapping tool return both the source and target identifiers, for example as a dictionary, instead of only a list of target identifiers?
Desktop: OS: Windows; Python version: 3.8; Version or commit hash: v0.13.13
That's a very good question; it would be super useful to access these at a single point. This is not possible at the moment, but I can easily implement it.
Hi Elif, I've just added two new methods to utils.mapping.Mapper: mapping_tables returns a list of the available ID translation tables, and id_types returns a list of identifier types. Use them like this:
from pypath.utils import mapping
mapper = mapping.get_mapper()
mapper.mapping_tables()
mapper.id_types()
Best,
Denes
One more thing: the first time any ID translation table is loaded, it has to be downloaded, so the time depends on the download speed. For UniProt or BioMart data this is typically a few seconds to a couple of minutes, but in some other cases it might take as long as half an hour. After that, loading the table from disk is fast, and if you used a table in the past 5 minutes it remains loaded in memory, which makes subsequent lookups very fast.
Hi Denes,
When we use the mapping module with multiple source IDs, we only get the target IDs back. For example, mapping.map_names(protein_list, 'uniprot', 'interpro') returns a list of the domains found in the given proteins. Could we instead get protein-specific domain IDs, like {protein: ["domain1", "domain2", ...]}? As it is, we can't tell which UniProt ID each domain ID belongs to.
Hi Elif,
Sure, you can use a dict comprehension for that:
from pypath.utils import mapping
uniprots = ['P00533', 'O75385']
domains = {u: mapping.map_name(u, 'uniprot', 'interpro') for u in uniprots}
domains
# {'O75385': {'IPR022708', 'IPR011009', 'IPR016237', 'IPR017441', 'IPR000719', 'IPR008271'},
# 'P00533': {'IPR016245', 'IPR032778', 'IPR001245', 'IPR020635', 'IPR006212', 'IPR011009', 'IPR006211', 'IPR000494', 'IPR009030', 'IPR017441', 'IPR000719', 'IPR008266', 'IPR036941'}}
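If you also need the reverse view (which proteins carry a given domain), the dictionary above can be inverted with plain Python. This is only a minimal sketch: invert_mapping is a hypothetical helper, not part of pypath, and the input is a hard-coded subset of the output shown above so that it runs without downloading any table.

```python
from collections import defaultdict

# Subset of the result shown above, hard-coded so the sketch runs without pypath.
domains = {
    'O75385': {'IPR011009', 'IPR017441', 'IPR000719', 'IPR008271'},
    'P00533': {'IPR011009', 'IPR017441', 'IPR000719', 'IPR008266'},
}


def invert_mapping(source_to_targets):
    """Turn {source: {targets}} into {target: {sources}} (hypothetical helper)."""
    inverted = defaultdict(set)
    for source, targets in source_to_targets.items():
        for target in targets:
            inverted[target].add(source)
    return dict(inverted)


proteins_by_domain = invert_mapping(domains)

# Domains that occur in more than one of the input proteins.
shared = sorted(d for d, p in proteins_by_domain.items() if len(p) > 1)
print(shared)
```

With the real result of the dict comprehension as input, the same helper should give every InterPro ID together with the set of UniProt IDs it was translated from.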