RTX-KG2
PMIDs in KG2
At the AHM on 4/27, @edeutsch asked for information on PMIDs: how many of our sources provide PMID data, and do those PMIDs make it into KG2 edges?
I am looking into it, but also want to check with @saramsey if he might know.
I ran a query in KG2c to get an idea of how many non-SemMedDB sources we get PMIDs from:
match (n)-[e]->(m) where n.publications is not null and not "infores:semmeddb" in e.knowledge_source return distinct e.knowledge_source, count(distinct e) order by count(distinct e) desc
e.knowledge_source | count(distinct e) |
---|---|
["infores:pathwhiz"] | 3975444 |
["infores:drugbank"] | 2253707 |
["infores:ensembl-gene"] | 1331738 |
["infores:hmdb"] | 713751 |
["infores:hmdb", "infores:pathwhiz"] | 682382 |
["infores:diseases"] | 594581 |
["infores:intact"] | 267449 |
["infores:disgenet"] | 252930 |
["infores:goa"] | 215042 |
["infores:pr"] | 143262 |
["infores:ncit"] | 119654 |
["infores:umls-metathesaurus"] | 119355 |
["infores:reactome"] | 86940 |
["infores:mesh"] | 85789 |
["infores:chebi"] | 66488 |
["infores:omim"] | 64684 |
["infores:kegg"] | 49562 |
["infores:diseases", "infores:disgenet"] | 44476 |
["infores:dgidb"] | 42807 |
["infores:uniprot"] | 33991 |
["infores:chembl"] | 30207 |
["infores:drugcentral"] | 26789 |
["infores:go", "infores:go-plus"] | 23698 |
["infores:go"] | 23439 |
["infores:loinc-umls"] | 22087 |
["infores:ncbi-gene", "infores:ensembl-gene", "infores:pr", "infores:uniprot"] | 19511 |
["infores:chebi", "infores:go-plus"] | 12105 |
["infores:go-plus"] | 11756 |
["infores:ordo"] | 10864 |
["infores:fma-umls"] | 9626 |
["infores:disease-ontology"] | 9163 |
["infores:uberon"] | 9093 |
["infores:mondo"] | 9020 |
["infores:mondo", "infores:efo"] | 8762 |
["infores:rxnorm"] | 7302 |
["infores:drugbank", "infores:drugcentral"] | 6603 |
["infores:efo"] | 6195 |
["infores:hpo"] | 5075 |
["infores:fma-obo", "infores:fma-umls"] | 4192 |
["infores:atc-codes-umls"] | 2913 |
Thank you, Amy!
Great! I'm surprised!
I just did a query for pathways and got this:
https://arax.ncats.io/?r=40441
and see publications associated with a non-SemMedDB edge, so this is great!
Wow, that was more than I expected. Thank you for the analysis, @amykglen