Erica Wood
The first step in this process is to identify the number of nodes in KG2c that are categorized from an ontology. (Based on discussion with @amykglen)
While we didn't implement a completely new system, there is now a much clearer, easier-to-edit node categorization system for ontologies. Closing this issue.
It is in Biolink, and it is mapped to our RepoDB mappings in there:  I suspect that there was confusion when creating the mappings (which was done earlier this...
Looking into this again, the interaction breakdown has changed significantly: ``` { "GuideToPharmacology": 8618, "DrugBank": 2965, "NCI": 2798, "TdgClinicalTrial": 1186, "PharmGKB": 914, "JAX-CKB": 601, "CIViC": 211, "TEND": 120, "TALC": 109,...
#398 will replace this information
Based on ```match (n:`biolink:RetrievalSource`) where not n.id =~ 'umls_.*' and not n.id =~ 'OBO:.*' return n.id, n.name order by n.id;``` RepoDB is not in KG2.10.1pre. Further, ``` match (n)-[e]->(m) where...
I am considering removing the `knowledge_type` field from `kg2-provided-by-curie-to-infores-curie.yaml`, which stemmed from #77 but has never been used (and seems out of date now that we have `knowledge_level` and `agent_type`).
I also need to check `curies-to-urls-map.yaml`. It seems strangely full.
This should also include items in multi ont/kg2_util that were removed with the separate ETL of UMLS.
In some edges, the size of the publications info field runs to 8 digits. To combat this, I created the function `limit_publication_info_size` in `json_to_tsv.py`. This limits the number...
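For illustration only, here is a minimal sketch of what such a size limit could look like. What exactly gets limited is cut off in the comment above, so this assumes the publications info is a dict keyed by publication CURIE and that the cap applies to the number of entries. The function name `limit_publications_info_sketch` and the `max_entries` parameter are hypothetical and are not the actual code in `json_to_tsv.py`.

```python
import json


def limit_publications_info_sketch(publications_info, max_entries=20):
    """Return a copy of publications_info with at most max_entries items.

    Hypothetical helper: the real limit and selection rule are not known here.
    Keys are sorted so the same entries are kept on every run.
    """
    if len(publications_info) <= max_entries:
        return dict(publications_info)
    kept_keys = sorted(publications_info)[:max_entries]
    return {key: publications_info[key] for key in kept_keys}


# Usage: trim an oversized field before serializing the edge to a TSV cell.
edge = {
    "publications_info": {
        f"PMID:{i}": {"sentence": "x" * 50} for i in range(1000)
    }
}
edge["publications_info"] = limit_publications_info_sketch(
    edge["publications_info"], max_entries=5
)
serialized = json.dumps(edge["publications_info"])
```

Capping the entry count bounds the serialized size only if individual entries are themselves small; a cap on the serialized string length would be a different design choice.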