Erica Wood comments

Results: 183 comments of Erica Wood

New Node Categorization System For Ontologies

The first step in this process is to identify the number of nodes in KG2c that are categorized from an ontology. (Based on discussion with @amykglen)

New Node Categorization System For Ontologies

While we didn't implement a completely new system, there is now a much clearer node categorization system for ontologies that is easier to edit. Closing this issue.

drug_regulatory_status_world_wide is not a valid predicate

It is in Biolink, and it is mapped to our RepoDB mappings in there: ![image](https://github.com/user-attachments/assets/81f1cc99-cc02-4ce8-b1c3-da810e021446) I suspect that there was confusion when creating the mappings (which was done earlier this...

3,852 interactions from DGIdb could not be mapped

Looking into this again, the interaction breakdown has changed significantly: ``` { "GuideToPharmacology": 8618, "DrugBank": 2965, "NCI": 2798, "TdgClinicalTrial": 1186, "PharmGKB": 914, "JAX-CKB": 601, "CIViC": 211, "TEND": 120, "TALC": 109,...
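A per-source breakdown like the one above can be tallied with a short script. This is only a sketch: the record layout, the `source` key, and the helper name `count_by_source` are assumptions made for illustration, not the KG2 code, and the sample records are made up.

```python
from collections import Counter


def count_by_source(interactions, source_key="source"):
    """Count records per source, most frequent first.

    `interactions` is assumed to be an iterable of dicts that each carry
    the name of their originating source under `source_key` (hypothetical).
    """
    counts = Counter(record[source_key] for record in interactions)
    # most_common() orders by descending count; dicts preserve that order.
    return dict(counts.most_common())


# Made-up records, only to show the shape of the result.
unmapped = [
    {"source": "DrugBank"},
    {"source": "NCI"},
    {"source": "DrugBank"},
]
breakdown = count_by_source(unmapped)
```

Comparing two such breakdowns from different builds shows which sources account for a change in the unmapped total.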

Remove Clinical Trials Information from KG2

#398 will replace this information

Remove Clinical Trials Information from KG2

Based on ```match (n:`biolink:RetrievalSource`) where not n.id =~ 'umls_.*' and not n.id =~ 'OBO:.*' return n.id, n.name order by n.id;``` RepoDB is not in KG2.10.1pre. Further, ``` match (n)-[e]->(m) where...

Code Cleanup

I am considering removing the `knowledge_type` field from `kg2-provided-by-curie-to-infores-curie.yaml`, which stemmed from #77 but has never been used (and seems out of date now that we have `knowledge_level` and `agent_type`).

Code Cleanup

I also need to check `curies-to-urls-map.yaml`. It seems strangely full.

Code Cleanup

This should also include items in multi ont/kg2_util that were removed with the separate ETL of UMLS.

implement long-term fix for neo4j-admin import buffer overflow issue (work around the 2019 hack)

In some edges, the size of the publications info field runs to 8 digits. To combat this, I created the function `limit_publication_info_size` in `json_to_tsv.py`. This limits the number...
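As a rough illustration of the idea, here is one way such a cap could look. It is not the actual `limit_publication_info_size` implementation: it assumes the publications info is a dict keyed by publication ID and that the cap applies to the number of entries kept. The name `limit_publications_info_entries` and the `max_entries` parameter are hypothetical.

```python
import json


def limit_publications_info_entries(publications_info, max_entries=20):
    """Return a copy of the mapping with at most `max_entries` items.

    Assumes `publications_info` is a dict keyed by publication ID
    (hypothetical layout). Keys are sorted so the result is deterministic.
    """
    if len(publications_info) <= max_entries:
        return dict(publications_info)
    kept_keys = sorted(publications_info)[:max_entries]
    return {key: publications_info[key] for key in kept_keys}


# Made-up edge with an oversized publications info field.
edge = {
    "publications_info": {
        f"PMID:{i}": {"sentence": "x" * 50} for i in range(1000)
    }
}
size_before = len(json.dumps(edge["publications_info"]))
edge["publications_info"] = limit_publications_info_entries(
    edge["publications_info"], max_entries=5
)
size_after = len(json.dumps(edge["publications_info"]))
```

Capping entries before the TSV is written keeps each serialized field well below the size that would overflow the import buffer, at the cost of dropping some supporting publications from very heavily cited edges.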