PheKnowLator
**DON'T MERGE -- ARCHIVE** Build overhaul v4.0.0
🛑 DO NOT USE THIS BRANCH OR MODIFY THIS PR -- CONTENT IS KEPT FOR NOTES 🛑
Purpose
This PR addresses several issues and overhauls many aspects of the current build, as described in more detail below. The primary changes affect the amount, type, and storage of metadata at both the node and triple level.
Issues Addressed by PR
- #97
- #99
- #107
- #114
Scripts Impacted
- owlnets.py
  - Updated to fix the prior bad assumption about classes and axioms built using UnionOf constructors
- metadata.py
  - Added new functionality for processing Biolink types
- edge_list.py
  - Added new functionality for adding Bioregistry identifiers
- utils/data_utils.py
  - Added functions that facilitate API access to Biolink and Bioregistry
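
As a rough illustration of what attaching a Bioregistry identifier to a node might involve, here is a minimal sketch. It is not the code in edge_list.py or utils/data_utils.py. The function names `split_curie` and `bioregistry_url` are hypothetical, and the sketch assumes that Bioregistry prefixes are lowercase and that `https://bioregistry.io/<prefix>:<id>` is the resolver URL pattern. No network call is made.

```python
from typing import Tuple
from urllib.parse import quote

# Assumption: base URL of the Bioregistry resolver.
BIOREGISTRY_BASE = "https://bioregistry.io"


def split_curie(curie: str) -> Tuple[str, str]:
    """Split 'PREFIX:local_id' on the first colon (hypothetical helper)."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    # Assumption: Bioregistry prefixes are normalized to lowercase.
    return prefix.lower(), local_id


def bioregistry_url(curie: str) -> str:
    """Build a resolver-style URL for a CURIE (hypothetical helper)."""
    prefix, local_id = split_curie(curie)
    # Percent-encode both parts so unusual characters stay URL-safe.
    return f"{BIOREGISTRY_BASE}/{quote(prefix)}:{quote(local_id)}"


print(bioregistry_url("CHEBI:138488"))
```

A real implementation would likely validate the prefix against the registry itself rather than only lowercasing it.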
Data Sources/Documentation Impacted
- edge_source_list.txt
  - Added back chemical-rna edge data
- resource_info.txt
  - Updated metadata for many of the edges, most often to relax the initial filtering applied to the data (i.e., making the build more liberal and inclusive while giving the user the ability to enforce specific filtering choices)
  - Added back information for the chemical-rna edge
Notebooks Impacted
- OWLNETS_Example_Application.ipynb
- Data_Preparation.ipynb
Output Impacted
- All output files are now gzipped to improve resource use
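
For anyone consuming the compressed output, the sketch below shows one way to write and read gzipped text files with Python's standard `gzip` module. The helper names, the file name `example_triples.txt.gz`, and the tab-separated rows are placeholders, not the build's actual file names or format.

```python
import gzip
import os
import tempfile


def write_gzipped(path, lines):
    """Write an iterable of text lines to a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


def read_gzipped(path):
    """Read a gzip-compressed text file back into a list of lines."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


# Placeholder data and file name, used only to show the round trip.
rows = ["subject_a\tpredicate_x\tobject_b", "subject_c\tpredicate_y\tobject_d"]
path = os.path.join(tempfile.mkdtemp(), "example_triples.txt.gz")
write_gzipped(path, rows)
print(read_gzipped(path) == rows)
```

Opening in text mode ("wt"/"rt") lets downstream code treat the file like a plain text file without decompressing it to disk first.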
Other Updates
- The following Wiki pages have been updated:
  - v2-Data-Sources
    - Updated to include better descriptions
  - KG Construction
    - Section describing the KG output has been updated to note that all output files are gzipped
  - OWL-NETS 2.0
    - Section describing the KG output has been updated to note that all output files are gzipped
-