PheKnowLator
Simplify input files -- input yaml
Use a single YAML file to organize the input ontology and edge sources and to specify the information needed to parse them. This would replace the following files: ontology_source_list.txt, edge_source_list.txt, and resource_info.txt. It would also completely remove the data ingest class and the need for the user input automation script.
Consider a similar approach for metadata processing and/or (even better) include additional arguments that give instructions on metadata processing.
Also add a header key that provides definitions for all required input parameters.
Relatedly, use the information here, such as the prefix used for nodes, to update the edge_source_metadata.txt files, to create named graphs, or to incorporate relevant information into the edge list metadata.
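As a rough sketch of what the single input file could hold, the Python below works on the dictionary that parsing such a YAML file would produce (for example with PyYAML's yaml.safe_load). Every key, section name, URL, and prefix here is a hypothetical placeholder, not PheKnowLator's actual schema. It shows a header key that defines each required parameter, a check that every source supplies those parameters, and a helper that pulls out the node prefixes that could feed the edge metadata files.

```python
# Hypothetical layout: all keys and values are illustrative assumptions.
# In practice this dict would come from parsing the YAML input file.
CONFIG = {
    "header": {
        "url": "Location the source is downloaded from",
        "node_prefix": "Prefix prepended to subject and object node identifiers",
        "delimiter": "Column delimiter used in the source file",
        "columns": "Zero-based indices of the subject and object columns",
    },
    "ontologies": {
        "example_ontology": {"url": "https://example.org/example.owl"},
    },
    "edges": {
        "gene-disease": {
            "url": "https://example.org/gene_disease.tsv",
            "node_prefix": {"subject": "GENE_", "object": "DISEASE_"},
            "delimiter": "\t",
            "columns": [0, 1],
        },
    },
}

# Parameters each section must supply (assumed, for illustration only).
REQUIRED = {
    "ontologies": ["url"],
    "edges": ["url", "node_prefix", "delimiter", "columns"],
}


def validate(config):
    """Return a list of problems; an empty list means the config is usable."""
    problems = []
    header = config.get("header", {})
    # Every required parameter should be defined once in the header.
    for key in sorted({k for keys in REQUIRED.values() for k in keys}):
        if key not in header:
            problems.append(f"header: no definition for '{key}'")
    # Every source should supply the parameters required for its section.
    for section, keys in REQUIRED.items():
        for name, entry in config.get(section, {}).items():
            for key in keys:
                if key not in entry:
                    problems.append(f"{section}.{name}: missing '{key}'")
    return problems


def node_prefixes(config):
    """Map each edge type to its node prefixes, e.g. for writing edge metadata."""
    return {name: entry["node_prefix"] for name, entry in config["edges"].items()}
```

Calling validate(CONFIG) before any download step would surface missing parameters early, which is one way the separate user input script could become unnecessary.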
Make sure that we are getting the most up-to-date data from each source, including our queries
Zip all data files to reduce storage space
Verify the identifier mappings for genes, diseases, and chemicals. Some weirdness is happening.
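For the item above about zipping data files, one minimal approach is to compress each downloaded file individually. The sketch below assumes gzip (rather than a .zip archive) and a hypothetical helper named gzip_file; neither is taken from the existing code base.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path, remove_original=False):
    """Write a gzip-compressed copy of `path` next to it and return the new path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the bytes so large data files are never loaded fully into memory.
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if remove_original:
        source.unlink()  # keep only the compressed copy
    return target
```

Per-file gzip keeps each source independently readable (gzip.open can stream it back without unpacking), whereas a single archive would need to be opened to reach any one file.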