PheKnowLator

Simplify input files -- input yaml

Open callahantiff opened this issue 1 year ago • 5 comments

Use a single YAML file to organize the input ontology and edge sources and to specify the information needed to parse them. This would replace the following files: ontology_source_list.txt, edge_source_list.txt, and resource_info.txt. It would also remove the data ingest class entirely, along with the need for the user input automation script.

Consider a similar approach for metadata and/or (even better) include additional arguments in the same file with instructions on metadata processing.
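As a rough sketch of the idea, the snippet below shows what the parsed YAML might look like once loaded (for example with PyYAML's yaml.safe_load) and how the content of the three existing files could be derived from it. Every key name, source name, and URL here is a hypothetical placeholder, not the project's actual schema.

```python
# Hypothetical result of loading the single YAML file into a dict.
# All keys, source names, and URLs are illustrative assumptions.
config = {
    "ontologies": {
        "example_ontology": {"url": "https://example.org/example.owl"},
    },
    "edges": {
        "gene-disease": {
            "url": "https://example.org/gene_disease.tsv",
            "parsing": {"delimiter": "\t", "columns": [0, 1], "header_rows": 1},
        },
    },
}


def legacy_views(cfg):
    """Derive the information held by the three legacy files from one config."""
    # Roughly what ontology_source_list.txt holds: ontology name -> URL.
    ontology_sources = {name: src["url"] for name, src in cfg["ontologies"].items()}
    # Roughly what edge_source_list.txt holds: edge type -> URL.
    edge_sources = {name: src["url"] for name, src in cfg["edges"].items()}
    # Roughly what resource_info.txt holds: edge type -> parsing instructions.
    resource_info = {name: src["parsing"] for name, src in cfg["edges"].items()}
    return ontology_sources, edge_sources, resource_info


ontology_sources, edge_sources, resource_info = legacy_views(config)
print(ontology_sources)
print(edge_sources)
print(resource_info)
```

Keeping the URL and the parsing instructions under the same entry means a source is described in one place, so the lists cannot drift out of sync.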

callahantiff Aug 10 '22 00:08

Also add a header key that provides definitions for all required input parameters.
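One possible way to use such a header key, sketched under assumptions: if the header maps each required parameter to its definition, the same mapping can double as a validation checklist. The header layout, parameter names, and edge entries below are hypothetical.

```python
# Hypothetical parsed config: "header" defines every required parameter,
# and each edge entry is expected to provide all of them.
config = {
    "header": {
        "url": "Location the source file is downloaded from",
        "delimiter": "Character separating columns in the source file",
    },
    "edges": {
        "gene-disease": {"url": "https://example.org/gene_disease.tsv", "delimiter": "\t"},
        "chemical-gene": {"url": "https://example.org/chemical_gene.tsv"},
    },
}


def missing_parameters(cfg):
    """Return {edge type: [missing required parameters]} for incomplete entries."""
    required = set(cfg["header"])
    problems = {}
    for name, entry in cfg["edges"].items():
        missing = sorted(required - set(entry))
        if missing:
            problems[name] = missing
    return problems


problems = missing_parameters(config)
for name, missing in problems.items():
    for param in missing:
        # Reuse the header definition as the error message.
        print(f"{name}: missing '{param}' ({config['header'][param]})")
```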

callahantiff Aug 10 '22 00:08

Related -- use the information here, such as the prefix used for nodes, to update the edge_source_metadata.txt files, as well as to create named graphs or incorporate relevant information into the edge list metadata.

callahantiff Sep 08 '22 15:09

Make sure that we are getting the most up-to-date data from each source, including our queries.

callahantiff Sep 09 '22 14:09

Zip all data files to reduce storage space.
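A minimal sketch of per-file compression using only the Python standard library. It uses gzip rather than the zip archive format, which is an assumption since no format is specified here; the function name and the choice to keep the original by default are also illustrative.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path, remove_original=False):
    """Compress one file to <name>.gz next to it and return the new path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the bytes so large data files are never loaded fully into memory.
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if remove_original:
        source.unlink()
    return target
```

Compressed files can then be read back without unpacking them first, for example with gzip.open(path, "rt").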

callahantiff Sep 09 '22 14:09

Verify the identifier mappings for genes, diseases, and chemicals. Some weirdness is happening.

callahantiff Sep 09 '22 15:09