goci
Investigate and document data publisher step of data release
The data publisher step (part of the Virtuoso step) in the data release (DR) is time- and memory-intensive, and it causes the DR to break if it exceeds the specified limits. This has been worked around for now by increasing the limits (24h, 32GB), but the problem will keep recurring as the number of traits and associations grows.
This step generates an OWL file, gwas-kb.nightly.owl.
This needs investigating to ascertain:
- Is this step only useful for the V1 Diagram?
- Is there any other part of the infrastructure that uses this OWL file?
- Would it work to disable this and every other operation in the Virtuoso step, leaving only the OWL file download from EFO?
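
As a possible starting point for the second question, here is a minimal sketch that scans a local checkout for textual references to the OWL file name. The function name `find_references` and the choice to skip `.git` directories are assumptions made for illustration, not part of the existing codebase. It only finds literal mentions, so a path assembled at runtime from separate config values would be missed.

```python
import os

# File name taken from the issue text above.
OWL_FILE_NAME = "gwas-kb.nightly.owl"


def find_references(root, needle=OWL_FILE_NAME):
    """Return sorted (relative_path, line_number) pairs for lines containing needle.

    Hypothetical helper: walks every file under root and does a plain
    substring match, ignoring bytes that are not valid UTF-8.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((os.path.relpath(path, root), lineno))
            except OSError:
                # Unreadable files (permissions, broken symlinks) are skipped.
                continue
    return sorted(hits)
```

Calling `find_references("/path/to/checkout")` on each repository or deployment config directory that might touch the release would give a first list of consumers to check by hand.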