Build KG2.8.6

Open ecwood opened this issue 2 years ago • 2 comments

1. Build and load KG2:
  • [x] Clear the instance using bash -x clear-instance.sh
  • [x] Clone the RTX-KG2 repo from GitHub: git clone https://github.com/RTXteam/RTX-KG2.git
  • [x] Set up the KG2 build system: bash -x RTX-KG2/setup-kg2-build.sh
  • [x] Check ~/kg2-build/setup-kg2-build.log to ensure setup completed successfully
  • [x] Run a dry build using bash -x ~/kg2-code/build-kg2-snakemake.sh all -F -n
  • [x] Check ~/kg2-build/build-kg2-snakemake-n.log to ensure all rules are included
  • [x] Run touch ~/kg2-build/minor-release for a minor release or touch ~/kg2-build/major-release for a major release. If you don't want to change the version number, ignore this step.
  • [x] Initiate a screen session: screen -S buildkg2
  • [x] Start the build: bash -x ~/kg2-code/build-kg2-snakemake.sh all -F
  • [x] Verify build completed by checking ~/kg2-build/build-kg2-snakemake.log
  • [x] Check the build version number in ~/kg2-build/kg2-version.txt
  • [ ] Check report file kg2-simplified-report.json; compare against previous kg2-simplified-report.json to identify any major changes
  • [ ] Generate nodes.tsv and edges.tsv by running python3 kg2_json_to_kgx_tsv.py kg2-simplified.json
  • [ ] Generate content-metadata.json on build instance
  • [ ] Push nodes.tsv and edges.tsv to the public S3 bucket with aws s3 cp /file/name s3://rtx-kg2-public
  • [ ] Find an available kg2endpoint by checking rtx.ai under Networking on Lightsail
  • [ ] Install the new KG2 TSV files into Neo4j on the kg2endpoint
  • [ ] Update code on kg2endpoint, then run setup-kg2-neo4j.sh if necessary
  • [ ] Load KG2 into Neo4j: RTX-KG2/tsv-to-neo4j.sh > ~/kg2-build/tsv-to-neo4j.log 2>&1
  • [ ] Update kg2-versions.md
  • [ ] Update the version numbers of the upstream knowledge sources for the new version of KG2 in kg2-versions.md (see Cypher command below).

Example Cypher to get versions of many of the knowledge sources in a specific build of KG2pre:

match (n:`biolink:RetrievalSource`) where not n.id =~ 'umls_.*' and not n.id =~ 'OBO:.*' return n.id, n.name order by n.id;
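
For the dry-run step in the checklist, scanning the log by eye for "all rules are included" is error-prone. Here is a minimal sketch that lists the distinct rule names found in the dry-run log so they can be compared against the Snakefile. It assumes Snakemake's usual dry-run output, in which each planned job is announced by a line of the form `rule NAME:` or `localrule NAME:`. The function name `list_dry_run_rules` is hypothetical and not part of the RTX-KG2 repo.

```shell
#!/usr/bin/env bash
# Print the distinct rule names that appear in a Snakemake dry-run log.
# Assumption: each planned job starts with a line "rule NAME:" or
# "localrule NAME:", which is how Snakemake normally reports jobs.
list_dry_run_rules() {
    local log_file="$1"
    # Keep only the NAME part of matching lines, then deduplicate.
    # LC_ALL=C makes the sort order independent of the user's locale.
    sed -nE 's/^(local)?rule ([^:]+):.*$/\2/p' "$log_file" | LC_ALL=C sort -u
}

# Example (log path taken from the checklist above):
#   list_dry_run_rules ~/kg2-build/build-kg2-snakemake-n.log
```

If the log format differs on a given Snakemake version, the pattern in the `sed` call is the only part that would need adjusting.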

ecwood, Sep 08 '23 18:09

Source predicate curie is missing from the YAML config file: ORPHANET:C057
Source predicate curie is missing from the YAML config file: RO:0002428
Source predicate curie is missing from the YAML config file: ORPHANET:C056
Source predicate curie is missing from the YAML config file: DrugCentral:reduce_risk
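
To turn warnings like these into a deduplicated list of CURIEs that still need entries in the YAML config file, something along these lines may help. It is only a sketch: it assumes the warnings appear one per line with exactly the prefix shown above, and the function name `missing_predicate_curies` and the log path passed to it are hypothetical.

```shell
#!/usr/bin/env bash
# Extract the unique predicate CURIEs reported as missing from the
# YAML config file. Assumption: each warning is a single line that
# begins with the exact prefix shown in the comment above.
missing_predicate_curies() {
    local log_file="$1"
    local prefix='Source predicate curie is missing from the YAML config file: '
    # Strip the prefix from matching lines and print only those lines,
    # then sort and deduplicate (LC_ALL=C keeps the order stable).
    sed -n "s/^${prefix}//p" "$log_file" | LC_ALL=C sort -u
}

# Example (the log path is a placeholder, not a known file name):
#   missing_predicate_curies /path/to/build.log
```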

ecwood, Sep 09 '23 21:09

@acevedol Looks like some of the checkboxes could be checked here (e.g., Neo4j endpoint); can we please get a status refresh on the checklist? Thanks.

saramsey, Oct 17 '23 16:10