# KG2.9.2c Rollout
THE BRANCH FOR THIS ROLLOUT IS: KG2.9.2c
THE ARAX-DATABASES.RTX.AI DIRECTORY FOR THIS ROLLOUT IS: /home/rtxconfig/KG2.9.2
## Prerequisites

### ssh access

To complete this workflow, you will need ssh access to:

- [x] `arax-databases.rtx.ai`
- [x] the self-hosted ARAX/KG2 instance, `arax.ncats.io` (see example configuration information below)
- [x] the self-hosted PloverDB instances, `kg2cploverN.rtx.ai`
- [x] the self-hosted Neo4j instances for KG2c, `kg2canonicalizedN.rtx.ai`
- [x] the self-hosted CI/CD instance, `cicd.rtx.ai`
- [x] the webserver for downloading the KG2c "lite" JSON file, `kg2webhost.rtx.ai`
### GitHub access

- [x] write access to the `RTXteam/PloverDB` project area
- [x] write access to the `RTXteam/RTX` project area
- [x] write access to the `ncats/translator-lfs-artifacts` project area (not critical, but needed for some final archiving steps; Amy Glen has access)
### AWS access

You will need:

- [x] access to the AWS Console (you'll need an IAM username; ask Stephen Ramsey about getting one)
- [x] IAM permission to start and stop instances in EC2 via the AWS Console
- [x] access to the S3 bucket `s3://rtx-kg2/` (ask Stephen Ramsey for access)
### Slack workspaces

You will also need access to the following Slack workspaces:

- [x] ARAXTeam (subscribe to `#deployment`)
- [x] NCATSTranslator (subscribe to `#devops-teamexpanderagent`)
Example ssh config for setting up login into `arax.ncats.io`:

```
Host arax.ncats.io
    User stephenr
    ProxyCommand ssh -i ~/.ssh/id_rsa_long -W %h:%p [email protected]
    IdentityFile ~/.ssh/id_rsa_long
    Hostname 172.31.53.16
```
## 1. Build and load KG2c:

- [x] merge `master` into the branch being used for this KG2 version (which would typically be named like `KG2.X.Yc`). Record this issue number in the merge message.
- [x] update the four hardcoded biolink version numbers in the branch (as needed), as shown in the sketch below:
  - [x] in `code/UI/OpenAPI/python-flask-server/openapi_server/openapi/openapi.yaml` (github; local)
  - [x] in `code/UI/OpenAPI/python-flask-server/KG2/openapi_server/openapi/openapi.yaml` (github; local)
  - [x] in `code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_ARAX.yaml` (github; local)
  - [x] in `code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_KG2.yaml` (github; local)
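A minimal sketch for locating those hardcoded strings before editing them (the grep pattern is an assumption; adjust it to however the biolink version appears in these files):

```bash
# From the root of the RTX checkout: list every line mentioning biolink
# in the four files named above, with line numbers for manual editing.
grep -in "biolink" \
    code/UI/OpenAPI/python-flask-server/openapi_server/openapi/openapi.yaml \
    code/UI/OpenAPI/python-flask-server/KG2/openapi_server/openapi/openapi.yaml \
    code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_ARAX.yaml \
    code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_KG2.yaml
```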
- [x] build a new KG2c on `buildkg2c.rtx.ai` from the branch (how-to is here)
  - [x] before starting the build:
    - [x] make sure there is enough disk space available on `arax-databases.rtx.ai` (need at least 100G, ideally >120G); delete old KG2 database directories as needed (warn the team on Slack in advance)
    - [x] make sure to choose to build a new synonymizer in `kg2c_config.json`, as described in the how-to
  - [x] after the build is done, verify it looks ok (see the sketch after this list):
    - [x] `node_synonymizer.sqlite` should be around 8-15 GB
    - [x] make sure `node_synonymizer.sqlite`'s last modified date is today (or whatever day the build was run)
    - [x] make sure `kg2c_lite.json.gz`'s last modified date is today (or whatever day the build was run)
    - [x] the entire build runtime (synonymizer + KG2c) shouldn't have been more than 24 hours
    - [x] the synonymizer and KG2c artifacts should have been auto-uploaded into the proper directory on `arax-databases.rtx.ai` (`/home/rtxconfig/KG2.X.Y`)
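A minimal sketch of those disk-space and artifact checks, run from your workstation (the `rtxconfig` username is inferred from the artifact path above; the `/home` mount point is an assumption):

```bash
# Before the build: check free space (want >=100G, ideally >120G).
ssh rtxconfig@arax-databases.rtx.ai "df -h /home"

# After the build: sizes and last-modified dates of the auto-uploaded artifacts.
ssh rtxconfig@arax-databases.rtx.ai "ls -lh /home/rtxconfig/KG2.X.Y/"
```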
- [x] load the new KG2c into neo4j at http://kg2-X-Yc.rtx.ai:7474/browser/ (how-to is here)
  - [x] verify the correct KG2 version was uploaded by running this query: `match (n {id:"RTX:KG2c"}) return n`
- [x] update `RTX/code/config_dbs.json` in the branch:
  - [x] update the synonymizer version number/path
  - [x] update the fda_approved_drugs version number/path
  - [x] update the autocomplete version number/path
  - [x] update the meta_kg version number/path
  - [x] update the kg2c sqlite version number/path
  - [x] update the KG2pre and KG2c Neo4j endpoints
- [x] copy the `kg2c_lite_2.X.Y.json.gz` file (which you can get from the S3 bucket `s3://rtx-kg2/kg2c_lite.json.gz`, but CHECK THE DATE AND MD5 HASH TO BE SURE YOU ARE NOT GETTING AN OLD FILE) to the directory `/home/ubuntu/nginx-document-root/` on `kg2webhost.rtx.ai` (see the sketch below)
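A hedged sketch of that copy (assumptions: you have AWS CLI credentials for the bucket, and `ubuntu` is the login user on `kg2webhost.rtx.ai`, inferred from the document-root path):

```bash
# Check the object's last-modified date before downloading.
aws s3 ls s3://rtx-kg2/kg2c_lite.json.gz

# Download, then verify the md5 against the expected value for this build.
aws s3 cp s3://rtx-kg2/kg2c_lite.json.gz kg2c_lite_2.X.Y.json.gz
md5sum kg2c_lite_2.X.Y.json.gz

# Stage it on the webserver.
scp kg2c_lite_2.X.Y.json.gz ubuntu@kg2webhost.rtx.ai:/home/ubuntu/nginx-document-root/
```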
- [x] load the new KG2c into Plover (how-to is here)
  - [x] start the new self-hosted PloverDB on `kg2cploverN.rtx.ai` (a smoke-test sketch follows this list):
    - [x] `ssh [email protected]`
    - [x] `cd PloverDB && git pull origin kg2.X.Yc`
    - [x] `./run.sh ploverimage2.X.Y plovercontainer2.X.Y "sudo docker"`
  - [x] update `config_dbs.json` in the branch for this KG2 version in the RTX repo to point to the new Plover for the 'dev' maturity level
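Once the container is up, a quick smoke test can be run from any checkout of the PloverDB repo, mirroring the test command used for the ITRB endpoints later in this issue (the self-hosted URL and port here are assumptions; substitute the actual address of the new instance):

```bash
# Run Plover's own test suite against the freshly started self-hosted endpoint.
cd PloverDB && pytest -v test/test.py --endpoint https://kg2cploverN.rtx.ai:9990
```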
## 2. Rebuild downstream databases:

The following databases should be rebuilt, and copies of them should be put in `/home/rtxconfig/KG2.X.Y` on `arax-databases.rtx.ai`. Please use this kind of naming format: `mydatabase_v1.0_KG2.X.Y.sqlite` (see the sketch below).

- [x] NGD database (how-to is here)
- [x] refreshed XDTD database @chunyuma
- [ ] XDTD database @chunyuma (may be skipped - depends on the changes in this KG2 version)

NOTE: As databases are rebuilt, `RTX/code/config_dbs.json` will need to be updated to point to their new paths! Push these changes to the branch for this KG2 version, unless the rollout of this KG2 version has already occurred, in which case you should push to `master` (but first follow the steps described here).
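A minimal sketch of staging a rebuilt database under that naming convention (the database filename is hypothetical; the `rtxconfig` username is inferred from the destination path):

```bash
# Copy a rebuilt database into the versioned directory on arax-databases.rtx.ai,
# following the mydatabase_v1.0_KG2.X.Y.sqlite naming format.
scp curie_to_pmids_v1.0_KG2.X.Y.sqlite \
    rtxconfig@arax-databases.rtx.ai:/home/rtxconfig/KG2.X.Y/
```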
## 3. Update the ARAX codebase:

All code changes should go in the branch for this KG2 version!

- [x] regenerate the KG2c test triples file in the branch for this KG2 version @acevedol
  - [x] ensure the new KG2c Neo4j is currently running
  - [x] check out the branch and pull to get the latest changes (this is important for ensuring the correct KG2c Neo4j is used)
  - [x] run `create_json_of_kp_predicate_triples.py`
  - [x] push the regenerated file to `RTX/code/ARAX/KnowledgeSources/RTX_KG2c_test_triples.json`
- [x] update Expand code as needed
- [x] update any other modules as needed
- [x] test everything together (see the sketch after this list):
  - [x] check out the branch and pull to get the latest changes
  - [x] locally set `force_local = True` in `ARAX_expander.py` (to avoid using the old KG2 API)
  - [x] then run the entire ARAX pytest suite (i.e., `pytest -v`)
  - [x] address any failing tests
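A sketch of that local test run (the test directory path is taken from the pytest instructions in step 5; everything else comes from the steps above):

```bash
# In your local checkout of the branch for this KG2 version:
git checkout KG2.X.Yc && git pull origin KG2.X.Yc

# Temporarily set force_local = True in ARAX_expander.py so Expand queries
# the new KG2c directly instead of the old KG2 API, then run the suite:
cd code/ARAX/test && pytest -v
```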
- [x] update the KG2 and ARAX version numbers in the appropriate places (in the branch for this KG2 version); a sketch for checking the current values follows this list
  - [x] Bump version on line 12 in `RTX/code/UI/OpenAPI/python-flask-server/openapi_server/openapi/openapi.yaml` (github; local); the major and minor release numbers are kept synchronous with the TRAPI version; just bump the patch release version (least significant digit)
  - [x] Bump version on line 12 in `RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/openapi/openapi.yaml` (github; local); the first three digits are kept synchronous with the KG2 release version
  - [x] Bump version on line 4 in `RTX/code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_ARAX.yaml` (github; local); same as for the ARAX `openapi.yaml` file
  - [x] Bump version on line 4 in `RTX/code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_KG2.yaml` (github; local); same as for the KG2 `openapi.yaml` file
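To eyeball the current version strings before and after bumping them (the line numbers are the ones cited above):

```bash
# Print the version lines the checklist refers to.
sed -n '12p' RTX/code/UI/OpenAPI/python-flask-server/openapi_server/openapi/openapi.yaml
sed -n '12p' RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/openapi/openapi.yaml
sed -n '4p'  RTX/code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_ARAX.yaml
sed -n '4p'  RTX/code/UI/OpenAPI/python-flask-server/RTX_OA3_TRAPI1.4_KG2.yaml
```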
## 4. Pre-upload databases:

Before rolling out, we need to pre-upload the new databases (referenced in `config_dbs.json`) to `arax.ncats.io` and the ITRB SFTP server. These steps can be done well in advance of the rollout; it doesn't hurt anything to do them early.

- [x] make sure `arax.ncats.io` has at least 100G of disk space free (see the sketch after this list); delete old KG2 databases to free up space as needed (before doing this, warn the team on the `#deployment` Slack channel on the `ARAXTeam` workspace)
- [x] copy the new databases from `arax-databases.rtx.ai` to `arax.ncats.io:/data/orangeboard/databases/KG2.X.Y`; example for KG2.8.0:
  - [x] `ssh [email protected]`
  - [x] `cd /data/orangeboard/databases/`
  - [x] `mkdir -m 777 KG2.8.0`
  - [x] `scp [email protected]:/home/rtxconfig/KG2.8.0/*2.8.0* KG2.8.0/`
- [x] upload the new databases and their md5 checksums to ITRB's SFTP server using the steps detailed here
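A sketch of the disk-space check from the first item (the `/data` mount point is inferred from the destination directory above):

```bash
# Confirm at least 100G free on the databases volume before copying.
ssh arax.ncats.io "df -h /data"
```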
## 5. Rollout new KG2c version to arax.ncats.io development endpoints

- [x] Notify the `#deployment` channel in the `ARAXTeam` Slack workspace that you are rolling out a new version of KG2c to the various `arax.ncats.io` development endpoints.
- [x] for the `RTXteam/RTX` project, merge the `master` branch into the branch for this KG2 version. Record this issue number in the merge message.
- [x] for the `RTXteam/RTX` project, merge this KG2 version's branch back into the `master` branch. Record this issue number in the merge message.
- [x] to roll `master` out to a specific ARAX or KG2 endpoint named `/EEE`, do the following steps (consolidated in the sketch after this list):
  - [x] If you are offsite, log into your office VPN (there are strict IP address block restrictions on client IPs that can ssh into `arax.ncats.io`)
  - [x] Log in to `arax.ncats.io`: `ssh arax.ncats.io` (you need to have previously set up your username, etc. in `~/.ssh/config`; see the top of this issue template for an example)
  - [x] Enter the `rtx1` container: `sudo docker exec -it rtx1 bash`
  - [x] Become user `rt`: `su - rt`
  - [x] Go to the directory of the code repo for the `EEE` endpoint: `cd /mnt/data/orangeboard/EEE/RTX`
  - [x] Make sure it is on the master branch: `git branch` (should show `* master`)
  - [x] Stash any updated files (this is IMPORTANT): `git stash`
  - [x] Update the code: `git pull origin master`
  - [x] Restore updated files: `git stash pop`
  - [x] If there have been changes to `requirements.txt`, make sure to do `pip3 install -r code/requirements.txt`
  - [x] Become superuser: `exit` (exiting out of your shell session as user `rt` should return you to a `root` user session)
  - [x] Restart the service: `service RTX_OpenAPI_EEE restart`
  - [x] View the STDERR logfile as the service starts up: `tail -f /tmp/RTX_OpenAPI_EEE.elog`
  - [x] Test the endpoint via the web browser interface to make sure it is working
  - [x] Run version query: `{"nodes": {"n00": {"ids": ["RTX:KG2"]}}, "edges": {}}` (currently broken: #2306)
  - [x] look up `RTX:KG2` in the Synonyms tab in the UI
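The same steps, consolidated for one concrete endpoint (`EEE` = `devED` here, purely as an example); run these after logging into `arax.ncats.io`:

```bash
sudo docker exec -it rtx1 bash            # enter the rtx1 container
su - rt                                   # become user rt
cd /mnt/data/orangeboard/devED/RTX        # repo for the devED endpoint
git branch                                # should show: * master
git stash                                 # IMPORTANT: stash local edits first
git pull origin master
git stash pop
pip3 install -r code/requirements.txt     # only if requirements.txt changed
exit                                      # back to a root session
service RTX_OpenAPI_devED restart
tail -f /tmp/RTX_OpenAPI_devED.elog       # watch STDERR during startup
```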
- [x] roll `master` out to the various `arax.ncats.io` development endpoints. Usually in this order:
  - [x] `devED`
  - [x] `kg2beta`
  - [x] `beta`
  - [x] `kg2test`
  - [x] `test`
  - [x] `devLM`
- [x] inside the Docker `rtx1` container as user `rt`, run the pytest suite on the various ARAX endpoints:
  - [x] `cd /mnt/data/orangeboard/EEE/RTX/code/ARAX/test && pytest -v`
  - [x] Make sure all tests are passing
- [x] update our CI/CD testing instance with the new databases:
  - [x] `ssh [email protected]`
  - [x] `cd RTX`
  - [x] `git pull origin master`
  - [x] If there have been changes to `requirements.txt`, make sure to do `~/venv3.9/bin/pip3 install -r requirements.txt`
  - [x] `sudo bash`
  - [x] `mkdir -m 777 /mnt/data/orangeboard/databases/KG2.X.Y`
  - [x] `exit`
  - [x] `~/venv3.9/bin/python3 code/ARAX/ARAXQuery/ARAX_database_manager.py --mnt --skip-if-exists --remove_unused`
  - [x] run a Test Build through GitHub Actions, to ensure that the CI/CD is working with the updated databases; all of the pytest tests that are not skipped should pass
## 6. Final items/clean up:

- [x] turn off the old KG2c version's neo4j instance
  - [x] determine the DNS A record hostname for `kg2-X-Zc.rtx.ai` (where `Z` is one less than the new minor release version): run `nslookup kg2-X-Zc.rtx.ai` (it will return either `kg2canonicalized.rtx.ai` or `kg2canonicalized2.rtx.ai`; we'll call it `kg2canonicalizedN.rtx.ai`)
  - [x] message the `#deployment` channel in the `ARAXTeam` Slack workspace that you will be stopping the `kg2canonicalizedN.rtx.ai` Neo4j endpoint
  - [x] `ssh [email protected]`
  - [x] `sudo service neo4j stop`
  - [x] In the AWS console, stop the instance `kg2canonicalizedN.rtx.ai`
- [x] turn off the old KG2c version's plover instance
  - [x] determine the DNS A record hostname for `kg2-X-Zcplover.rtx.ai` (where `Z` is one less than the new minor release version): run `nslookup kg2-X-Zcplover.rtx.ai` (it will return either `kg2cplover.rtx.ai`, `kg2cplover2.rtx.ai`, or `kg2cplover3.rtx.ai`; we'll call it `kg2cploverN.rtx.ai`)
  - [x] message the `#deployment` channel in the `ARAXTeam` Slack workspace that you will be stopping the `kg2-X-Zcplover.rtx.ai` PloverDB service
  - [x] Log into `kg2cploverN.rtx.ai`: `ssh [email protected]`
  - [x] Stop the PloverDB container: `sudo docker stop plovercontainer2.X.Z` (if you are not sure of the container name, use `sudo docker container ls -a` to get the container name)
- [x] turn off the new KG2pre version's neo4j instance (Coordinate with the KG2pre team before doing this)
- [x] deploy new PloverDB service into ITRB CI that is backed by the new KG2c database:
  - [x] merge PloverDB `main` branch into `kg2.X.Yc` branch (if `main` has any commits ahead of `kg2.X.Yc`). Reference this issue (via its full GitHub URL) in the merge message.
  - [x] merge PloverDB `kg2.X.Yc` branch into `main` branch. Reference this issue (via its full GitHub URL) in the merge message.
  - [x] verify that `kg_config.json` in the `main` branch of the Plover repo points to the new `kg2c_lite_2.X.Y.json.gz` file
  - [x] wait about 60 minutes for Jenkins to build the PloverDB project and deploy it to `kg2cploverdb.ci.transltr.io`
  - [x] run Plover tests to verify it's working: `cd PloverDB && pytest -v test/test.py --endpoint https://kg2cploverdb.ci.transltr.io`
  - [x] run the ARAX pytest suite with the NCATS endpoint plugged in (locally change the URL in `RTX/code/config_dbs.json` and set `force_local = True` in Expand)
  - [x] if all tests pass, update `RTX/code/config_dbs.json` in the `master` branch to point to the ITRB Plover endpoints (all maturity levels): (`dev`: `kg2cploverdb.ci.transltr.io`; `test`: `kg2cploverdb.test.transltr.io`; `prod`: `kg2cploverdb.transltr.io`)
  - [x] push the latest `master` branch code commit to the various endpoints on `arax.ncats.io` that you previously updated (this is in order to get the changed `config_dbs.json` file) and restart ARAX and KG2 services
  - [x] check the Test Build (CI/CD tests) to make sure all non-skipped pytest tests have passed
- [x] turn off the self-hosted plover endpoint for the new version of KG2c
  - [x] message the `#deployment` channel to notify people what you are about to do
  - [x] `ssh [email protected]`
  - [x] `sudo docker container ls -a` (gives you the name of the container; assume it is `plovercontainer2.X.Y`)
  - [x] `sudo docker stop plovercontainer2.X.Y`
- [x] verify once more that ARAX is still working properly, even with the self-hosted new-KG2c-version PloverDB service turned off
- [x] upload the new `kg2c_lite_2.X.Y.json.gz` file to the translator-lfs-artifacts repo
- [x] upload the new `kg2_nodes_not_in_sri_nn.tsv` file to the translator-lfs-artifacts repo
## 7. Roll-out to ITRB TEST

- [ ] In GitHub, for the RTXteam/RTX project, merge `master` to `itrb-test`. Record this issue number in the merge message.
- [ ] In GitHub, for the RTXteam/PloverDB project, merge `main` to `itrb-test`. Record this issue number in the merge message.
- [ ] Tag the release using the `master` branch of the RTXteam/RTX project (see the sketch after this list).
- [ ] Tag the release using the `main` branch of the RTXteam/PloverDB project.
- [ ] Via a message in the `#devops-teamexpanderagent` channel in the `NCATSTranslator` Slack workspace, put in a request to `@Sarah Stemann` to open a ticket to re-deploy ARAX, RTX-KG2, and PloverDB to ITRB test
- [ ] Monitor the `#devops-teamexpanderagent` channel to follow the roll-out of the updated services in ITRB test (i.e., to see if there are any errors reported by ITRB)
- [ ] Check proper functioning of `kg2cploverdb.test.transltr.io`
  - [ ] from any git checkout of the `RTXteam/PloverDB` project's `main` branch, do: `cd PloverDB && pytest -v test/test.py --endpoint https://kg2cploverdb.test.transltr.io`
- [ ] Check proper functioning of `kg2.test.transltr.io` (look at messages log `debug` messages to verify that it is indeed querying `kg2cploverdb.test.transltr.io`)
- [ ] Check proper functioning of `arax.test.transltr.io` (look at messages log `debug` messages to verify that ARAX-Expand is indeed querying `kg2.test.transltr.io`)
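A hedged sketch of the tagging steps (the tag name format and the issue-number placeholder are assumptions; use whatever convention the team has applied to prior releases):

```bash
# In a checkout of RTXteam/RTX: tag the release from master and push the tag.
git checkout master && git pull origin master
git tag -a KG2.X.Yc -m "KG2.X.Yc rollout (see rollout issue #NNNN)"
git push origin KG2.X.Yc
# Repeat in RTXteam/PloverDB, tagging from the main branch instead.
```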
## 8. Roll-out to ITRB PRODUCTION

- [ ] In GitHub, merge `master` to `production`. Record this issue number in the merge message.
- [ ] Via a message in the `#devops-teamexpanderagent` channel in the `NCATSTranslator` Slack workspace, put in a request to `@Sarah Stemann` to open a ticket to re-deploy ARAX, RTX-KG2, and PloverDB to ITRB production
- [ ] Monitor the `#devops-teamexpanderagent` channel to follow the roll-out of the updated services in ITRB production, i.e., to see if there are any errors reported by ITRB (this could take several days, as there is a formal approval process for deployments to ITRB production)
- [ ] Check proper functioning of `kg2cploverdb.transltr.io`
- [ ] Check proper functioning of `kg2.transltr.io` (look at messages log `debug` messages to verify that it is indeed querying `kg2cploverdb.transltr.io`); a smoke-test sketch follows this list
- [ ] Check proper functioning of `arax.transltr.io` (look at messages log `debug` messages to verify that ARAX-Expand is indeed querying `kg2.transltr.io`)
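For the functional checks above, a minimal command-line smoke test can supplement the UI checks; it reuses the one-node version query from step 5 (the TRAPI path below mirrors the arax.ncats.io endpoints and is an assumption for the transltr.io deployments):

```bash
# POST a trivial one-node TRAPI query to the production KG2 endpoint and
# confirm that a well-formed response comes back.
curl -s -X POST https://kg2.transltr.io/api/rtxkg2/v1.4/query \
    -H "Content-Type: application/json" \
    -d '{"message": {"query_graph": {"nodes": {"n00": {"ids": ["RTX:KG2"]}}, "edges": {}}}}'
```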