cytokit
Document how to use cytokit on gcloud
Some preliminary notes:
Cytokit on gcloud
Spin up a machine on gcloud with 2 Nvidia K80 GPUs:
gcloud beta compute \
--project=hammerlab-chs \
instances create \
cytokit \
--zone=us-east1-c \
--machine-type=n1-highmem-16 \
--subnet=default \
--network-tier=PREMIUM \
--maintenance-policy=TERMINATE \
--service-account=195534064580-compute@developer.gserviceaccount.com \
--scopes="https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append" \
--accelerator=type=nvidia-tesla-k80,count=2 \
--tags=http-server,https-server \
--image=ubuntu-1604-xenial-v20180627 \
--image-project=ubuntu-os-cloud \
--boot-disk-size=1000GB \
--boot-disk-type=pd-standard \
--boot-disk-device-name=cytokit
gcloud compute ssh --zone us-east1-c cytokit
Inspiration: https://medium.com/google-cloud/jupyter-tensorflow-nvidia-gpu-docker-google-compute-engine-4a146f085f17
# Install all the things as root to work around many issues
sudo su -
#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda; then
# The 16.04 installer works with 16.10.
curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
apt-get update
apt-get install cuda -y
fi
# Sanity check (should see all GPUs listed here)
nvidia-smi
#!/bin/bash
# install packages to allow apt to use a repository over HTTPS:
apt-get -y install \
apt-transport-https ca-certificates curl software-properties-common
# add Docker’s official GPG key:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# set up the Docker stable repository.
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
# update the apt package index:
apt-get -y update
# finally, install docker
apt-get -y install docker-ce
wget https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
dpkg -i nvidia-docker*.deb
# Sanity check (should run without any issues)
nvidia-docker run --rm nvidia/cuda nvidia-smi
exit # from root
# Sudoless docker setup:
sudo usermod -aG docker $USER
sudo systemctl restart docker
exit # logout completely
gcloud compute ssh ... # new login
# Sanity check (should run as a user)
docker run hello-world
nvidia-docker run --rm nvidia/cuda nvidia-smi
Setup is done. Let's pull in relevant programs and scripts:
cd $HOME
mkdir repos
cd repos
git clone https://github.com/hammerlab/cytokit.git && mv cytokit codex
git clone https://github.com/hammerlab/cvutils.git
git clone https://github.com/hammerlab/cell-image-analysis.git
cat << EOF >> ~/cytokit.env
export CODEX_DATA_DIR=$HOME/data
export CODEX_REPO_DIR=$HOME/repos/codex
export CVUTILS_REPO_DIR=$HOME/repos/cvutils
export CODEX_ANALYSIS_REPO_DIR=$HOME/repos/cell-image-analysis
EOF
source ~/cytokit.env
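A quick way to confirm the env file took effect is to loop over the expected variable names after sourcing. A minimal sketch; it writes a throwaway copy of the file rather than touching `~/cytokit.env`, so it is safe to run anywhere:

```shell
# Write a demo copy of the env file, source it, and verify every
# variable is exported and non-empty.
set -e
DEMO_ENV=$(mktemp)
cat << 'EOF' > "$DEMO_ENV"
export CODEX_DATA_DIR=$HOME/data
export CODEX_REPO_DIR=$HOME/repos/codex
export CVUTILS_REPO_DIR=$HOME/repos/cvutils
export CODEX_ANALYSIS_REPO_DIR=$HOME/repos/cell-image-analysis
EOF
source "$DEMO_ENV"
for v in CODEX_DATA_DIR CODEX_REPO_DIR CVUTILS_REPO_DIR CODEX_ANALYSIS_REPO_DIR; do
  eval "val=\$$v"                      # portable indirect expansion
  [ -n "$val" ] || { echo "unset: $v"; exit 1; }
  echo "$v=$val"
done
```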
mkdir -p $CODEX_DATA_DIR
cd $CODEX_DATA_DIR
gsutil cp -r gs://musc-codex/models .
cd $CODEX_DATA_DIR
mkdir 20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5
cd 20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5
gsutil -m cp -r gs://musc-codex/datasets/20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5 .
mv 20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5 raw
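Before kicking off the pipeline, it can be worth checking that the data landed in the layout the commands above assume: each experiment directory under the data dir holds a `raw/` subdirectory. A sketch of that check, using a scratch `DEMO_DATA_DIR` as a stand-in for the real `$CODEX_DATA_DIR` so it runs anywhere:

```shell
# Every experiment directory should contain raw/; models/ is skipped
# since it holds network weights, not an experiment.
set -e
DEMO_DATA_DIR=$(mktemp -d)
mkdir -p "$DEMO_DATA_DIR/20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5/raw"
for exp in "$DEMO_DATA_DIR"/*/; do
  name=$(basename "$exp")
  [ "$name" = "models" ] && continue
  if [ -d "${exp}raw" ]; then
    echo "ok: $name"
  else
    echo "MISSING raw/: $name"
    exit 1
  fi
done
```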
We now have all we need (scripts/data). Let's run the analysis:
cd $CODEX_REPO_DIR/docker
nvidia-docker build -t codex-analysis -f Dockerfile.dev .
nvidia-docker run -ti -p 8888:8888 -p 6006:6006 -p 8787:8787 -p 8050:8050 --rm \
-v $CODEX_DATA_DIR:/lab/data \
-v $CODEX_REPO_DIR:/lab/repos/codex \
-v $CODEX_ANALYSIS_REPO_DIR:/lab/repos/codex-analysis \
-v $CVUTILS_REPO_DIR:/lab/repos/cvutils \
-e CODEX_CYTOMETRY_2D_MODEL_PATH=/lab/data/models/r0.3/nuclei_model.h5 \
-e CODEX_CACHE_DIR=/lab/data/.codex/cache \
codex-analysis
You can now connect to the Jupyter notebook running on your gcloud instance via its public IP. Once connected, create a new console tab and run the following:
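If the firewall rules don't expose port 8888 publicly, an SSH tunnel is a simple alternative (a suggestion, not from the original notes; flags after `--` are passed through to ssh). The command is only built and echoed here so the sketch runs anywhere; execute it verbatim from your workstation, then browse to http://localhost:8888:

```shell
# Build the tunnel command; zone and instance name match the setup above.
TUNNEL_CMD="gcloud compute ssh --zone us-east1-c cytokit -- -L 8888:localhost:8888"
echo "$TUNNEL_CMD"
```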
#!/usr/bin/env bash
EXP_NAME="20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5"
CODEX_DATA_DIR=/lab/data
EXP_DIR=$CODEX_DATA_DIR/$EXP_NAME
CODEX_ANALYSIS_REPO_DIR=/lab/repos/codex-analysis
EXP_CONF=$CODEX_ANALYSIS_REPO_DIR/config/experiment/$EXP_NAME/experiment.yaml
EXP_OUT=$EXP_DIR/output/v01
echo "Processing experiment $EXP_NAME"
cytokit processor run \
--config-path=$EXP_CONF \
--data-dir=$EXP_DIR/raw \
--output-dir=$EXP_OUT \
--run-drift-comp=False \
--run-best-focus=True \
--run-deconvolution=True \
--gpus=[0,1] --py-log-level=info
cytokit operator \
extract \
--config-path=$EXP_CONF \
--data-dir=$EXP_OUT \
--name='best_z_segm' \
--channels=['proc_dapi','proc_cd4','proc_cd8','cyto_cell_boundary','cyto_nucleus_boundary'] - \
montage \
--name='best_z_segm' \
--extract-name='best_z_segm'
cytokit analysis aggregate_cytometry_statistics \
--config-path=$EXP_CONF \
--data-dir=$EXP_OUT \
--mode='best_z_plane'
The full run should finish in under 30 minutes. You can then gsutil cp the output to a bucket and shut your gcloud instance down.
Need to clean this up a bit.