OpenFold Local Jupyter Notebook 📔 | Metrics, Plots, Concurrent Inference
Overview
This PR introduces a fully featured Local Notebook for performing inference, obtaining metrics, ranking the best model, and generating plots in a structured and reproducible manner, particularly for experimentation with large datasets.
The metrics are similar to those in the Colab notebook, but the workflow is adapted for a local installation with Docker. The notebook also introduces parallel execution to take advantage of multiple GPUs.
It operates by executing Docker commands through the Docker client and calling OpenFold functions inside a standalone environment. This keeps the OpenFold codebase unaffected: the notebook acts purely as a client that reproduces the metrics and results of the Colab notebook locally.
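Under the hood, this pattern boils down to launching the OpenFold image through the Docker SDK for Python. A minimal sketch is shown below; the image tag, command, and GPU runtime flag are illustrative assumptions rather than the notebook's exact invocation:
import docker
# Minimal sketch: run a command inside the OpenFold image (names are assumptions)
client = docker.from_env()
logs = client.containers.run(
    "openfold:latest",                            # assumed image tag from the Docker build
    "python3 run_pretrained_openfold.py --help",  # assumed entry point
    runtime="nvidia",                             # assumption: NVIDIA runtime for GPU access
    remove=True,                                  # clean up the container after it exits
)
print(logs.decode())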
Usage
Refer to the instructions in notebooks/OpenFoldLocal.ipynb.
Set up the notebook
First, build OpenFold using Docker. Follow this guide.
Then, go to the notebook folder
cd notebooks
Create an environment to run Jupyter with the requirements
mamba create -n openfold_notebook python==3.10
Activate the environment
mamba activate openfold_notebook
Install the requirements
pip install -r src/requirements.txt
Start your Jupyter server in the current folder
jupyter lab . --ip="0.0.0.0"
Access the notebook URL or connect remotely using VSCode.
Inference example
Initializing the client:
import docker
from src.inference import InferenceClientOpenFold
# You can also use a remote docker server
docker_client = docker.from_env()
# Initialize the OpenFold Docker client setting the database path
databases_dir = "/path/to/databases"
openfold_client = InferenceClientOpenFold(databases_dir, docker_client)
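To use a remote Docker daemon instead of the local socket, the client can be pointed at the daemon's API endpoint; the host and port below are placeholders:
# Connect to a remote Docker daemon (host and port are placeholders)
docker_client = docker.DockerClient(base_url="tcp://remote-host:2375")
openfold_client = InferenceClientOpenFold(databases_dir, docker_client)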
Running Inference:
# For multiple sequences, separate them with a colon `:`
input_string = "DAGAQGAAIGSPGVLSGNVVQVPVHVPVNVCGNTVSVIGLLNPAFGNTCVNA:AGETGRTGVLVTSSATNDGDSGWGRFAG"
model_name = "multimer" # or "monomer"
weight_set = 'AlphaFold' # or 'OpenFold'
# Run inference
run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_string)
Using a file:
input_file = "/path/to/test.fasta"
run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_file)
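Several inferences can also run concurrently, for example one per GPU. The sketch below only uses the run_inference call shown above and dispatches it from a thread pool; the input files are placeholders, and how runs are assigned to specific GPUs is left to the client:
from concurrent.futures import ThreadPoolExecutor
input_files = ["/path/to/test1.fasta", "/path/to/test2.fasta"]  # placeholder inputs
# Submit one inference run per input; max_workers bounds the number of concurrent runs
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(openfold_client.run_inference, weight_set, model_name, inference_input=f)
        for f in input_files
    ]
    run_ids = [future.result() for future in futures]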