RFpeptides protocol

Open alejandromontesa-unizar opened this issue 5 months ago • 12 comments

Hi everybody, I am trying to follow the procedure in the SI of the RFpeptides paper to design macrocyclic binders against a target protein. I have launched RFdiffusion with the cyclic flags and obtained .pdb files of the complex of my protein and the binder. Then I ran ProteinMPNN with the options listed in the paper.

The procedure then says to run FastRelax. However, I do not know how to generate the .pdb of the complex with the ProteinMPNN sequence in place of the poly-Gly, and the paper does not say how they modelled this. Any advice?

Image

alejandromontesa-unizar avatar Jul 24 '25 09:07 alejandromontesa-unizar

I know that LigandMPNN has an option where it can output a PDB structure, though I don't think plain ProteinMPNN has that option.

Normally the approach is to take the sequences output by ProteinMPNN and then feed those into the AI structure prediction tool of your choice (historically AlphaFold, but recently Boltz, Chai, or ColabFold are also options). Usually there's a quality filtering step, where you discard sequences which fold to structures too dissimilar from the input backbone. (Which is the reason I didn't mention RoseTTAFold in the previous list -- since RFdiffusion is based on RoseTTAFold, it often helps to have an orthogonal prediction program to help validate the structure.)
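To make the filtering step concrete, here is a minimal sketch of one common way to measure "too dissimilar": the C-alpha RMSD between the designed backbone and the predicted structure after optimal superposition (Kabsch algorithm). The function names and the 1.5 Å cutoff are placeholders of mine, not values from the RFpeptides paper, and extracting the coordinates from the two .pdb files is left out.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both point sets
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ P0.T).T - Q0
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def passes_filter(design_ca, predicted_ca, cutoff=1.5):
    """True if the prediction stays within `cutoff` Å of the design (cutoff is a placeholder)."""
    return kabsch_rmsd(design_ca, predicted_ca) <= cutoff
```

Both arrays must list the same residues in the same order. In practice people usually combine an RMSD filter like this with the predictor's own confidence scores.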

roccomoretti avatar Jul 24 '25 16:07 roccomoretti

Hi, sorry for not answering sooner. I understand the process you suggest; it is what the RFdiffusion paper does, for example, to generate linear binders against a target. However, reading the SI of the RFpeptides protocol carefully, it does not seem to be the process they followed, so I am a little confused. Thanks anyway for the advice.

Also, maybe this is a silly question, but it is getting on my nerves, haha. In the output from RFdiffusion, there are no side chains in either the binder or the target. Is it necessary to "reconstruct" the side chains of the target before running ProteinMPNN/LigandMPNN?

alejandromontesa-unizar avatar Jul 28 '25 20:07 alejandromontesa-unizar

From https://www.nature.com/articles/s41589-025-01929-w#Sec7

Macrocyclic peptide monomers and binders were designed with RFpeptides using a three-stage pipeline: backbone generation using RFdiffusion with the cyclic offset applied to the peptide chains, followed by sequence design using ProteinMPNN and, finally, structure prediction of the designed peptide–target complexes using either AfCycDesign and/or RoseTTAFold with the cyclic offset applied to the peptide.

They don't mention it explicitly, but I'm assuming that the ProteinMPNN results were sequence only, and that they only got the 3D structures of the designs after the AfCycDesign/RoseTTAFold prediction.

Regarding your second point, it's not unexpected that RFdiffusion removes sidechains from the representation of any input position that it passes through. (RFdiffusion works entirely with protein backbones, and does not have any internal representation of sidechain conformations.) That's not a limitation for the ProteinMPNN protocol. ProteinMPNN also works on backbone coordinates (versus sidechain coordinates), and is explicitly designed to work with the outputs of RFdiffusion.

roccomoretti avatar Jul 28 '25 22:07 roccomoretti

I have used the ProteinMPNN-FastRelax combo in this repo (https://github.com/nrbennet/dl_binder_design) to generate a sequence and relax it onto the backbone. But you still need to use AfCycDesign to predict and filter.

Raef88 avatar Aug 26 '25 13:08 Raef88

I have used this combo for binders but hadn't tried it for cyclic peptides, how silly... Thanks for the advice.

alejandromontesa-unizar avatar Aug 26 '25 13:08 alejandromontesa-unizar

Hi @Raef88, I tried dl_binder_design, but with the ProteinMPNN + FastRelax combo the binder is not cyclized, even though the output from RFdiffusion is good (it is a cycle). Do I have to modify something in the process or add some extra flags?

alejandromontesa-unizar avatar Aug 27 '25 16:08 alejandromontesa-unizar

Hi @alejandromontesa-unizar, have you reproduced the RFpeptides pipeline? Do you know where to find a Python script version of AfCycDesign (instead of the Colab notebook) for large-scale structure prediction?

AzusaXuan avatar Sep 16 '25 10:09 AzusaXuan

Hi @AzusaXuan, I am still working on it. I generated the cyclic binders with RFpeptides for my target, then used the ProteinMPNN scripts from the dl_binder_design GitHub repo to generate .pdb files of the cyclic peptides with the target (the binder carries the designed sequence). Then I installed AfCycDesign with ColabDesign to generate the cross-validation complexes. I have used some in-house scripts for large-scale prediction, but I don't know of a Python script version, sorry.

alejandromontesa-unizar avatar Sep 16 '25 10:09 alejandromontesa-unizar

Hi @Raef88, I tried dl_binder_design, but with the ProteinMPNN + FastRelax combo the binder is not cyclized, even though the output from RFdiffusion is good (it is a cycle). Do I have to modify something in the process or add some extra flags?

Hi @alejandromontesa-unizar, if I understood correctly, you mean the chain is cyclic but not closed? PyMOL and Chimera don't show the closed conformation, as another person noted; however, another program, ICM-Browser, does show the C-N closure.
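Since viewers disagree about whether to draw the bond, one viewer-independent check is to measure the distance between the N of the first residue and the C of the last residue of the peptide chain. A peptide bond is about 1.33 Å long, so a closed ring should give a value close to that. This is only a sketch that reads fixed-column PDB ATOM records; the function names and the 1.5 Å threshold are my own choices, and you need to pass whichever chain ID your binder has.

```python
import math

def terminal_cn_distance(pdb_text, chain_id):
    """Distance (Å) between N of the first and C of the last residue of a chain."""
    first_n = None
    last_c = None
    for line in pdb_text.splitlines():
        # Only well-formed ATOM records of the requested chain are considered
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[21] != chain_id:
            continue
        atom_name = line[12:16].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if atom_name == "N" and first_n is None:
            first_n = xyz      # backbone N of the first residue
        if atom_name == "C":
            last_c = xyz       # overwritten until the last residue is reached
    if first_n is None or last_c is None:
        raise ValueError(f"chain {chain_id!r} has no backbone N/C atoms")
    return math.dist(first_n, last_c)

def looks_closed(pdb_text, chain_id, max_dist=1.5):
    """True if the terminal C-N distance is bond-like (threshold is a placeholder)."""
    return terminal_cn_distance(pdb_text, chain_id) <= max_dist
```

A distance of several Å after relaxing would suggest the relax step itself did not treat the peptide as cyclic, rather than a display issue.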

Raef88 avatar Sep 16 '25 10:09 Raef88

No, what I mean is that from dl_binder_design you only need the ProteinMPNN part, without a relax cycle, so it generates a .pdb of the binder-target complex with a designed sequence for the binder. Then you copy the FastRelax script given in the SI of the RFpeptides paper and run it on every .pdb, which gives you a .pdb of the relaxed target-binder complex. You can then run AfCycDesign with that .pdb. Remember to omit Cys (omit_AAs) in the ProteinMPNN step to ensure there are no internal disulphide bonds.
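As a cheap sanity check on the Cys point, something like the following can confirm that no designed sequence contains a cysteine before spending time on relax and prediction. It is a sketch that assumes the designed sequences are available as FASTA text; the function names are made up, and if your FASTA lists several chains on one sequence line you would need to split out the binder chain first.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def cysteine_free(records):
    """Keep only the records whose sequence has no cysteine (one-letter code C)."""
    return [(h, s) for h, s in records if "C" not in s.upper()]
```

If any sequence is dropped here, the Cys omission was probably not picked up by the ProteinMPNN run.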

alejandromontesa-unizar avatar Sep 16 '25 11:09 alejandromontesa-unizar

Image

This is the .xml for the relax part, to be run with Rosetta. Here is an in-house script for running it on a folder of .pdb files:

import os
from pyrosetta import init, pose_from_pdb
from pyrosetta.rosetta.protocols.rosetta_scripts import XmlObjects

# Initialize PyRosetta once
init("-beta_nov16")

# Load XML once
xml = "fast_relax.xml"
objs = XmlObjects.create_from_file(xml)
fr = objs.get_mover("full_relax_complex")
pcm = objs.get_mover("pcm")

# Input/output folders
inp_dir = "./outputs/FUT8-cyclic-second"
out_dir = "./outputs/FUT8-cyclic-relaxed2"

# Create output dir if it doesn't exist
os.makedirs(out_dir, exist_ok=True)

# Only process files whose names start with this prefix
prefix = "length12"

# Loop over PDB files
for pdb_file in sorted(os.listdir(inp_dir)):
    if not pdb_file.endswith(".pdb"):
        continue
    if not pdb_file.startswith(prefix):
        continue

    pdb_path = os.path.join(inp_dir, pdb_file)
    name, _ = os.path.splitext(pdb_file)

    print(f"Processing {pdb_file}...")

    # Load pose
    pose = pose_from_pdb(pdb_path)

    # Apply movers
    pcm.apply(pose)
    fr.apply(pose)
    pcm.apply(pose)

    # Save relaxed pose in new folder
    out_path = os.path.join(out_dir, f"{name}_relaxed.pdb")
    pose.dump_pdb(out_path)

print(f"✅ Selected PDBs processed. Results saved in {out_dir}")

alejandromontesa-unizar avatar Sep 16 '25 11:09 alejandromontesa-unizar

Hi @AzusaXuan, I am still working on it. I generated the cyclic binders with RFpeptides for my target, then used the ProteinMPNN scripts from the dl_binder_design GitHub repo to generate .pdb files of the cyclic peptides with the target (the binder carries the designed sequence). Then I installed AfCycDesign with ColabDesign to generate the cross-validation complexes. I have used some in-house scripts for large-scale prediction, but I don't know of a Python script version, sorry.

Hi bro @alejandromontesa-unizar, I’m following the same steps you took. I’ve already obtained the .pdb files of the complexes between the cyclic peptides and the target. My next goal is to use AfCycDesign for complex structure prediction and to filter some of the designs.

However, since AfCycDesign only provides a Jupyter notebook, I’m not sure how to set it up locally and use it for prediction. Could you give me some hints or guidance on how to do this?

Thanks a lot!

Nicole-DH avatar Oct 20 '25 08:10 Nicole-DH
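
On running the notebook locally: a Jupyter notebook's code cells are ordinary Python, so they can generally be exported to a script (for example with `jupyter nbconvert --to script`) and wrapped in a loop over designs. The part specific to cyclic peptides is the "cyclic offset" mentioned in the methods quoted earlier in this thread. As far as I understand it, the idea is to replace the linear relative-position matrix (i - j) with one measured the short way around the ring, so the first and last residues look like neighbours. The sketch below is my own NumPy illustration of that idea, not code taken from AfCycDesign, and how the matrix is handed to the model is not shown.

```python
import numpy as np

def cyclic_offset(length):
    """Signed relative-position matrix for a head-to-tail cyclic chain.

    Entry [i, j] is i - j measured along the shorter path around the ring.
    """
    i = np.arange(length)
    linear = i[:, None] - i[None, :]      # ordinary offset for a linear chain
    direct = np.abs(linear)               # steps along the chain
    around = length - direct              # steps through the N/C junction
    use_wrap = around < direct            # the junction path is shorter
    # Going through the junction reverses the direction, hence the sign flip
    return np.where(use_wrap, -np.sign(linear) * around, linear)
```

For a 4-residue ring, residue 0 sees residue 3 at offset +1 instead of -3, which is what tells a predictor that the termini are adjacent.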