Missing sidechain of target in output
Hi community, I am using RFdiffusion to design a binder for a target protein. After running the process, I've noticed that in the output PDB file, my target protein is only represented by its backbone (the sidechains are missing).
Is this the expected behavior?
What is the recommended workflow to visualize the final complete complex?
Thanks!
Yes, this is expected. RFdiffusion is used for backbone generation. It does not predict residue identities.
The classic workflow (as exemplified by the experimental examples in the original RFdiffusion paper, and others since) is to take the backbone generated by RFdiffusion and pass it through ProteinMPNN (or LigandMPNN, etc.) to predict the residue identities, i.e. the sequence. You can then take that sequence and run it through a structure prediction program (like AlphaFold or Boltz) to see what the sequence is predicted to fold to. Compare that prediction with the RFdiffusion-generated structure, and apply other quality filters (predicted binding energy, shape complementarity, hydrogen bonding, etc.) to select which designs you want to take forward experimentally.
Thank you so much for your response! However, what I want to focus on here is the sidechains of the target protein, not the designed binder. When I open the output file in PyMOL or Chimera, I can no longer display the target's sidechains the way I can when I open the original PDB of my target. It seems that RFdiffusion removes them from its output. The same thing happens with the MPNN output. My designed binder itself looks fine.
Ah yes, sorry. I believe that is the intended behavior as well. The internals of RFdiffusion only work on backbone coordinates, so only the backbone (and amino acid identity) of your target input is read in. The sidechain positions are ignored. So when RFdiffusion writes out the structure of the complex, it's only writing out the backbone coordinates, even for residues which had sidechains in the input.
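If you want to confirm that a given output file really is backbone-only for a chain (rather than a display setting in the viewer), a quick check is to count the non-backbone heavy atoms per chain. The following is just a minimal sketch that parses the fixed-width ATOM records of a PDB file directly; the function name sidechain_atom_counts is made up for illustration and is not part of RFdiffusion or any MPNN tool.

```python
from collections import defaultdict

# Atom names treated as backbone (OXT is the C-terminal oxygen).
BACKBONE_ATOMS = {"N", "CA", "C", "O", "OXT"}


def sidechain_atom_counts(pdb_text):
    """Return {chain_id: (n_residues, n_sidechain_heavy_atoms)}.

    Hypothetical helper: reads only ATOM records and relies on the
    standard fixed-width PDB column layout.
    """
    residues = defaultdict(set)
    sidechain = defaultdict(int)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        chain_id = line[21]               # column 22
        res_key = line[22:27]             # residue number + insertion code
        element = line[76:78].strip()     # columns 77-78, may be blank
        residues[chain_id].add(res_key)
        if atom_name in BACKBONE_ATOMS:
            continue
        # Skip hydrogens so that backbone H/HA are not counted as sidechain.
        if element == "H" or (
            not element and atom_name.lstrip("0123456789").startswith("H")
        ):
            continue
        sidechain[chain_id] += 1
    return {chain: (len(residues[chain]), sidechain[chain]) for chain in residues}
```

To use it, read the file yourself, e.g. sidechain_atom_counts(open("design_0.pdb").read()), where design_0.pdb stands in for whatever your output file is called. A chain that reports zero sidechain heavy atoms was written backbone-only (glycine has no sidechain heavy atoms, so an all-glycine chain also reports zero).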
So after finishing the whole process (backbone generation -> sequence generation), how can we analyze and visualize the interaction between the binder and the target?
You'd generally take the sequences, put them through a structure prediction program (AlphaFold/Boltz/Chai)(*) and then run the analysis on the predicted structures.
Ideally that would work. You may get issues with multimer prediction tools and the lack of cross-chain MSA correlations leading to poor prediction of the inter-chain orientations. Ideally the designs will be good enough to work even without that, but if you're struggling, you can always predict things as monomers, then manually overlay them in the designed configuration. You could potentially also use a classic protein docking program to sample the chain orientation.
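For the "predict as monomers, then overlay" route, the overlay itself is a rigid-body superposition of each predicted monomer onto the corresponding chain of the designed complex. Here is a minimal sketch of that step using the Kabsch algorithm with NumPy. It assumes you have already extracted matching atoms (say, the CA atoms in the same residue order) as N x 3 coordinate arrays; the function names are made up for illustration.

```python
import numpy as np


def kabsch_transform(mobile, reference):
    """Return (R, t) such that mobile @ R.T + t best fits reference (least squares).

    Both inputs are N x 3 arrays of matched atoms in the same order.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    # Covariance between the centered coordinate sets.
    H = (mobile - mob_center).T @ (reference - ref_center)
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ref_center - R @ mob_center
    return R, t


def apply_transform(coords, R, t):
    """Apply the rotation and translation to any N x 3 coordinate array."""
    return np.asarray(coords, dtype=float) @ R.T + t


def rmsd(a, b):
    """Root-mean-square deviation between two matched N x 3 arrays."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

The idea would be to fit on the CA atoms only (predicted monomer as mobile, the same chain in the RFdiffusion complex as reference), then call apply_transform on all atoms of the predicted monomer, sidechains included. Doing that for both binder and target gives an all-atom complex in the designed orientation, and the CA RMSD after fitting tells you how closely each prediction matches the design.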
I'll also mention that LigandMPNN has a sidechain prediction functionality, so if you wanted to skip the re-prediction, you could always use LigandMPNN instead of ProteinMPNN (it should work even on protein-only structures), and use the sidechain prediction to generate an all-atom structure.
*) I generally omit RoseTTAFold from this recommendation, simply because RFdiffusion is based on RoseTTAFold. Using an orthogonal structure prediction program helps minimize the likelihood you're in some anomalous minimum of the RoseTTAFold landscape. It's probably not 100% necessary if you prefer using RoseTTAFold for structure prediction.
Thank you so much!
Hmmm, I did use LigandMPNN for the sequence generation step in my case. As usual, I gave the RFdiffusion output file to LigandMPNN and redesigned the sequence only for chain A, the designed binder (chain B, the target protein, is fixed), and I also used LigandMPNN's sidechain packing feature. But I still cannot visualize the sidechains of the target protein in the LigandMPNN output. Do you have any recommendation for me?