ProLIF
ProLIF copied to clipboard
how can i get the ligand atom id which have a interaction with pritein atom?
thanks for your outstnding works ,but how can i find which atom pair between ligand and protein have a interaction? thanks!
Hi @CAODH,
The best for this is to do it with a dataframe.
Supposing you've already run the interaction calculation, you can then do df = fp.to_dataframe(return_atoms=True). The resulting dataframe will give you ligand and protein atom indices for the corresponding residues and for each interaction.
You can find some more in depth scripts using this in the documentation https://prolif.readthedocs.io/en/latest/notebooks/visualisation.html
Hope that helps, Cedric
Hi Cédric, me again.
Am I right in understanding that these index numbers relate to the prolif.Residue atom indexing? Do you have any suggestions for how to access the index of the interacting atom in the original .pdb or .mol2 file, where the atomic indices are different to those relative to the prolif.Residue indices?
Thanks, Noah
Indeed, those index correspond to the atom numbering in the residue, not in the input.
To retrieve the index in the actual file, you could try this (assuming you already have the dataframe with atom indices):
from prolif.interactions import get_mapindex
u = mda.Universe(...)
lig = u.select_atoms(...)
prot = u.select_atoms(...)
lmol = plf.Molecule.from_mda(lig)
pmol = plf.Molecule.from_mda(prot)
def convert_index(s):
lres, pres, _ = s.name
for i, (li, pi) in enumerate(s):
if li is None:
continue
li = get_mapindex(lmol[lres], li)
pi = get_mapindex(pmol[pres], pi)
s[i] = (lig[li].index, prot[pi].index)
return s
df_index = df.apply(convert_index)
Hi, I would like to pile on this issue, as I find the mismatch between the two indexing systems highly confusing when dealing with protein-protein interactions. i am looking at the following interaction, which I isolated from a larger p-p complex ('test.pdb'):
ATOM 1 N TYR A 187 91.367 73.171 56.258 1.00 0.00 N
ATOM 2 HN TYR A 187 91.352 72.616 57.086 1.00 0.00 H
ATOM 3 CA TYR A 187 90.893 74.512 56.423 1.00 0.00 C
ATOM 4 HA TYR A 187 91.009 75.065 55.502 1.00 0.00 H
ATOM 5 CB TYR A 187 89.431 74.530 56.936 1.00 0.00 C
ATOM 6 HB1 TYR A 187 89.355 74.033 57.926 1.00 0.00 H
ATOM 7 HB2 TYR A 187 89.053 75.570 57.033 1.00 0.00 H
ATOM 8 CG TYR A 187 88.546 73.791 55.976 1.00 0.00 C
ATOM 9 CD1 TYR A 187 88.119 74.416 54.797 1.00 0.00 C
ATOM 10 HD1 TYR A 187 88.444 75.424 54.581 1.00 0.00 H
ATOM 11 CE1 TYR A 187 87.276 73.742 53.903 1.00 0.00 C
ATOM 12 HE1 TYR A 187 86.938 74.231 53.001 1.00 0.00 H
ATOM 13 CZ TYR A 187 86.843 72.447 54.194 1.00 0.00 C
ATOM 14 OH TYR A 187 85.960 71.796 53.320 1.00 0.00 O
ATOM 15 HH TYR A 187 85.772 70.940 53.712 1.00 0.00 H
ATOM 16 CD2 TYR A 187 88.149 72.467 56.227 1.00 0.00 C
ATOM 17 HD2 TYR A 187 88.508 71.952 57.107 1.00 0.00 H
ATOM 18 CE2 TYR A 187 87.285 71.799 55.350 1.00 0.00 C
ATOM 19 HE2 TYR A 187 86.970 70.792 55.581 1.00 0.00 H
ATOM 20 C TYR A 187 91.769 75.136 57.482 1.00 0.00 C
ATOM 21 O TYR A 187 92.303 74.448 58.347 1.00 0.00 O
TER 22 TYR A 187
ATOM 23 N CYS F 2 85.446 76.051 56.283 1.00 0.00 N
ATOM 24 HN CYS F 2 85.896 76.803 56.757 1.00 0.00 H
ATOM 25 CA CYS F 2 85.108 74.901 57.117 1.00 0.00 C
ATOM 26 HA CYS F 2 85.533 74.018 56.662 1.00 0.00 H
ATOM 27 CB CYS F 2 85.780 75.087 58.513 1.00 0.00 C
ATOM 28 HB1 CYS F 2 86.820 74.708 58.425 1.00 0.00 H
ATOM 29 HB2 CYS F 2 85.883 76.181 58.679 1.00 0.00 H
ATOM 30 SG CYS F 2 84.971 74.366 60.001 1.00 0.00 S
ATOM 31 HG1 CYS F 2 86.068 74.221 60.731 1.00 0.00 H
ATOM 32 C CYS F 2 83.611 74.621 57.180 1.00 0.00 C
ATOM 33 O CYS F 2 83.184 73.482 57.034 1.00 0.00 O
END
my code is as follows :
mol = Chem.MolFromPDBFile('test.pdb', removeHs=False)
u = mda.Universe(mol)
Ligand = plf.Molecule.from_mda(u, "chainID F and resid 2")
Receptor = plf.Molecule.from_mda(u, "byres chainID A B C D E and around 5 (chainID F and resid 2)")
fp = plf.Fingerprint()
ifp = fp.generate(Ligand, Receptor, return_atoms=True)
ifp["Frame"] = 0
df = plf.to_dataframe([ifp], fp.interactions.keys(),return_atoms=True)
and prolif reports a single interaction between atoms 2 and 4 of protein 1 ("Ligand") and protein2 ("Receptor"), respectively :
ligand CYS2.F
protein TYR187.A
interaction Hydrophobic
Frame
0 (2, 4)
Using get_mapindex on these atoms leaves the indexes unchanged, meaning that the proposed interaction is between the CA of the CYS residue to the CB of the TYR residue. Looking at the structure, these residues are 4.34Å apart, quite far for a hydrophobic interaction. Indeed, this interaction is not picked up by UCSF-Chimera. However, the CG of the TYR is only 3.78Å from the CA of the CYS and does interact with it, according to the conventional cutoff, and is picked by Chimera. Can this difference between the software suits be an indexing issue? it would be nice if the get_mapindex function would have a flag to return the actual atom names from the PDB file (CA,CB, CG . . . etc) to reduce the confusion level further.
Hi @izhar1973,
Looking at the structure, these residues are 4.34Å apart, quite far for a hydrophobic interaction. Indeed, this interaction is not picked up by UCSF-Chimera
By default the Hydrophobic interaction in ProLIF goes up to 4.5Å, but you can modify the distance cutoff by adding this code snippet before creating the plf.Fingerprint object:
class Hydrophobic(plf.interactions.Hydrophobic):
def __init__(self):
super().__init__(distance=4.0)
However, the CG of the TYR is only 3.78Å from the CA of the CYS and does interact with it
ProLIF will only return the first atoms that satisfy the SMARTS patterns and geometrical constraints of a given interaction in a given pair of residue, not the "best" (shortest distance) one.
it would be nice if the get_mapindex function would have a flag to return the actual atom names from the PDB file (CA,CB, CG . . . etc)
I think you can do this with one of the following (not 100% sure the will work):
Ligand["CYS2.F"].GetAtomWithIdx(2).GetMonomerInfo().GetName()orLigand.GetAtomWithIdx(get_mapindex(Ligand["CYS2.F"], 2)).GetMonomerInfo().GetName()
but I agree that it could be useful to have!
Hope this answers your questions, Cédric