Why treat water in protein as glycine

Open YanjingLiLi opened this issue 1 year ago • 2 comments

In your protein.py (https://github.com/DeepGraphLearning/torchdrug/blob/a959f68f0c19f368be9e380f5a587de6970b3c67/torchdrug/data/protein.py#L1053), when the residue type is unknown (e.g. HOH, water), the residue is treated as glycine. Does this affect the overall information about the protein? Why can't we simply ignore the water molecules here?

YanjingLiLi · Jun 13 '23 00:06

Hi, thanks for the question. Here we assume that only typical residues are considered, so all unknown residues are treated as glycine. Since we only consider the major atoms in proteins, and a water molecule isn't considered part of the protein, we effectively discard it by treating it as an unknown residue type.

Oxer11 · Jun 13 '23 12:06
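
For readers following along, the fallback described above can be pictured roughly as follows. This is a minimal sketch, not torchdrug's actual code: `RESIDUE2ID` and `residue_type_id` are hypothetical names, and the vocabulary is assumed to be the 20 standard amino acids.

```python
import warnings

# Hypothetical vocabulary: the 20 standard residues by three-letter code.
STANDARD_RESIDUES = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]
RESIDUE2ID = {name: index for index, name in enumerate(STANDARD_RESIDUES)}


def residue_type_id(name):
    """Map a residue name to an integer id, falling back to glycine."""
    name = name.strip().upper()
    if name not in RESIDUE2ID:
        # Unknown names (e.g. HOH for water) are not dropped here;
        # they receive the same id as glycine.
        warnings.warn(f"Unknown residue `{name}`, treating it as glycine")
        name = "GLY"
    return RESIDUE2ID[name]
```

Under this sketch a water record still produces a residue entry, only with the glycine type id, which is why removing such records needs a separate step.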

Is there a way to discard the unknown residues? Or should this be handled in preprocessing?

Plovor · Aug 18 '23 09:08
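
One possible preprocessing approach, sketched here as an assumption rather than as anything the maintainers recommend, is to filter the PDB text before loading it. The function name `strip_nonstandard_residues` is hypothetical. It relies only on the fixed-column PDB layout, where the record name occupies columns 1-6 and the residue name columns 18-20, and it treats "standard" as the 20 canonical amino acids.

```python
# The 20 canonical amino acids by three-letter code (an assumption about
# what should count as a "known" residue).
STANDARD_RESIDUES = frozenset({
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
})


def strip_nonstandard_residues(pdb_text):
    """Return PDB text without HETATM records or non-standard residues."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        # HETATM covers water (HOH), ligands, ions and modified residues.
        if record == "HETATM":
            continue
        # ATOM and ANISOU carry the residue name in columns 18-20.
        if record in ("ATOM", "ANISOU"):
            if line[17:20].strip() not in STANDARD_RESIDUES:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The filtered text can then be written to a new file and loaded in the usual way. Note that this also drops modified residues stored as HETATM (for example MSE), which leaves gaps in the chain, and it does not touch CONECT records that reference removed atoms, so whether it is appropriate depends on the dataset.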