torchdrug
Why treat water in protein as glycine
In your protein.py (https://github.com/DeepGraphLearning/torchdrug/blob/a959f68f0c19f368be9e380f5a587de6970b3c67/torchdrug/data/protein.py#L1053), when the residue type is unknown (e.g. HOH, water), the residue is treated as glycine. Will this affect the overall information of the protein? Why can't we ignore the water molecule here?
Hi, thanks for the question. Here we simply assume that only typical residues are considered, and all unknown residues are treated as glycine. Since we only consider the major atoms in proteins and a water molecule isn't part of the protein, we just discard it by treating it as an unknown residue type.
Is there a way to discard the unknown residues? Or should this be fixed in pre-processing?
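For the pre-processing route, a minimal sketch of one possible approach is to filter the PDB text before it is loaded. This is not part of torchdrug: the function name `strip_unknown_residues` and the set `STANDARD_RESIDUES` are hypothetical, and the code assumes the standard fixed-column PDB layout (record name in columns 1-6, residue name in columns 18-20).

```python
# Hypothetical helper, not a torchdrug API: drop atom records whose residue
# is not one of the 20 standard amino acids (e.g. HOH water) from PDB text.

# Three-letter PDB codes of the 20 standard amino acids.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

# Record types that describe a single atom and carry a residue name.
ATOM_RECORDS = ("ATOM", "HETATM", "ANISOU")


def strip_unknown_residues(pdb_text):
    """Return PDB text without atom records of non-standard residues."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()          # columns 1-6: record name
        if record in ATOM_RECORDS:
            res_name = line[17:20].strip()  # columns 18-20: residue name
            if res_name not in STANDARD_RESIDUES:
                continue                    # skip water, ligands, ions, ...
        kept.append(line)                   # all other records pass through
    return "\n".join(kept) + "\n"


sample = "\n".join([
    "ATOM      1  N   GLY A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "HETATM    2  O   HOH A 101      12.000   7.000  -5.000  1.00  0.00           O",
    "END",
])
print(strip_unknown_residues(sample))
```

The filtered text can then be written to a new file and loaded as usual. Note that this sketch does not renumber atom serials or clean up CONECT records that point at removed atoms, and it also drops modified residues such as MSE, which may or may not be what you want.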