
Mapping multimer protein components

Open ijpulidos opened this issue 6 months ago • 2 comments

Describe the bug

When mapping gufe protein components built from multimeric PDBs, the mapping covers only part of the multimer; apparently only one of the monomers is mapped. I would expect kartograf to map the components correctly, or to complain if it cannot.

To Reproduce

from kartograf import KartografAtomMapper
from gufe import ProteinComponent

# Create components from PDB Files
protein_comp = ProteinComponent.from_pdb_file("input.pdb")
mutated_comp = ProteinComponent.from_pdb_file("mutated.pdb")

mapper = KartografAtomMapper(atom_map_hydrogens=True)
mapping = next(mapper.suggest_mappings(protein_comp, mutated_comp))
print(len(mapping.componentA_to_componentB))

It seems to map only the chain "B" for some reason.
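One way to confirm which chains are covered is to count the mapped atoms per chain. The following is only a minimal sketch: it assumes a per-atom list of chain IDs for component A is already available (how to extract it from the component is not shown here), and the helper name `mapped_atoms_per_chain` is hypothetical. The toy data stands in for the real `mapping.componentA_to_componentB` dictionary.

```python
from collections import Counter
from typing import Dict, Mapping, Sequence, Tuple


def mapped_atoms_per_chain(
    a_to_b: Mapping[int, int], atom_chains: Sequence[str]
) -> Dict[str, Tuple[int, int]]:
    """Return {chain_id: (mapped_atoms, total_atoms)} for component A.

    a_to_b      : atom index in A -> atom index in B (e.g. componentA_to_componentB)
    atom_chains : chain ID of every atom in A, indexed by atom index (assumed input)
    """
    totals = Counter(atom_chains)
    mapped = Counter(atom_chains[i] for i in a_to_b)
    return {chain: (mapped[chain], total) for chain, total in totals.items()}


# Toy stand-in: six atoms, three per chain, and only chain "B" mapped.
atom_chains = ["A", "A", "A", "B", "B", "B"]
toy_mapping = {3: 3, 4: 4, 5: 5}

coverage = mapped_atoms_per_chain(toy_mapping, atom_chains)
for chain, (n_mapped, n_total) in sorted(coverage.items()):
    print(f"chain {chain}: {n_mapped}/{n_total} atoms mapped")
```

A chain reporting zero mapped atoms would match the behavior described above.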

Expected behavior

I expect the length of the mapping to equal the number of atoms in the protein components minus the mutated ones, which should be only a few atoms.

Screenshots

image

Additional context

This would enable handling protein mutations in a more streamlined way. As far as I can tell, the way to do it right now would be to separate each monomer (each chain in the PDBs) into its own component and then map those independently, but that can be cumbersome for users.
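For reference, the per-chain workaround could start from something like the sketch below. It is an illustration rather than a kartograf or gufe feature: `split_pdb_by_chain` is a hypothetical helper that relies only on the fixed-column PDB format (the chain ID is in column 22 of ATOM, HETATM and TER records), and it drops other records such as CONECT. Each resulting string would then be written to its own file and loaded with `ProteinComponent.from_pdb_file`, as in the reproduction script.

```python
from collections import defaultdict
from typing import Dict


def split_pdb_by_chain(pdb_text: str) -> Dict[str, str]:
    """Group coordinate records by chain ID and return one PDB string per chain."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain ID in ATOM/HETATM/TER records.
        if line.startswith(("ATOM", "HETATM", "TER")) and len(line) > 21:
            chains[line[21]].append(line)
    return {
        chain_id: "\n".join(lines + ["END"]) + "\n"
        for chain_id, lines in chains.items()
    }


# Toy two-chain input; a real run would read the text of "input.pdb" instead.
toy_pdb = "\n".join(
    [
        "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
        "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C",
        "ATOM      3  N   GLY B   1      21.104  16.134  -6.504  1.00  0.00           N",
    ]
)

per_chain = split_pdb_by_chain(toy_pdb)
for chain_id, text in sorted(per_chain.items()):
    n_atoms = sum(1 for l in text.splitlines() if l.startswith("ATOM"))
    print(f"chain {chain_id}: {n_atoms} ATOM records")
```

The mapper would then have to be run once per chain pair and the results combined by hand, which is the cumbersome part.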

PDB files to test in the following zip archive: Archive.zip

ijpulidos, Aug 23 '24 17:08