kartograf
Mapping multimer protein components
Describe the bug
When mapping gufe protein components that were built from multimeric PDB files, I observe that the mapping covers only part of the multimer; apparently only one of the monomers is mapped. I would expect kartograf to either map the components correctly or raise an error if it cannot.
To Reproduce
from kartograf import KartografAtomMapper
from gufe import ProteinComponent
# Create components from PDB files
protein_comp = ProteinComponent.from_pdb_file("input.pdb")
mutated_comp = ProteinComponent.from_pdb_file("mutated.pdb")
# Map all atoms, including hydrogens
mapper = KartografAtomMapper(atom_map_hydrogens=True)
# Take the first suggested mapping
mapping = next(mapper.suggest_mappings(protein_comp, mutated_comp))
# Number of mapped atoms; this is far smaller than expected for a multimer
print(len(mapping.componentA_to_componentB))
It seems to map only chain "B" for some reason.
Expected behavior
I expect the length of the mapping to equal the number of atoms in the protein component minus the mutated atoms, which should be only a few.
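One way to check which part of the multimer was mapped is to compare the mapping length against per-chain atom counts taken directly from the PDB file. The sketch below is only a diagnostic aid: `count_atoms_by_chain` is a hypothetical helper, not part of kartograf or gufe, and it assumes a standard fixed-column PDB file where the chain ID sits in column 22.

```python
from collections import Counter


def count_atoms_by_chain(pdb_text: str) -> Counter:
    """Count ATOM/HETATM records per chain ID (hypothetical helper).

    Assumes fixed-column PDB formatting, with the chain ID in column 22
    (string index 21).
    """
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            counts[line[21]] += 1
    return counts
```

If the mapping length is close to the count for a single chain rather than to the total, that supports the observation that only one monomer is being mapped. The counts are only comparable if the PDB file contains the same atoms (including hydrogens) as the loaded component.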
Screenshots
Additional context
This would enable handling protein mutations in a more streamlined way. As far as I can tell, the way to do it right now is to separate each monomer (each chain in the PDB files) into its own component and then map those independently, but that can be cumbersome for users.
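For reference, the per-chain workaround described above could start from something like the following sketch. `split_pdb_by_chain` is a hypothetical helper written for illustration, not a kartograf or gufe function. It assumes fixed-column PDB formatting with the chain ID in column 22, and it drops header, `CONECT`, and other non-coordinate records, which may or may not be acceptable for a given system.

```python
from collections import defaultdict


def split_pdb_by_chain(pdb_text: str) -> dict[str, str]:
    """Split PDB text into one PDB string per chain ID (hypothetical helper).

    Keeps only ATOM/HETATM/TER records that carry a chain ID
    (column 22, string index 21) and appends an END record to each chunk.
    """
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "TER")) and len(line) > 21:
            chains[line[21]].append(line)
    return {
        chain: "\n".join(lines) + "\nEND\n"
        for chain, lines in chains.items()
    }
```

Each resulting string could then be written to its own file, loaded with `ProteinComponent.from_pdb_file` as in the reproduction snippet, and mapped chain by chain against the matching chain of the mutated structure. Keeping track of which per-chain mapping belongs to which chain is left to the user, which is the cumbersome part this issue is about.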
PDB files for testing are in the following zip archive: Archive.zip