graphein
Fix incorrect node lookup in distance-based edge generation
Fixes #418
Previously, the node indices from the distance matrix were used to access rows in the full PDB DataFrame (G.graph["pdb_df"]), assuming their indices aligned. This caused incorrect residue pairings when the filtered DataFrame used to compute distances had a different row order or subset of residues.
This patch introduces an explicit mapping from filtered DataFrame indices back to the original node IDs, ensuring that edges are created between the correct residues in the correct chains.
This resolves issues where edges were created between spatially distant residues or between unrelated chains.
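The mismatch described above can be reproduced with a small toy example. This is only an illustrative sketch, not the code from this patch: the helper names `close_pairs` and `pairs_to_node_ids` are hypothetical, the toy coordinates are made up, and the column names (`node_id`, `x_coord`, `y_coord`, `z_coord`) are assumed to match the PDB DataFrame.

```python
import numpy as np
import pandas as pd

# Assumed coordinate column names for the PDB DataFrame.
COORD_COLS = ["x_coord", "y_coord", "z_coord"]


def close_pairs(df, threshold):
    """Return positional (row, col) pairs of the distance matrix within threshold."""
    coords = df[COORD_COLS].to_numpy(dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Upper triangle only, so each pair appears once and self-pairs are skipped.
    rows, cols = np.where(np.triu(dists <= threshold, k=1))
    return [(int(i), int(j)) for i, j in zip(rows, cols)]


def pairs_to_node_ids(filtered_df, pairs):
    """Map positions in the filtered DataFrame back to node IDs (hypothetical helper)."""
    pos_to_node = dict(enumerate(filtered_df["node_id"]))
    return [(pos_to_node[i], pos_to_node[j]) for i, j in pairs]


# Toy "full" DataFrame: residues 1 and 3 are close, residues 2 and 4 are far away.
full_df = pd.DataFrame({
    "node_id": ["A:MET:1", "A:GLY:2", "A:ALA:3", "A:SER:4"],
    "x_coord": [0.0, 50.0, 1.0, 51.0],
    "y_coord": [0.0, 0.0, 0.0, 0.0],
    "z_coord": [0.0, 0.0, 0.0, 0.0],
})

# Filtered DataFrame: a subset of rows in a different order.
filtered_df = full_df.iloc[[2, 0, 3]]

# Positions here refer to rows of filtered_df, not rows of full_df.
pairs = close_pairs(filtered_df, threshold=7.0)

# Buggy lookup: treats distance-matrix positions as rows of the full DataFrame.
buggy_edges = [
    (full_df["node_id"].iloc[i], full_df["node_id"].iloc[j]) for i, j in pairs
]

# Fixed lookup: maps positions through the filtered DataFrame's own node IDs.
fixed_edges = pairs_to_node_ids(filtered_df, pairs)

print("buggy:", buggy_edges)
print("fixed:", fixed_edges)
```

In this toy case the buggy lookup pairs two residues that are 50 Å apart, while the mapped lookup returns the pair that is actually within the threshold.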
Reference Issues/PRs
Fixes #418
What does this implement/fix? Explain your changes
Fixes a mismatch between the distance matrix indices and the full PDB DataFrame (G.graph["pdb_df"]) during edge creation. Ensures that spatial edges are added between correct residue pairs (same chain, correct distance) by mapping filtered indices back to their original node IDs.
What testing did you do to verify the changes in this PR?
I manually validated the fix using the Titin protein as a test case. For randomly selected residues (such as residue 0), I inspected their neighbors in ChimeraX and compared them against the neighbors returned by the updated code. All observed interactions were spatially coherent and within the specified threshold of 7 Å. I repeated this process for several residues and found no incorrect long-distance interactions, confirming that the fix produces physically valid edges.
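A manual spot check like this could be complemented by an automated one. The sketch below is an assumption about how such a check might look, not part of this PR's test suite: `edges_beyond_threshold` is a hypothetical helper, and it assumes that per-node coordinates and the edge list have already been extracted from the graph into plain Python containers.

```python
import math


def edges_beyond_threshold(coords, edges, threshold):
    """Return (u, v, distance) for every edge whose endpoints are farther apart than threshold."""
    violations = []
    for u, v in edges:
        d = math.dist(coords[u], coords[v])
        if d > threshold:
            violations.append((u, v, d))
    return violations


# Toy data: one plausible edge and one edge that should never have been created.
coords = {
    "A:MET:1": (0.0, 0.0, 0.0),
    "A:GLY:2": (3.8, 0.0, 0.0),
    "A:SER:4": (51.0, 0.0, 0.0),
}
edges = [("A:MET:1", "A:GLY:2"), ("A:MET:1", "A:SER:4")]

violations = edges_beyond_threshold(coords, edges, threshold=7.0)
for u, v, d in violations:
    print(f"{u} - {v}: {d:.1f} Å exceeds the threshold")
```

An empty result for a real graph would indicate that every distance-based edge respects the threshold, which is the property the manual inspection was checking residue by residue.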
Pull Request Checklist
- [x] Ran python -m pytest tests/ and made sure that all unit tests pass
- [x] Confirmed that the bug fix does not affect unrelated modules
- [x] Verified that no incorrect edges are created across chains
Quality Gate passed
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code