mdanalysis
mdanalysis copied to clipboard
Update secondary structure assignment in DSSP after an upstream fix
Is your feature request related to a problem?
The upstream library for secondary structure analysis, pyDSSP, has fixed a proline-related issue: https://github.com/ShintaroMinami/PyDSSP/issues/2. The MDAnalysis uses its code without DSSP being a dependency, so respective changes should be added to the MDAnalysis codebase.
Describe the solution you'd like
Figure out if the changes add value compared to existing PRO-related workarounds. If yes, find the meaningful code changes, introduce them to MDA codebase, and optionally remove PRO-related workarounds in existing code.
Describe alternatives you've considered
Not doing that seems ok too, I don't expect changes to be big.
Additional context
Figure out if the changes add value compared to existing PRO-related workarounds.
My suggestion would be to just re-vendor the files. Unless you suspect that the update would yield worse results, it's generally advisable to keep up with upstream. That way you avoid any case where someone might say "hey, our results don't match!".
I went through https://github.com/ShintaroMinami/PyDSSP/issues/2 and this is what I have understood, please let me know if it is right
Since proline has a secondary amine and has a rigid cyclic structure it doesn't form stable Hbonds. Due to this it cant form secondary structures like alpha helixes(H) and beta sheets(E). Originally PRO was getting assigned a secondary structure(H or E) but now the original implementation has fixed this?
So we have to update these files? As we dont want discrepancies with the original implementation
I created PR #5065 . It's not quite as simple as including vendored files because @marinegor rewrote the numpy code to avoid using einops. I gave it a try but would appreciate some 👀 on it from anyone who can spare a few 🧠 or 🤖 cycles.