mdanalysis
mdanalysis copied to clipboard
Guessing the psf format type for VMD written psf files is wrong
Expected behavior
When loading the psf file, which was written by vmd 1.9.2 (VMD for LINUXAMD64, version 1.9.2 (December 29, 2014)), the format is not guessed correctly. The format is simply space separated and the header of the file is:
PSF
1 !NTITLE
REMARKS VMD-generated NAMD/X-Plor PSF structure file
104813 !NATOM
In my structure (which I can't share, but I will do my best to reproduce the behaviour with an open structure in the near future) I have over 20000 waters, and I want each to be in a separate residue.
Actual behavior
I end up with 9999 residues for waters, some of which have more than 1 water. If I manually change the header to
PSF NAMD
1 !NTITLE
REMARKS VMD-generated NAMD/X-Plor PSF structure file
104813 !NATOM
I am getting the expected behaviour.
I think that the problem is that when format is decided the REMARKS block is not taken into account.
Code to reproduce the behavior
u = mda.Universe("test_psf.psf")
len(set(u.select_atoms("resname SPC").resids))
@pbuslaev can you
- provide a link or any other information about the PSF format that NAMD expects; for instance, I don't think that the original XPLOR/CHARMM PSF had REMARKS but I could be wrong. If our implementation of the PSF parser is ignoring something that's in a file format spec then that's an MDAnalysis bug. If VMD is just adding REMARKS then they are making their own format and we have to decide if we can support it.
- provide a minimal example file that triggers the problem – if it's just REMARK then a tiny file should be sufficient.
Hi! I would like to work on issue. Is this open?
We have not decided what to do here. Can you provide the information in https://github.com/MDAnalysis/mdanalysis/issues/5019#issuecomment-2963908556 ?
Hey, @orbeckst dug into this and here’s what I think is happening: The parser handles NUMK fine, but it doesn’t seem to check NUMKW at all. So if a file has both, it just uses NUMK and ignores NUMKW, which messes up the residual wave count.
My plan to fix it:
- Reproduce the bug with a small test file containing both NUMK and NUMKW (different values).
- Update the parser so that if NUMKW is present, it takes priority over NUMK.
- Make sure old files without NUMKW still work the same.
- Add a couple of tests to catch this in the future.
- (If useful) Do a quick sweep of the parser for other header fields that might have similar “ignore if exists” problems. If that sounds right to you, I can start working on a PR with the fix and tests.
@pbuslaev without further information/examples I don't know what else to do here. I am going to close the issue but would be happy to re-open if you or anyone else can provide additional information.