mdanalysis Guessing the psf format type for VMD written psf files is wrong

Expected behavior

When loading the psf file, which was written by vmd 1.9.2 (VMD for LINUXAMD64, version 1.9.2 (December 29, 2014)), the format is not guessed correctly. The format is simply space separated and the header of the file is:

PSF

       1 !NTITLE
 REMARKS VMD-generated NAMD/X-Plor PSF structure file

  104813 !NATOM

In my structure (which I can't share, but I will do my best to reproduce the behaviour with an open structure in the near future) I have over 20000 waters, and I want each to be in a separate residue.

Actual behavior

I end up with 9999 residues for waters, some of which have more than 1 water. If I manually change the header to

PSF NAMD

       1 !NTITLE
 REMARKS VMD-generated NAMD/X-Plor PSF structure file

  104813 !NATOM

I am getting the expected behaviour.

I think that the problem is that when format is decided the REMARKS block is not taken into account.

Code to reproduce the behavior

u = mda.Universe("test_psf.psf")
len(set(u.select_atoms("resname SPC").resids))

Apr 08 '25 14:04 pbuslaev

@pbuslaev can you

provide a link or any other information about the PSF format that NAMD expects; for instance, I don't think that the original XPLOR/CHARMM PSF had REMARKS but I could be wrong. If our implementation of the PSF parser is ignoring something that's in a file format spec then that's an MDAnalysis bug. If VMD is just adding REMARKS then they are making their own format and we have to decide if we can support it.
provide a minimal example file that triggers the problem – if it's just REMARK then a tiny file should be sufficient.

Jun 11 '25 19:06 orbeckst

Hi! I would like to work on issue. Is this open?

Aug 07 '25 10:08 aditi25srivastava

We have not decided what to do here. Can you provide the information in https://github.com/MDAnalysis/mdanalysis/issues/5019#issuecomment-2963908556 ?

Aug 07 '25 15:08 orbeckst

Hey, @orbeckst dug into this and here’s what I think is happening: The parser handles NUMK fine, but it doesn’t seem to check NUMKW at all. So if a file has both, it just uses NUMK and ignores NUMKW, which messes up the residual wave count.

My plan to fix it:

Reproduce the bug with a small test file containing both NUMK and NUMKW (different values).
Update the parser so that if NUMKW is present, it takes priority over NUMK.
Make sure old files without NUMKW still work the same.
Add a couple of tests to catch this in the future.
(If useful) Do a quick sweep of the parser for other header fields that might have similar “ignore if exists” problems. If that sounds right to you, I can start working on a PR with the fix and tests.

Aug 12 '25 11:08 aditi25srivastava

@pbuslaev without further information/examples I don't know what else to do here. I am going to close the issue but would be happy to re-open if you or anyone else can provide additional information.

Sep 30 '25 22:09 orbeckst

mdanalysis mdanalysis copied to clipboard

Guessing the psf format type for VMD written psf files is wrong

Expected behavior

Actual behavior

Code to reproduce the behavior

mdanalysis
mdanalysis copied to clipboard