mdanalysis
Support for numpy.dtypes.StringDType in AtomGroup containers?
In numpy 2.0+, the new dtype for strings is StringDType(). However, I don't believe MDAnalysis supports it at all?
Consider the following code:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD
import numpy as np
u = mda.Universe(PSF, DCD)
is_not_string = u.residues.resnames
Running print(is_not_string.dtype) yields object. Since the dtype is object, the new numpy.strings functions in numpy 2.0+ would not work on it, and the user would have to manually cast the ndarray from the object dtype to numpy.dtypes.StringDType to get them working. Wouldn't it be easier to have the instance automatically create the ndarray as StringDType for convenience?
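For reference, the manual cast I mean looks roughly like the sketch below. To keep it self-contained it uses a hand-made object array as a stand-in for u.residues.resnames (the residue names are made up for illustration), and the helper name to_string_dtype is hypothetical, not part of MDAnalysis or numpy. It checks for StringDType at runtime and falls back to a fixed-width unicode array on older numpy.

```python
import numpy as np

# Stand-in for u.residues.resnames: an object-dtype array of residue names.
# These values are illustrative only, not taken from the PSF/DCD test files.
resnames = np.array(["MET", "ARG", "ILE"], dtype=object)

# StringDType and numpy.strings only exist in numpy 2.0+, so detect them.
HAS_STRING_DTYPE = hasattr(np, "dtypes") and hasattr(np.dtypes, "StringDType")


def to_string_dtype(arr):
    """Hypothetical helper: cast to StringDType if this numpy provides it,
    otherwise fall back to a fixed-width unicode array."""
    if HAS_STRING_DTYPE:
        return arr.astype(np.dtypes.StringDType())
    return arr.astype(str)


converted = to_string_dtype(resnames)

if HAS_STRING_DTYPE:
    # numpy 2.0+: the numpy.strings ufuncs accept StringDType arrays.
    lowered = np.strings.lower(converted)
else:
    # numpy 1.x: use the legacy numpy.char routines instead.
    lowered = np.char.lower(converted)

print(resnames.dtype, "->", converted.dtype)
print(lowered)
```

The point is that every user currently has to write the astype step themselves before any numpy.strings call.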
Relevant numpy documentation: https://numpy.org/doc/stable/user/basics.strings.html https://numpy.org/doc/stable/reference/routines.strings.html#module-numpy.strings
At the moment we still support numpy ≥ 1.23.2
https://github.com/MDAnalysis/mdanalysis/blob/59e478db53ffb974fe94539bfc520c84a1946e72/package/pyproject.toml#L32
so we cannot use features only available in numpy 2.0+.
I am not sure when we will stop supporting numpy 1.x, possibly in 2 years (~end of 2026) according to SPEC 0, but when that happens, we can use the new types.