EXTXYZ format support

Open hmacdope opened this issue 3 years ago • 7 comments

Is your feature request related to a problem?

The EXTXYZ format is a slight extension of the XYZ format that allows atom-specific attributes and a unit cell to be specified.

As far as I can tell the most formal definition of the EXTXYZ format is by ASE in this document.

There was a question about reading this format on the mailing list.

Describe the solution you'd like

Supporting this format would not be a difficult extension of the XYZ reader / writer. It could also possibly be achieved through interoperability with ASE itself (see #3827).

Additional context

Chemfiles supports reading the unit cell from XYZ files, but reading such a file into ASE appeared to be a problem in the mailing list post referenced above (tagging @Luthaf for your info :) ). I haven't replicated it myself.

There are also C and Julia implementations of an EXTXYZ reader, so it may be possible to wrap the C code in Cython if we want to. Initially I would be inclined to stick with a native Python implementation.
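To make the native Python option concrete, here is a minimal sketch of reading one EXTXYZ frame with only the standard library. It assumes the commonly used layout (atom count, then a comment line of `key=value` pairs with a quoted `Lattice` of nine numbers and a `Properties` descriptor such as `species:S:1:pos:R:3`, then one line per atom). The function names (`parse_comment`, `parse_properties`, `read_frame`) are hypothetical and not part of MDAnalysis or ASE, and the sketch ignores the harder corners of the format (escaped quotes, array-valued keys, alternative bracket syntax).

```python
import shlex

# Per-column converters for the type codes used in the Properties key:
# S = string, R = real, I = integer, L = logical.
_CASTS = {
    "S": str,
    "R": float,
    "I": int,
    "L": lambda s: s.upper() in ("T", "TRUE"),
}


def parse_comment(line):
    """Split the comment line into a dict of raw key=value strings."""
    info = {}
    # shlex keeps quoted values such as Lattice="..." together as one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            info[key] = value
    return info


def parse_properties(spec):
    """Turn 'species:S:1:pos:R:3' into [('species', 'S', 1), ('pos', 'R', 3)]."""
    fields = spec.split(":")
    return [
        (fields[i], fields[i + 1], int(fields[i + 2]))
        for i in range(0, len(fields), 3)
    ]


def read_frame(text):
    """Read a single frame; returns the raw info, the cell vectors and the atoms."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    info = parse_comment(lines[1])

    lattice = None
    if "Lattice" in info:
        v = [float(x) for x in info["Lattice"].split()]
        lattice = [v[0:3], v[3:6], v[6:9]]  # three cell vectors

    # Assumption: fall back to plain XYZ columns when Properties is absent.
    props = parse_properties(info.get("Properties", "species:S:1:pos:R:3"))

    atoms = []
    for line in lines[2:2 + n_atoms]:
        cols = line.split()
        row, start = {}, 0
        for name, kind, width in props:
            values = [_CASTS[kind](c) for c in cols[start:start + width]]
            row[name] = values[0] if width == 1 else values
            start += width
        atoms.append(row)
    return {"info": info, "lattice": lattice, "atoms": atoms}


SAMPLE = """\
2
Lattice="5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44" Properties=species:S:1:pos:R:3 Time=0.0
Si 0.00 0.00 0.00
Si 1.36 1.36 1.36
"""

frame = read_frame(SAMPLE)
print(frame["lattice"])
print(frame["atoms"][1])
```

A real reader would still need to convert the cell vectors into the box representation MDAnalysis uses and decide how to expose the extra per-atom columns, which is where most of the design work would be.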

hmacdope, Sep 14 '22 08:09

tagging @Luthaf for your info

Thanks for the heads up!

Unfortunately, Extended XYZ is not a very well-specified format. The repo you linked (https://github.com/libAtoms/extxyz) is trying to write a specification for it, and I tried to convert this specification into a formal grammar here. There is a way to take this formal grammar and create a pure Python parser, at the cost of adding lark as a dependency. The same formal grammar + parser would ideally also be added to ASE, which requires all code to be pure Python.

Luthaf, Sep 14 '22 09:09

Is anyone working on this issue? If not, would it be possible for me to pick it up?

An additional question: would the @coredevs be interested in having this written with pyo3? I recently did that for my own little project, and it worked surprisingly well for me. AFAIK, pyo3 takes care of building wheels for all platforms, and the code doesn't have any Python dependencies afterwards, which is kind of nice. This issue feels non-critical enough to be a safe place to try out pyo3, hence the suggestion.

marinegor, Feb 07 '24 17:02

@marinegor, as far as I'm aware no one is working on such a reader right now (someone please correct me). I am meant to deliver an ASE converter as part of EOSS4 (probably during my upcoming holidays), but that shouldn't affect this issue afaik.

Regarding a Rust-based reader, personally I really wouldn't want to have this in the core library. If it's an MDAKit I don't mind, but adding pyo3 on top of the Cython stuff we have is just a recipe for a super complicated deployment. Our package is already a headache as it is (edit: we're talking multi-day deployment and packaging fixes already).

IAlibay, Feb 07 '24 18:02