Fails to read Magres files with >100 atoms
When loading a Magres file with >100 atoms, the ms lines end up looking like this:
ms H100 2.0431056083497484E+01 -5.5892879764485039E+00 1.2418129913251554E+00 -4.0897852040395684E+00 2.9708843218858465E+01 -3.8875121992744055E+00 -1.1586267278454099E+00 5.0678781904684089E-01 2.4711193494436703E+01
i.e. the H and the 100 end up in the same token. The following code (lines 150-155 in formats/magres.js) splits each line of the block on whitespace:
    for (let i = 0; i < block.length; ++i) {
        let l = block[i];
        let lspl = _.trim(l).split(/\s+/);
        // Is it a 'units' line?
        if (lspl[0] == 'units') {
            let tag = lspl[1];
but for atoms where the atom label and number have been concatenated, the split line is one token short: the tensor will only have 8 members where a 3x3 tensor needs 9, and so parseOneAtomLine throws an "Input matrix is not symmetric" error.
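To make the token count concrete, here is a minimal sketch of the same split applied to two made-up ms lines. The lines and the `splitRecord` helper are hypothetical, and reading the tensor from the fourth token onward is an assumption about what parseOneAtomLine does, not something the excerpt above confirms.

```javascript
// Hypothetical ms lines: one well-formed, one with label and index fused.
const wellFormed = "ms H 99 1.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 3.0";
const fused = "ms H100 1.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 3.0";

// Same split as in the excerpt, using String.prototype.trim instead of _.trim.
function splitRecord(line) {
  return line.trim().split(/\s+/);
}

const goodTokens = splitRecord(wellFormed);
const fusedTokens = splitRecord(fused);

console.log(goodTokens.length);  // 12: tag, label, index, 9 tensor components
console.log(fusedTokens.length); // 11: tag, fused label+index, 9 tensor components

// ASSUMPTION: the reader takes everything after tag, label and index as the tensor.
console.log(goodTokens.slice(3).length);  // 9
console.log(fusedTokens.slice(3).length); // 8, one short of a 3x3 tensor
```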
Is the issue here that the H symbol and the 100 number have no space between them? I am sorry, but that is a problem with the writer, not the reader, and I can't do much about it. The symbol in the magres block is a label, not a chemical element: it can be anything (the labels are defined in the atom lines earlier in the file). That means that, for example, H123 couldn't be parsed if I didn't split on whitespace, because nothing would distinguish H 123 from H1 23. I can imagine some hackish solution, of course, but I think this really goes beyond what it is reasonable to expect a reader to guess, and it would be liable to produce unpredictable output for other inputs. What writer produced this file?
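The ambiguity can be shown with a short sketch. The `candidateSplits` function and its `knownLabels` argument are hypothetical and not part of the reader; the sketch also assumes atom indices are positive integers written without leading zeros.

```javascript
// Enumerate every way a fused token could be split into (label, index),
// given the labels declared in the atom lines (hypothetical input).
function candidateSplits(token, knownLabels) {
  const candidates = [];
  for (let cut = 1; cut < token.length; ++cut) {
    const label = token.slice(0, cut);
    const rest = token.slice(cut);
    // ASSUMPTION: the index is a positive integer with no leading zero.
    if (knownLabels.includes(label) && /^[1-9]\d*$/.test(rest)) {
      candidates.push({ label, index: Number(rest) });
    }
  }
  return candidates;
}

// With both H and H1 declared as labels, the token cannot be resolved:
console.log(candidateSplits("H123", ["H", "H1"]));
// two candidates: H with index 123, and H1 with index 23

// With only plain element labels there happens to be a single reading:
console.log(candidateSplits("H100", ["H", "C"]));
// one candidate: H with index 100
```

A guess like this only works when the label set happens to be unambiguous, which is why it is not something a reader can rely on in general.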
It was an output Magres file from CASTEP 20.11. I guess that somewhere they're treating the format as defined by character spacing rather than by whitespace, as the Magres spec requires? It only occurs in the ms/efg blocks, not in the atom block.
Sounds like a typical Fortran format string issue... yes, I would say this needs addressing. I can bring it up in CASTEP. I don't think it's format compliant to have no separating space. See the original format specification:
The file is essentially a series of blocks consisting of rows containing whitespace-delimited records, with datatype distinguished by tags in the first column.
It's very clear that whitespace is the delimiter. A writer can't rely on character counts alone.
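As an illustration of how a fixed-width field can swallow the separator, here is a minimal sketch. It is not CASTEP's actual format string, which is not shown in this thread; the 3-character index field and both function names are assumptions chosen to reproduce the symptom.

```javascript
// HYPOTHETICAL fixed-width writer: the index is right-aligned in a 3-character
// field placed directly after the label, with no explicit separator.
function fixedWidthRecord(label, index) {
  return "ms " + label + String(index).padStart(3, " ");
}

// Whitespace-delimited writer: an explicit space always separates the fields.
function delimitedRecord(label, index) {
  return ["ms", label, String(index)].join(" ");
}

console.log(fixedWidthRecord("H", 99));  // "ms H 99"  (padding acts as a separator)
console.log(fixedWidthRecord("H", 100)); // "ms H100"  (field is full, separator gone)
console.log(delimitedRecord("H", 100));  // "ms H 100"
```

The padding only looks like a delimiter while the number is narrower than its field, so the output is valid up to index 99 and breaks from 100 onward.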