[BUG] Sire selection allows out-of-bounds indexing of molecules
Not sure if this is the intended behaviour or not, but it appears that out-of-bounds indexing using MolIdx in a Sire selection always gives the last molecule, e.g.:
In [1]: import sire as sr
In [2]: mols = sr.load_test_files("ala.crd", "ala.top")
In [3]: mols.num_molecules()
Out[3]: 631
In [4]: mols["molidx 10000"]
Out[4]: Molecule( WAT:621 num_atoms=3 num_residues=1 )
In [5]: mols[-1]
Out[5]: Molecule( WAT:621 num_atoms=3 num_residues=1 )
This isn't true for other indexing, e.g. atoms and residues:
In [6]: mols.num_residues()
Out[6]: 633
In [7]: mols["residx 10000"]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1 │
│ │
│ /home/lester/.conda/envs/openbiosim/lib/python3.12/site-packages/sire/system/_system.py:75 in │
│ __getitem__ │
│ │
│ 72 │ │ return self.__str__() │
│ 73 │ │
│ 74 │ def __getitem__(self, key): │
│ ❱ 75 │ │ return self.molecules()[key] │
│ 76 │ │
│ 77 │ def __iadd__(self, molecules): │
│ 78 │ │ self.add(molecules) │
│ │
│ /home/lester/.conda/envs/openbiosim/lib/python3.12/site-packages/sire/mol/__init__.py:505 in │
│ __fixed__getitem__ │
│ │
│ 502 │ │ │ # try to search for the object - this will raise │
│ 503 │ │ │ # a SyntaxError if this is not a search term │
│ 504 │ │ │ # (and is instead a name) │
│ ❱ 505 │ │ │ return __from_select_result(obj.search(key, map=map)) │
│ 506 │ │ except SyntaxError: │
│ 507 │ │ │ # ignore SyntaxErrors as this is a name │
│ 508 │ │ │ pass │
│ │
│ /home/lester/.conda/envs/openbiosim/lib/python3.12/site-packages/sire/mol/__init__.py:364 in │
│ __from_select_result │
│ │
│ 361 │ │ ) │
│ 362 │ │
│ 363 │ if obj.list_count() == 0: │
│ ❱ 364 │ │ raise KeyError("Nothing matched the search.") │
│ 365 │ │
│ 366 │ typ = obj.get_common_type() │
│ 367 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'Nothing matched the search.'
This is surprising behavior, but looks intentional from how I wrote it...
In idengine.cpp there is;
static int map(int idx, int n)
{
if (idx >= 0)
return qMin(idx, n - 1);
else
return n - qMin(abs(idx), n);
}
which is mapping the searched idx against the total number of items in the container (n) and returning this mapped to the range [-n, n-1]. This is the source of the surprising behavior. I need to think about why I wrote it this way... And why not for the other index types.
There is difference between an internal index (like AtomIdx and ResIdx) which is fixed for a molecule during a search, and a "container" index (like MolIdx) which can have different meanings depending on the the container - which itself can depend on the search. I have a vague recollection that this was to handle an edge case, but would need to double-check to be sure.
No problem, I assumed that it might be something subtle depending on the container. This just tripped me up since I had been using try / except to handle invalid searches, but this was always returning a result so broke the logic.