Rarely used building blocks arousing suspicion
While looking at #2924, I originally suspected pstr_tail_idx as the cause. And in fact the function still seems suspicious to me even though #2925 apparently addresses the problem. In particular, what about this branch:
https://github.com/mthom/scryer-prolog/blob/af99c8c5cefc31f4df1b094bbb19e6c5f9a8e03e/src/machine/heap.rs#L548
When is the tail 2 cells away from the start? Does this case even arise in a place where the function is used?
In fact pstr_tail_idx is only used 4 times in the entire code base. The argument normally has the form pstr_loc + byte_offset, except in one case where it has the form pstr_loc, and that specific case appears in connection with #2924.
Another such function is read_s:
https://github.com/mthom/scryer-prolog/blob/af99c8c5cefc31f4df1b094bbb19e6c5f9a8e03e/src/machine/machine_state_impl.rs#L345
Again, and I assume by coincidence, only used 4 times in the entire code base, in specific branches of instruction dispatch. These branches appear at least more frequently used than pstr_tail_idx, making the function more likely to be correct.
I wonder whether there are better building blocks that we have not yet found, building blocks that are simpler and more generally useful than such isolated functions which, despite sounding so important, are in fact apparently barely necessary at all.
When is the tail 2 cells away from the start?
I think I at least found this out: The tail is 2 cells away if the string ends exactly at a cell boundary:
+----------+
| aaaaaaaa |
+----------+
| 0........|
+----------+
| []/0 +
+----------+
except in one case where it has the form
pstr_loc, and that specific case appears in connection with https://github.com/mthom/scryer-prolog/issues/2924.
And that specific case was found in systematic tests by @haijinSk to also lead to a crash (https://github.com/mthom/scryer-prolog/issues/2924#issuecomment-2844231241), addressed in #2934.
What other rare conditions lurk in this code? Can we somehow reduce the number of obscure places that can lead to a crash? Or test everything more systematically, to ensure that as many different situations as possible are tested?