scryer-prolog Benefit more from Rust to eliminate mistakes in the source code

An immediate example:

https://github.com/mthom/scryer-prolog/pull/2925#issuecomment-2840262154

If you have any more ideas, please share them in this issue. Thank you a lot!

Apr 29 '25 21:04 triska

An idea would be to create distinct types for heap and cell indexes, which are currently both usize even though they are not interchangeable. This has already caused bugs in rebis-dev. In fact, I'd argue that almost any kind of semantic index (code index, stack index, etc...) should be a distinct type. Distinct types would make them incompatible, and therefore the type checker would catch any confusion between them. The way to do that in Rust would be something like this:

// I don't think "heap index" is the best name for this.
// This indexes in the heap aligned to bytes.
struct HeapIndex(usize);

// This indexes in the heap aligned to cells (currently 8 bytes).
struct CellIndex(usize);

And then you would encapsulate all the interfaces that would actually need to know the underlying usize. In this case, we could implement Index for Heap with both of these types.

impl Index<HeapIndex> for Heap {
    type Output = HeapCellValue;

    fn index(&self, index: HeapIndex) -> &Self::Output {
       todo!() 
    }
}

impl Index<CellIndex> for Heap {
    type Output = u8;

    fn index(&self, index: CellIndex) -> &Self::Output {
       todo!() 
    }
}

In this specific case, the indexes can be converted into each other, so we could also implement (Try)From:

impl From<CellIndex> for HeapIndex {
    fn from(value: CellIndex) -> Self {
        HeapIndex(value.0 * size_of::<HeapCellValue>())
    }
}

impl TryFrom<HeapIndex> for CellIndex {
    type Error = ();

    fn try_from(value: HeapIndex) -> Result<Self, Self::Error> {
        if value.0 % size_of::<HeapCellValue>() == 0 {
            Ok(CellIndex(value.0 / size_of::<HeapCellValue>()))
        } else {
            Err(())
        }
    }
}

With this work and a bit of a refactor, we could eliminate the heap_index! and cell_index! macros and use .into() and .try_into() instead, while benefiting from strict type checking to avoid confusing them. This could go deeper with even more encapsulation and visibility to make them actually impossible to misuse even if you tried really hard, but this would already be good enough with some discipline.

Apr 29 '25 22:04 bakaq

Just to prove the point I made at https://github.com/mthom/scryer-prolog/pull/2925#issuecomment-2840262154, I benchmarked master with all the #[inline] and #[inline(always)] removed and it measurably improved performance in release while not affecting performance in debug. Very ironic. Like I said, LLVM is very smart and putting these hints (in a single crate project) actually limits how smart it can be about optimizations.

This is basically free performance on the table, but to avoid a bunch of merge conflicts I think we should wait for rebis-dev to get merged before actually removing the inlines.

May 17 '25 07:05 bakaq