scryer-prolog

`panic!` on recursive tabled relation

Open jasagredo opened this issue 1 year ago • 8 comments

I experienced this error when solving today's Advent of Code, and I seem to be able to hit it consistently. I thought I might report it for future investigation.

The source file is here; only the second part is relevant. It is basically a recursive predicate that splits numbers and counts how many of them there are when N reaches 0.

To reproduce the panic I have to use my input file. Yours will probably also trigger it, but the sample input doesn't. The panic occurs every single time I run this query.

➜ RUST_BACKTRACE=1 scryer-prolog 11.pl
?- solve_tabled(7, "11.txt", Sol).
thread 'main' panicked at src\machine\machine_state_impl.rs:70:26:
index out of bounds: the len is 841960 but the index is 841972
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic_bounds_check
   3: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
   4: scryer_prolog::machine::attributed_variables::<impl scryer_prolog::machine::machine_state::MachineState>::gather_attr_vars_created_since
   5: scryer_prolog::machine::Machine::run_module_predicate
   6: scryer_prolog::run_binary
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

jasagredo, Dec 11 '24 21:12

Maybe related: Rust thread panics on certain input #2423

haijinSk, Dec 12 '24 10:12

Here's a minimal reproduction:

:- use_module(library(tabling)).
:- use_module(library(lists)).

:- table blink_tabled/3.

blink_tabled(0, _, 1).
blink_tabled(N, X, Xs) :-
    N > 0,
    N1 is N - 1,
    blink_tabled(N1, X, Xs).

bug(Xs) :-
    member(X, [1, 2, 3, 4, 5]),
    blink_tabled(7, X, Xs).

The bug triggers when one types ; three times after querying bug(Y).

adri326, Dec 28 '24 15:12

@adri326: Awesome test case, thank you a lot!

It is not yet minimal (i.e., shortest possible) though, since using the following version of bug/1 in your code also triggers the issue and is even shorter:

bug(Xs) :-
    member(X, [1, 2, 3, 4]),
    blink_tabled(7, X, Xs).

triska, Dec 28 '24 20:12

You're right, I think one could also remove an argument or two from blink_tabled and still trigger the bug.

I've been investigating more, and I keep circling back to AttrVarInitializer::attr_var_queue. I have no idea what its values mean: CopyTermState::copy_attr_val_lists treats them as pointers into Balls for global variables, whereas other parts of the code treat them as pointers into the heap, which become invalid whenever the pointee gets placed within a separate heap in a Ball.
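
To illustrate the general class of failure described here, the following is a minimal Rust sketch of a queue of raw heap indices going stale. It is not Scryer Prolog's actual code: the `Machine` struct, its fields, and the methods `push_attr_var`, `relocate_tail`, and `gather_checked` are hypothetical names invented for this example, and the assumption that the real panic arises this way is unverified.

```rust
// Hypothetical model; none of these types mirror Scryer Prolog's real internals.
struct Machine {
    // Stand-in for the term heap.
    heap: Vec<u64>,
    // Raw heap indices recorded for later processing.
    attr_var_queue: Vec<usize>,
}

impl Machine {
    /// Push a cell onto the heap and remember its index in the queue.
    fn push_attr_var(&mut self, cell: u64) {
        self.attr_var_queue.push(self.heap.len());
        self.heap.push(cell);
    }

    /// Move every cell from `from` onwards into a separate heap and return it.
    /// The queue is deliberately NOT updated, so its indices may go stale.
    fn relocate_tail(&mut self, from: usize) -> Vec<u64> {
        self.heap.split_off(from)
    }

    /// Checked lookup: stale indices yield `None` instead of panicking.
    fn gather_checked(&self) -> Vec<Option<u64>> {
        self.attr_var_queue
            .iter()
            .map(|&i| self.heap.get(i).copied())
            .collect()
    }
}

fn main() {
    let mut m = Machine { heap: vec![10, 20], attr_var_queue: Vec::new() };
    m.push_attr_var(30); // recorded at index 2
    m.push_attr_var(40); // recorded at index 3
    let separate_heap = m.relocate_tail(3); // index 3 now points past the end
    println!("main heap lookups: {:?}", m.gather_checked());
    println!("separate heap: {:?}", separate_heap);
    // Unchecked indexing such as `m.heap[3]` would panic here with
    // "index out of bounds", the same kind of message as in the report.
}
```

In this toy model, using `Vec::get` turns the stale index into a `None` that can be inspected, whereas plain indexing would abort the process. Whether a checked lookup, or instead rewriting the queued indices when cells move, is the right fix depends on what the queue's values are meant to denote.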

I'm stuck trying to figure out a fix to the issue, given that I couldn't find much documentation around the semantics of AttrVarInitializer :/

adri326, Dec 28 '24 21:12

Good news: With the latest rebis-dev version, I get:

?- bug(Y).
   Y = 1
;  Y = 1
;  Y = 1
;  Y = 1
;  false.

Can anyone still reproduce any of the crashes discussed in this issue? If not, then please consider closing it. Thank you a lot!

triska, May 02 '25 16:05

That is probably the test file; that one didn't trigger it before. I can try some day soon with the non-test input, which is much, much bigger.

jasagredo, May 07 '25 17:05

Markus @triska, I was thinking that rebis-dev is now better at tabling than master, but I'm afraid that's not the case in some examples like this one. Even if rebis-dev tabling no longer crashes on various examples, I'm afraid there is something inherently wrong.

Now, I tried the tabled predicate of the 11.pl file, with 11.txt (not included there, but I used the initial arrangement, a file with two numbers: 125 17).

With this version of run/0:

run :-
    time(
        (
            solve_tabled(75, "11.txt", Y),
            format("Task 1: ~w~n", [Y])
        )
    ),
    halt.

Parsing numbers from the input text file might be simplified, but it's not important here.

master runs with almost no memory consumption and this time:

?- run.
Task 1: 65601038650482
   % CPU time: 17.380s, 18_036_720 inferences

rebis-dev runs with up to approximately 5 GB (!) of memory consumption and this time:

?- run.
Task 1: 65601038650482
   % CPU time: 20.029s, 16_436_791 inferences

haijinSk, May 07 '25 19:05

@haijinSk: Very valuable, thank you a lot! Could you please post this in a new issue so that we can discuss it separately, and so that it also remains open even after the crash is addressed and the present issue is closed?

triska, May 07 '25 19:05