rebis-dev: Memory consumption with tabling
As I wrote for Markus @triska here: https://github.com/mthom/scryer-prolog/issues/2701#issuecomment-2859994956
I thought that rebis-dev was now better with tabling than master, but I'm afraid that's not the case in some examples. I'm afraid that even if rebis-dev tabling no longer crashes or Rust-panics on various examples, there is something inherently wrong.
I tried the tabled predicate of this file: 11.pl, with 11.txt (not included there, but I used the initial arrangement, a file with two numbers: 125 17).
With this version of run/0:
run :-
    time(
        (
            solve_tabled(75, "11.txt", Y),
            format("Task 1: ~w~n", [Y])
        )
    ),
    halt.
master ~~almost without "any"~~ with up to 0.5 GB memory consumption and with this time:
?- run.
Task 1: 65601038650482
% CPU time: 17.380s, 18_036_720 inferences
rebis-dev with up to 5 GB (!) memory consumption and with this time:
?- run.
Task 1: 65601038650482
% CPU time: 20.029s, 16_436_791 inferences
Very important finding, thank you a lot for posting this!
I would greatly appreciate your help with trying to construct a smaller example that still exhibits the issue, so that we can narrow down what exactly is responsible for the problem. For example, is there a specific language construct (DCGs, clpz, constraints, assertz/1 etc., all of which are currently used in the sample program) that you can identify as being particularly relevant to elicit the issue, or conversely: Are there constructs you can remove from the example while still demonstrating the phenomenon? If possible, please try to systematically narrow it down further, to a smaller example that also shows a similar issue. Thank you a lot!
“[A]llow me my rhetorical devices.”
Robert Harper: Boolean Blindness
This specific memory consumption (0.5 GB vs. 5 GB) with tabling has nothing to do with: clpz (I removed it and switched to standard Prolog constructs); DCGs (I removed the input parsing); assertz/1 (the tabled version doesn't use it). I also removed the cuts (!/0). In any case, the cuts don't seem to help much when tabling is enabled.
As I see it (and I, almost a non-programmer, can be wrong), in this case, narrowing the code down to a smaller example (without triggering a crash/panicking) that shows the specific memory consumption is not a simple code subtraction.
In context of bug reporting, I don't know what to expect, what I might expect, in a positive way, from tabling in Scryer. Where does Scryer's "normal", be it limited or not very performant, tabling end and bugs begin?
So, at least for now, I don't have a minimal example of this memory behaviour (master 0.5 GB vs. rebis-dev 5 GB), but a simple tabled recursive definition of Fibonacci is enough to see that rebis-dev consumes gigabytes more RAM than master.
I don't have a minimal example, but I did some amateurish thinking and experiments. In tabling that needs care, what comes first? Memory consumption, some unexpected "things", or "unexpected termination"? A word of warning in the README file might be helpful.
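For readers who want to try this, a tabled Fibonacci could look roughly like the sketch below. This is not the exact program used for the comparison above (that one was not posted); the predicate name fib/2 and the clause layout are assumptions made for illustration only.

```prolog
:- use_module(library(tabling)).

% Hypothetical reconstruction: table fib/2 so that each fib(N, F)
% is computed once and then looked up.
:- table fib/2.

fib(0, 0).
fib(1, 1).
fib(N, F) :-
    N > 1,
    N1 is N - 1,
    N2 is N - 2,
    fib(N1, F1),
    fib(N2, F2),
    F is F1 + F2.
```

A query such as `?- fib(30, F).` has the mathematical answer F = 832040. To compare memory use, one would run the same query with a larger first argument on both master and rebis-dev while watching the process in a system monitor; no specific memory figures are claimed for this sketch.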
Speaking with design images from Christopher Alexander, in my imagination: Tabling in Scryer isn't a center. It isn't a place (recursively) supported by other centers. Error messages, continuations one can trust, not mentioning documentation. Tabling in Scryer is an isolated island and one can see "things" there. Even dragons, so to speak. My imaginal README file: "Tabling: Help Wanted!"
This "via negativa" is tiresome, but relatively easy, unfortunately.
Segmentation fault.
:- use_module(library(tabling)).

:- table f/1.

f(_).

loop(0).
loop(N) :-
    N > 0,
    N2 is N - 1,
    f(N),
    loop(N2).
?- loop(104).
Segmentation fault (core dumped)
Rust panicking: [U]nreachable code. Panicking and "things". This "[...]" is my ellipsis.
?- panic(X).
true
; true
[...]
; true
; X = 0
; X = 1
[...]
; X = 17
; X = 18, _P = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,[]|_A]
; X = 19
; X = 21
; X = 22
; X = 23, _Q = project_attributes/2
; X = 24
; [...]
; X = 33
; X = 34, _P = project_attributes([],[_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,[]|project_attributes([],[_A,_B|...])])
; X = 35, _P = [], _Q = project_attributes/2
; X = 36, _Q = []
; X = 37, _P = [], _Q = project_attributes/2
; X = 38, _P = project_attributes([],[_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,[]|project_attributes([],[_A,_B|...])])
; X = 39
; X = 40
; X = 41
; true
[...]
; true
; X = 1
; X = 2
[...]
; X = 41
; X = 42, _Q = project_attributes/2
; true
[...]
; true
;
thread 'main' panicked at src/machine/unify.rs:435:25:
internal error: entered unreachable code
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Here, rebis-dev is more robust than master; master panics sooner:
?- panic(X).
true
; true
[...]
; true
; X = 0
; X = 1
; X = 2
; X = 3
; X = 4
; X = 5
; X = 6
;
thread 'main' panicked at src/machine/machine_state_impl.rs:70:26:
index out of bounds: the len is 160843 but the index is 160844
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
My "triggering" program:
:- use_module(library(tabling)).
:- use_module(library(between)).

:- table f/3.

panic(N) :-
    test(41,N) ; test(42,N) ; test(43,N).

test(Max,X) :-
    (   f(1,_,Max)
    ;   f(1,X,Max),
        f(0,X,Max)
    ).

f(0,N,Max) :-
    between(0,Max,N).
f(1,N,Max) :-
    between(0,Max,N).
No, it's not about between/3, I can write the facts explicitly "by hand" like:
f(0,0).
f(0,1).
f(0,2).
f(0,3).
f(0,4).
f(0,5).
[...]
etc. Though, with a between/3 version one can see "more things", so to speak. And, for example, one doesn't see those "things" if all the facts are programmatically asserted to the Prolog database.
There is more to it. I saw "things". Or, tabling in Scryer doesn't work as, say, tabling in SWI Prolog. And, at this moment, I'm not sure what is meaningful to report. Or, I will be, knowing something like this: Tabling is a center, Scryer's tabling should work exactly as tabling in SWI Prolog. A center. Living.
It's relatively easy, especially "via negativa" like my "negative-testing" examples, to see some "unexpected" things and "things" with tabling, even to trigger a Rust panic or a crash. In my imagination, if not to uncheck this feature (despite everything, as it is now, it can be useful in many situations), a word of warning in the README file might be helpful, and: "Tabling/continuations: Help Wanted!"
The increased memory usage is probably because rebis-dev stores strings on the heap, not in the atom table as master does. master stores partial strings in the atom table, which is never swept, unlike the heap. Because the WAM is a structure-copying Prolog VM, all heap terms that could be swept away after a heap retraction have to be copied in full. For partial strings, this wasn't necessary in master, since each partial string was written to the atom table and, once written, was a single, permanently valid referent.
What Scryer is missing vis-a-vis SWI is garbage collection, which will be my next major project after the current iteration of rebis-dev and a brief interregnum for bug fixes and quality of life improvements.
@mthom Thank you very much for your response. As a naive user, I find it easy to be unaware of these various connections...
Probably thanks to how Rust works with memory (my naive imagination), Scryer often works great (low memory usage) even without GC, so it's easy for me, with Scryer, to forget about the very idea (and usefulness) of GC completely. Of course, that's Rust plus the programmer's craft.
Thank you very much for all your work...