scryer-prolog
scryer-prolog copied to clipboard
Segmentation fault with meta-interpreter
While testing the meta-interpreter so nicely explained by @triska at https://www.youtube.com/watch?v=nmBkU-l1zyc&t=1870s the following gives an unexpected segmentation fault:
$ scryer-prolog
?- [user].
mi([]).
mi([G|Gs]) :-
head_body_(G,Goals,Gs),
mi(Goals).
head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).
head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
head_body_(Head,Goals0,Goals).
head_body_(natnum(0),Rs,Rs).
head_body_(natnum(s(X)),[natnum(X)|Rs],Rs).
?- mi([mi([mi([mi([mi([natnum(X)])])])])]).
X = 0
; X = s(0)
; X = s(s(0))
; X = s(s(s(0)))
; X = s(s(s(s(0))))
; ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])]).
X = 0
; X = s(0)
; X = s(s(0))
; X = s(s(s(0)))
; X = s(s(s(s(0))))
; ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])]).
X = 0
; X = s(0)
; X = s(s(0))
; X = s(s(s(0)))
; X = s(s(s(s(0))))
; ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])]).
X = 0
; X = s(0)
; X = s(s(0))
; X = s(s(s(0)))
; X = s(s(s(s(0))))
; ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])])])])])])]).
Segmentation fault
It happens when 9 GB of memory is consumed and my system has 12 GB memory plus 4 GB swap.
Thanks, it should make a good test case for the GC.
Very well.
@josd: Could you please consider leaving this open, since the issue still exists?
Fair enough @triska and will stay tuned!
At this moment with
$ rustc --version
rustc 1.57.0 (f1edd0429 2021-11-29)
$ scryer-prolog --version
"v0.8.123-732-g8065889"
the issue is still the same, but take your time.
@josd: Do you know where this happens exactimissiter? The reason I am asking is that there was something kind of related with SWI, right? And there, it was the writing which consumed too much space. Instead of making writing tail recursive, they (well at least) just added a clean overflow which is certainly also a good idea but constant space usage is still preferable to an overflow.
@UWN you have a good memory! There is indeed a stack overflow issue with SWI for a similar program
$ cat mif8.pl
mi([]).
mi([G|Gs]) :-
head_body_(G,Goals,Gs),
mi(Goals).
head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).
head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
head_body_(Head,Goals0,Goals).
head_body_(factorial(0,s(0)),Rs,Rs).
head_body_(factorial(s(N),F),[factorial(N,F1),prod(s(N),F1,F)|Rs],Rs).
head_body_(prod(0,_,0),Rs,Rs).
head_body_(prod(s(N),M,P),[prod(N,M,K),sum(K,M,P)|Rs],Rs).
head_body_(sum(0,M,M),Rs,Rs).
head_body_(sum(s(N),M,s(K)),[sum(N,M,K)|Rs],Rs).
run :-
mi([mi([mi([mi([mi([factorial(s(s(s(s(s(s(s(s(0)))))))),Y)])])])])]),
write(Y),
nl.
where we get
$ swipl -g run,halt mif8.pl
...
s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(
ERROR: -g run,halt: write/1: C-stack limit (8,388,608 bytes) exceeded.
ERROR: Use the shell command ulimit -s size to enlarge the limit.
and Jan replied to that at https://swi-prolog.discourse.group/t/swi-does-some-good-garbage-collection-optimisation-these-days/4588/25?u=josderoo
Now for this particular case Scryer works fine and
$ scryer-prolog -g run,halt mif8.pl
gives mif8.txt
However, for the next factorial
$ cat mif9.pl
mi([]).
mi([G|Gs]) :-
head_body_(G,Goals,Gs),
mi(Goals).
head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).
head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
head_body_(Head,Goals0,Goals).
head_body_(factorial(0,s(0)),Rs,Rs).
head_body_(factorial(s(N),F),[factorial(N,F1),prod(s(N),F1,F)|Rs],Rs).
head_body_(prod(0,_,0),Rs,Rs).
head_body_(prod(s(N),M,P),[prod(N,M,K),sum(K,M,P)|Rs],Rs).
head_body_(sum(0,M,M),Rs,Rs).
head_body_(sum(s(N),M,s(K)),[sum(N,M,K)|Rs],Rs).
run :-
mi([mi([mi([mi([mi([factorial(s(s(s(s(s(s(s(s(s(0))))))))),Y)])])])])]),
write(Y),
nl.
it segfaults again
$ scryer-prolog -g run,halt mif9.pl
Segmentation fault
and it is at about the same amount of memory consumption i.e. 9GB out of the 12 GB
$ ps -afux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 920 532 ? Sl Dec02 0:00 /init
root 95 0.0 0.0 900 80 ? Ss Dec02 0:00 /init
root 96 0.0 0.0 900 80 ? S Dec02 0:10 \_ /init
jdroo 97 0.0 0.0 13372 9116 pts/1 Ss Dec02 0:07 \_ -bash
jdroo 13292 98.4 74.4 9727572 9612080 pts/1 R+ 11:46 1:01 | \_ scryer-prolog -g run,halt mif9.pl
I don't think it is the work of OOM killer (Out Of Memory) and this is what I see with valgrind
@UWN you have a good memory!
It depends, I have according to grep mT /proc/meminfo
MemTotal: 3635504 kB p0, Ubuntu 14.04.6 LTS
MemTotal: 8042908 kB TU, Ubuntu 18.04.5 LTS
MemTotal: 131815096 kB g1, Ubuntu 21.10
So I naturally have some difficulties reproducing your findings.
Could you reformulate your query such that the actual problem is better accessible? Reproduction depends on the very precise memory configurations. With 128G it just takes much too much time...
ulrich@p0:~/scryer$ ulimit -v 100000
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?-
real 0m11.128s
user 0m2.930s
sys 0m0.289s
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?- [mif8].
true.
?- run.
Segmentation fault (core dumped)
real 0m13.405s
user 0m3.581s
sys 0m0.365s
ulrich@p0:~/scryer
Is this the problem already? As you can see, it took less than 4s to reproduce this problem on a very old laptop that now serves rather as a labtop. And without interrupting any videos...
I guess, your problem can be narrowed down to:
ulrich@p0:~/scryer$ ulimit -v
100000
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?- [user].
:- use_module(library(lambda)).
:- use_module(library(lists),[length/2]).
:- use_module(library(time)).
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).
?- {I,N}+\ ( open('/dev/null',write,S), length(_,I), I > 5, N is 2^I, n_tosx(N,SX), time(write(S,SX)) ).
% CPU time: 0.030 seconds
I = 6, N = 64
; % CPU time: 0.042 seconds
I = 7, N = 128
; % CPU time: 0.038 seconds
I = 8, N = 256
; % CPU time: 0.038 seconds
I = 9, N = 512
; % CPU time: 0.042 seconds
I = 10, N = 1024
; % CPU time: 0.047 seconds
I = 11, N = 2048
; % CPU time: 0.058 seconds
I = 12, N = 4096
; % CPU time: 0.093 seconds
I = 13, N = 8192
; % CPU time: 0.135 seconds
I = 14, N = 16384
; % CPU time: 0.239 seconds
I = 15, N = 32768
; Segmentation fault (core dumped)
real 2m8.609s
user 0m8.587s
sys 0m0.637s
In this manner the problem remains reproducible, even if the actual memory usage of Scryer changes
Thanks @UWN and that's indeed a better reproduction of the the issue. It also clearly shows that it is a failure to allocate memory:
$ ulimit -v
100000
$ cat n_tosx.pl
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).
$ scryer-prolog n_tosx.pl
?- n_tosx(10,X).
X = s(s(s(s(s(s(s(s(s(s(0)))))))))).
?- n_tosx(100,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(10000,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(100000,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000000,X).
memory allocation of 8388608 bytes failed
Aborted
Could it be that you have some silent .scryerrc
? I get always slightly different results, like
ulrich@p0:~/scryer$ cat n_tosx.pl
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f n_tosx.pl
?- n_tosx(10,X).
X = s(s(s(s(s(s(s(s(s(s(0)))))))))).
?- n_tosx(100,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(10000,X).
X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(100000,X).
X = memory allocation of 3670016 bytes failed
Aborted (core dumped)
real 0m30.474s
user 0m4.678s
sys 0m0.366s
ulrich@p0:~/scryer$ ulimit -v
100000
Who is the culprit here? Is it a memory overflow because of the large heap term. Thus a "regular" unhandled overflow like #16 or is it because of a stackoverflow in write?
Am not using .scryerrc
$ cat ~/.scryerrc
cat: /home/jdroo/.scryerrc: No such file or directory
At least the failure also occurs without write/1
$ ulimit -v
100000
$ cat n_tosx.pl
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).
$ scryer-prolog -g "n_tosx(200000,X),halt" n_tosx.pl
memory allocation of 8388608 bytes failed
Aborted
and it is again the same message memory allocation of 8388608 bytes failed
as in https://github.com/mthom/scryer-prolog/issues/1057#issuecomment-988699925
which is indeed different from your memory allocation of 3670016 bytes failed
With rebis-dev
there is now a better message:
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])])])])])])])])]).
memory allocation of 17179869184 bytes failed
Aborted
Currently, no error for:
ulrich@p0:~/ftp/scryer-prolog$ ulimit -v
100000
ulrich@p0:~/ftp/scryer-prolog$ /opt/gupu/scryer-prolog/target/release/scryer-prolog -f -g "n_tosx(476000,X),halt" n_tosx.pl