scryer-prolog icon indicating copy to clipboard operation
scryer-prolog copied to clipboard

Segmentation fault with meta-interpreter

Open josd opened this issue 3 years ago • 14 comments

While testing the meta-interpreter so nicely explained by @triska at https://www.youtube.com/watch?v=nmBkU-l1zyc&t=1870s the following gives an unexpected segmentation fault:

$ scryer-prolog
?- [user].
mi([]).
mi([G|Gs]) :-
    head_body_(G,Goals,Gs),
    mi(Goals).

head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).

head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
    head_body_(Head,Goals0,Goals).

head_body_(natnum(0),Rs,Rs).
head_body_(natnum(s(X)),[natnum(X)|Rs],Rs).

?- mi([mi([mi([mi([mi([natnum(X)])])])])]).
   X = 0
;  X = s(0)
;  X = s(s(0))
;  X = s(s(s(0)))
;  X = s(s(s(s(0))))
;  ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])]).
   X = 0
;  X = s(0)
;  X = s(s(0))
;  X = s(s(s(0)))
;  X = s(s(s(s(0))))
;  ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])]).
   X = 0
;  X = s(0)
;  X = s(s(0))
;  X = s(s(s(0)))
;  X = s(s(s(s(0))))
;  ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])]).
   X = 0
;  X = s(0)
;  X = s(s(0))
;  X = s(s(s(0)))
;  X = s(s(s(s(0))))
;  ...
?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])])])])])])]).
Segmentation fault

It happens when 9 GB of memory is consumed and my system has 12 GB memory plus 4 GB swap.

josd avatar Oct 22 '21 18:10 josd

Thanks, it should make a good test case for the GC.

mthom avatar Oct 23 '21 04:10 mthom

Very well.

josd avatar Oct 29 '21 20:10 josd

@josd: Could you please consider leaving this open, since the issue still exists?

triska avatar Oct 29 '21 21:10 triska

Fair enough @triska and will stay tuned!

josd avatar Oct 29 '21 21:10 josd

At this moment with

$ rustc --version
rustc 1.57.0 (f1edd0429 2021-11-29)
$ scryer-prolog --version
"v0.8.123-732-g8065889"

the issue is still the same, but take your time.

josd avatar Dec 05 '21 22:12 josd

@josd: Do you know where this happens exactimissiter? The reason I am asking is that there was something kind of related with SWI, right? And there, it was the writing which consumed too much space. Instead of making writing tail recursive, they (well at least) just added a clean overflow which is certainly also a good idea but constant space usage is still preferable to an overflow.

UWN avatar Dec 06 '21 09:12 UWN

@UWN you have a good memory! There is indeed a stack overflow issue with SWI for a similar program

$ cat mif8.pl
mi([]).
mi([G|Gs]) :-
    head_body_(G,Goals,Gs),
    mi(Goals).

head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).

head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
    head_body_(Head,Goals0,Goals).

head_body_(factorial(0,s(0)),Rs,Rs).
head_body_(factorial(s(N),F),[factorial(N,F1),prod(s(N),F1,F)|Rs],Rs).

head_body_(prod(0,_,0),Rs,Rs).
head_body_(prod(s(N),M,P),[prod(N,M,K),sum(K,M,P)|Rs],Rs).

head_body_(sum(0,M,M),Rs,Rs).
head_body_(sum(s(N),M,s(K)),[sum(N,M,K)|Rs],Rs).

run :-
    mi([mi([mi([mi([mi([factorial(s(s(s(s(s(s(s(s(0)))))))),Y)])])])])]),
    write(Y),
    nl.

where we get

$ swipl -g run,halt mif8.pl
...
s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(
ERROR: -g run,halt: write/1: C-stack limit (8,388,608 bytes) exceeded.
ERROR: Use the shell command ulimit -s size to enlarge the limit.

and Jan replied to that at https://swi-prolog.discourse.group/t/swi-does-some-good-garbage-collection-optimisation-these-days/4588/25?u=josderoo

Now for this particular case Scryer works fine and $ scryer-prolog -g run,halt mif8.pl gives mif8.txt

However, for the next factorial

$ cat mif9.pl
mi([]).
mi([G|Gs]) :-
    head_body_(G,Goals,Gs),
    mi(Goals).

head_body_(mi([]),Rs,Rs).
head_body_(mi([G|Gs]),[head_body_(G,Goals,Gs),mi(Goals)|Rs],Rs).

head_body_(head_body_(Head,Goals0,Goals),Rs,Rs) :-
    head_body_(Head,Goals0,Goals).

head_body_(factorial(0,s(0)),Rs,Rs).
head_body_(factorial(s(N),F),[factorial(N,F1),prod(s(N),F1,F)|Rs],Rs).

head_body_(prod(0,_,0),Rs,Rs).
head_body_(prod(s(N),M,P),[prod(N,M,K),sum(K,M,P)|Rs],Rs).

head_body_(sum(0,M,M),Rs,Rs).
head_body_(sum(s(N),M,s(K)),[sum(N,M,K)|Rs],Rs).

run :-
    mi([mi([mi([mi([mi([factorial(s(s(s(s(s(s(s(s(s(0))))))))),Y)])])])])]),
    write(Y),
    nl.

it segfaults again

$ scryer-prolog -g run,halt mif9.pl
Segmentation fault

and it is at about the same amount of memory consumption i.e. 9GB out of the 12 GB

$ ps -afux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0    920   532 ?        Sl   Dec02   0:00 /init
root        95  0.0  0.0    900    80 ?        Ss   Dec02   0:00 /init
root        96  0.0  0.0    900    80 ?        S    Dec02   0:10  \_ /init
jdroo       97  0.0  0.0  13372  9116 pts/1    Ss   Dec02   0:07      \_ -bash
jdroo    13292 98.4 74.4 9727572 9612080 pts/1 R+   11:46   1:01      |   \_ scryer-prolog -g run,halt mif9.pl

I don't think it is the work of OOM killer (Out Of Memory) and this is what I see with valgrind

josd avatar Dec 06 '21 11:12 josd

@UWN you have a good memory!

It depends, I have according to grep mT /proc/meminfo

MemTotal:         3635504 kB    p0, Ubuntu 14.04.6 LTS
MemTotal:         8042908 kB    TU, Ubuntu 18.04.5 LTS
MemTotal:       131815096 kB    g1, Ubuntu 21.10

So I naturally have some difficulties reproducing your findings.

Could you reformulate your query such that the actual problem is better accessible? Reproduction depends on the very precise memory configurations. With 128G it just takes much too much time...

ulrich@p0:~/scryer$ ulimit -v 100000
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?- 

real	0m11.128s
user	0m2.930s
sys	0m0.289s
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?- [mif8].
   true.
?- run.
Segmentation fault (core dumped)

real	0m13.405s
user	0m3.581s
sys	0m0.365s
ulrich@p0:~/scryer

Is this the problem already? As you can see, it took less than 4s to reproduce this problem on a very old laptop that now serves rather as a labtop. And without interrupting any videos...

I guess, your problem can be narrowed down to:

ulrich@p0:~/scryer$ ulimit -v
100000
ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f
?- [user].
:- use_module(library(lambda)).
:- use_module(library(lists),[length/2]).
:- use_module(library(time)).
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).


?- {I,N}+\ ( open('/dev/null',write,S), length(_,I), I > 5, N is 2^I, n_tosx(N,SX), time(write(S,SX)) ).

   % CPU time: 0.030 seconds
   I = 6, N = 64
;  % CPU time: 0.042 seconds
   I = 7, N = 128
;  % CPU time: 0.038 seconds
   I = 8, N = 256
;  % CPU time: 0.038 seconds
   I = 9, N = 512
;  % CPU time: 0.042 seconds
   I = 10, N = 1024
;  % CPU time: 0.047 seconds
   I = 11, N = 2048
;  % CPU time: 0.058 seconds
   I = 12, N = 4096
;  % CPU time: 0.093 seconds
   I = 13, N = 8192
;  % CPU time: 0.135 seconds
   I = 14, N = 16384
;  % CPU time: 0.239 seconds
   I = 15, N = 32768
;  Segmentation fault (core dumped)

real	2m8.609s
user	0m8.587s
sys	0m0.637s

In this manner the problem remains reproducible, even if the actual memory usage of Scryer changes

UWN avatar Dec 08 '21 06:12 UWN

Thanks @UWN and that's indeed a better reproduction of the the issue. It also clearly shows that it is a failure to allocate memory:

$ ulimit -v
100000

$ cat n_tosx.pl
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).

$ scryer-prolog n_tosx.pl
?- n_tosx(10,X).
   X = s(s(s(s(s(s(s(s(s(s(0)))))))))).
?- n_tosx(100,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(10000,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(100000,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000000,X).
memory allocation of 8388608 bytes failed
Aborted

josd avatar Dec 08 '21 10:12 josd

Could it be that you have some silent .scryerrc? I get always slightly different results, like

ulrich@p0:~/scryer$ cat n_tosx.pl 
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).

ulrich@p0:~/scryer$ time /opt/gupu/scryer-prolog/target/release/scryer-prolog -f n_tosx.pl
?- n_tosx(10,X).
   X = s(s(s(s(s(s(s(s(s(s(0)))))))))).
?- n_tosx(100,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(1000,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(10000,X).
   X = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(...))))))))))))))))))))).
?- n_tosx(100000,X).
   X = memory allocation of 3670016 bytes failed
Aborted (core dumped)

real	0m30.474s
user	0m4.678s
sys	0m0.366s
ulrich@p0:~/scryer$ ulimit -v
100000

UWN avatar Dec 08 '21 16:12 UWN

Who is the culprit here? Is it a memory overflow because of the large heap term. Thus a "regular" unhandled overflow like #16 or is it because of a stackoverflow in write?

UWN avatar Dec 08 '21 16:12 UWN

Am not using .scryerrc

$ cat ~/.scryerrc
cat: /home/jdroo/.scryerrc: No such file or directory

At least the failure also occurs without write/1

$ ulimit -v
100000

$ cat n_tosx.pl
n_tosx(N1,s(X)) :- N1 > 0, !, N2 is N1-1, n_tosx(N2,X).
n_tosx(0,0).

$ scryer-prolog -g "n_tosx(200000,X),halt" n_tosx.pl
memory allocation of 8388608 bytes failed
Aborted

and it is again the same message memory allocation of 8388608 bytes failed as in https://github.com/mthom/scryer-prolog/issues/1057#issuecomment-988699925 which is indeed different from your memory allocation of 3670016 bytes failed

josd avatar Dec 08 '21 17:12 josd

With rebis-dev there is now a better message:

?- mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([mi([natnum(X)])])])])])])])])])])])])])])])])])])])])])])])])])])]).
memory allocation of 17179869184 bytes failed
Aborted

josd avatar Jan 15 '22 14:01 josd

Currently, no error for:

ulrich@p0:~/ftp/scryer-prolog$ ulimit -v
100000
ulrich@p0:~/ftp/scryer-prolog$ /opt/gupu/scryer-prolog/target/release/scryer-prolog -f -g "n_tosx(476000,X),halt" n_tosx.pl

UWN avatar Jan 15 '22 19:01 UWN