swipl-devel engine_next crashes http server

Here's a minimal sample where an engine is created, queried and destroyed within the http handler:

:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).

:- http_handler('/', test, []).

test(_Request) :-
  engine_create(X, foo(X), E),
  engine_next(E, X),
  format("~d", [X]),
  engine_destroy(E).

foo(1).

:- http_server(http_dispatch, [port(5000)]).

When I request http://localhost:5000 now I get the following stacktrace:

C-stack trace labeled "crash":
  [0] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(save_backtrace+0x119) [0x10daf4309]
  [1] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(sigCrashHandler+0xe0) [0x10daf4910]
  [2] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(dispatch_signal+0x22f) [0x10da7407f]
  [3] /usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1a) [0x7fff7c45af5a]
  [5] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(PL_next_solution+0xe6c3) [0x10d9f64e3]
  [6] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(callCleanupHandler+0x364) [0x10da0e054]
  [7] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(PL_next_solution+0x15e4d) [0x10d9fdc6d]
  [8] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(callProlog+0x152) [0x10da52762]
  [9] /Users/esad/swipl-dev/lib/swipl/lib/x86_64-darwin/libswipl.8.dylib(start_thread+0x206) [0x10da8cf36]
  [10] /usr/lib/system/libsystem_pthread.dylib(_pthread_body+0x154) [0x7fff7c4646c1]
  [11] /usr/lib/system/libsystem_pthread.dylib(_pthread_body+0x0) [0x7fff7c46456d]
  [12] /usr/lib/system/libsystem_pthread.dylib(thread_start+0xd) [0x7fff7c463c5d]
Prolog stack:
  [17] time:remove_alarm_notrace/1 [PC=1 in supervisor]
  [16] system:$c_call_prolog/0 [PC=0 in top query clause]
  [15] system:setup_call_catcher_cleanup/4 <no clause>
  [13] time:call_with_time_limit/2 [PC=40 in clause 1]
  [12] http_dispatch:time_limit_action/3 [PC=53 in clause 1]
  [8] httpd_wrapper:call_handler/3 [PC=35 in clause 2]
  [7] system:catch/3 [PC=2 in clause 1]
  [6] httpd_wrapper:handler_with_output_to/5 [PC=24 in clause 1]
  [5] httpd_wrapper:handler_with_output_to/5 [PC=18 in clause 2]
  [4] httpd_wrapper:http_wrapper/5 [PC=155 in clause 1]
  [3] thread_httpd:http_process/4 [PC=64 in clause 1]
  [2] system:catch/3 [PC=2 in clause 1]
  [1] thread_httpd:http_worker/1 [PC=168 in clause 1]
  [0] system:$c_call_prolog/0 [PC=0 in top query clause]
Running on_halt hooks with status 139
Killing 72229 with default signal handlers
Segmentation fault: 11

LLDB dump:

?- time.so was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 72369 stopped
* thread #6, name = 'httpd@5000_4', stop reason = EXC_BAD_ACCESS (code=1, address=0x20)
    frame #0: 0x0000000100612f05 time.so`remove_alarm [inlined] unlinkEvent(ev=0x00000001028f02c0) at time.c:330 [opt]
   327 	    sched->first = ev->next;
   328 	
   329 	  if ( ev->next )
-> 330 	    ev->next->previous = ev->previous;
   331 	
   332 	  ev->next = ev->previous = NULL;	/* in case it's reused */
   333 	}
Target 0: (swipl) stopped.
(lldb) bt
* thread #6, name = 'httpd@5000_4', stop reason = EXC_BAD_ACCESS (code=1, address=0x20)
  * frame #0: 0x0000000100612f05 time.so`remove_alarm [inlined] unlinkEvent(ev=0x00000001028f02c0) at time.c:330 [opt]
    frame #1: 0x0000000100612ecc time.so`remove_alarm [inlined] freeEvent(ev=0x00000001028f02c0) at time.c:338 [opt]
    frame #2: 0x0000000100612ecc time.so`remove_alarm [inlined] removeEvent at time.c:701 [opt]
    frame #3: 0x0000000100612eb3 time.so`remove_alarm(alarm=<unavailable>) at time.c:927 [opt]
    frame #4: 0x00000001000bc4e3 libswipl.8.dylib`PL_next_solution(qid=<unavailable>) at pl-vmi.c:3834 [opt]
    frame #5: 0x00000001000d4054 libswipl.8.dylib`callCleanupHandler(fr=<unavailable>, reason=<unavailable>, __PL_ld=0x0000000101835c00) at pl-wam.c:0 [opt]
    frame #6: 0x00000001000b8427 libswipl.8.dylib`PL_next_solution [inlined] frameFinished(fr=<unavailable>) at pl-wam.c:751 [opt]
    frame #7: 0x00000001000b8408 libswipl.8.dylib`PL_next_solution(qid=<unavailable>) at pl-vmi.c:4256 [opt]
    frame #8: 0x0000000100118762 libswipl.8.dylib`callProlog(module=<unavailable>, goal=<unavailable>, flags=8, ex=0x000070000f107eb0) at pl-pro.c:384 [opt]
    frame #9: 0x0000000100152f36 libswipl.8.dylib`start_thread(closure=0x000000010055d120) at pl-thread.c:1804 [opt]
    frame #10: 0x00007fff7c4646c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #11: 0x00007fff7c46456d libsystem_pthread.dylib`_pthread_start + 377
    frame #12: 0x00007fff7c463c5d libsystem_pthread.dylib`thread_start + 13

Jul 16 '20 17:07 esad

(I encountered this on 8.1.x, but now I upgraded to version 8.3.3-59-g5d6358244 and this still happens)

Jul 16 '20 17:07 esad

Pushed a fix for the clib submodule. The engine destruction destroyed the timers that belong to the original OS thread rather than the engine. That is fixed. An open issue is what should happen with the timeout? Now a possible timeout does not apply to the engine and gets effectuated as the engine_next/2 return control to the original thread. That is sometimes good, notably if you use engines to communicate between threads, and sometimes bad (when dealing with an engine only supporting the current thread and optionally doing slow computations).

Jul 17 '20 07:07 JanWielemaker

(Note that when engine_destroy/1 is removed, the process still crashes, I guess it's getting garbage collected?)

I'm not familiar with engines or http code enough (yet :-) so you may need to explain the question. Which timeouts are you referring to? The way I understand it is that engines have no timeouts (coroutines) but http handlers do. So when an engine is taking time and triggering a timeout, shouldn't it be destroyed from the outside, by some kind of http server watchdog thread?

Btw. I think this might be the same bug as reported in https://swi-prolog.discourse.group/t/http-engines-and-timeouts/1631

Jul 17 '20 11:07 esad

(Note that when engine_destroy/1 is removed, the process still crashes, I guess it's getting garbage collected?)

Doesn't reproduce.

shouldn't it be destroyed from the outside, by some kind of http server watchdog thread?

That is not entirely clear. If the engine solely works for the involved HTTP handler, probably that would be best. Engines can also be used to share information between threads though and in that case you probably do not want it to be killed.

Jul 17 '20 13:07 JanWielemaker

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/http-engines-and-timeouts/1631/4

Jul 17 '20 13:07 JanWielemaker

swipl-devel swipl-devel copied to clipboard

engine_next crashes http server

swipl-devel
swipl-devel copied to clipboard