fibers
fibers copied to clipboard
Memory leak on choice operation of 'get' and 'sleep'
The code below exhibits a memory leak with Fibers 1.3.1:
(use-modules (fibers)
(fibers channels)
(fibers operations)
(fibers timers)
(ice-9 textual-ports)
(ice-9 match)
(rnrs bytevectors))
(include "heap-profiler.scm")
(sigaction SIGINT
(lambda _
(profile-heap)
(primitive-exit 0)))
(setvbuf (current-output-port) 'none)
(run-fibers
(lambda ()
(define channel
(make-channel))
(spawn-fiber
(lambda ()
(let loop ()
(put-message channel 'hello)
(sleep 0.0001)
(loop))))
(spawn-fiber
(lambda ()
(let loop ()
(format #t "~%heap: ~a MiB~%"
(round
(/ (assoc-ref (gc-stats) 'heap-size) (expt 2. 20))))
(sleep 1)
(loop))))
(let loop ()
(perform-operation
(choice-operation (wrap-operation (sleep-operation 100)
(const 'timeout))
(get-operation channel)))
(loop)))
#:hz 0
#:parallelism 1)
(The heap-profiler.scm file comes from here.)
A typical run looks like this:
$ guile fibers-leak.scm
WARNING: (guile-user): imported module (fibers) overrides core binding `sleep'
heap: 4.0 MiB
heap: 7.0 MiB
heap: 9.0 MiB
heap: 12.0 MiB
heap: 16.0 MiB
heap: 16.0 MiB
heap: 22.0 MiB
heap: 21.0 MiB
heap: 22.0 MiB
heap: 29.0 MiB
heap: 29.0 MiB
heap: 29.0 MiB
heap: 37.0 MiB
heap: 37.0 MiB
heap: 37.0 MiB
C-c C-c
;;; (#<procedure heap-sections ()>)
;;; (sampling 100000)
% type self avg obj size
54.4 program 1,565,456 32.1
19.8 pair 571,072 16.0
8.0 unknown 229,216 1776.9
7.0 struct 201,056 51.6
2.5 partial-continuation 70,816 32.0
2.0 pointer 56,560 16.0
1.8 vector 52,363 50.8
0.8 stringbuf 23,744 68.2
0.8 vm-continuation 23,152 17.3
0.7 symbol 21,408 32.0
0.7 atomic-box 20,640 17.2
0.4 bytevector 11,120 358.7
0.2 variable 6,608 20.5
0.2 string 6,080 32.0
0.2 smob 5,344 32.0
0.2 heap-number 4,496 30.2
0.1 hash-table 4,256 32.0
0.1 weak-table 3,600 28.8
0.0 primitive 768 16.0
0.0 weak-vector 384 19.2
0.0 bitvector 224 37.3
0.0 keyword 192 16.0
0.0 dynamic-state 176 16.0
0.0 primitive-generic 96 32.0
0.0 fluid 48 16.0
0.0 port 32 32.0
sampled heap: 2.74554 MiB (heap size: 36.64453 MiB)
This suggests delimited continuations are being accumuated somewhere. It's all fine when commenting out the sleep operation.
This is because the "block" function of timer-operation calls schedule-task-at-time, adding a thunk to the scheduler's queue, and there's nothing that undoes this when the operation "loses" in a choice operation.
Eventually that thunk gets scheduled and removed from the run queue, but that might take too long and cause the memory leak we're seeing.
In the example above, changing from (sleep-operation 100) to (sleep-operation 1) makes a big difference: heap size quickly stabilizes because those timers are removed every second, which is fast enough.
It seems to me that there needs to be a way to evacuate those sleep-operation timers explicitly when a sleep operation is "canceled", to avoid that problem, but operations don't have a "cancel" function. Thoughts? Cc: @wingo
(Assigning the ... assignment based on what's happening)
This is because the "block" function of
timer-operationcallsschedule-task-at-time, adding a thunk to the scheduler's queue, and there's nothing that undoes this when the operation "loses" in a choice operation.…
It seems to me that there needs to be a way to evacuate those
sleep-operationtimers explicitly when a sleep operation is "canceled", to avoid that problem, but operations don't have a "cancel" function. Thoughts? Cc: @wingo
This is what the unique-to-Concurrent ML basic combinator withNack (spelled nack-guard-evt in Racket) is supposed to provide!
See https://github.com/wingo/fibers/issues/97 and Reppy, Concurrent Programming in ML § 6.5.