woo
Can't catch some errors thrown from worker threads.
One common case is a client that opens a connection but never sends a request. Eventually that produces a timeout error, which bubbles up to the debugger. That one is hard to reproduce, though. Here's an easier one, triggered by returning an invalid response body:
(woo:run (lambda (env)
           `(200 (:content-type "application/octet-stream")
                 (#(1 2 3 4 5))))
         :num-workers 2)
The FAST-HTTP.ERROR:CB-MESSAGE-COMPLETE this signals can't be caught anywhere in application code, because the error happens in a worker thread and handlers established in the thread that called woo:run are not active there. It has nowhere to go but the debugger. This is pretty convenient for development, but a deal-killer for production code.
The woo:run function could accept an error-handling function that the worker thread would install with HANDLER-BIND, like this:
(defun make-worker (process-fn when-died error-handler)
  (let* ((dequeue-async (cffi:foreign-alloc '(:struct lev:ev-async)))
         (stop-async (cffi:foreign-alloc '(:struct lev:ev-async)))
         (worker (%make-worker :dequeue-async dequeue-async
                               :stop-async stop-async
                               :process-fn process-fn))
         (worker-lock (bt:make-lock)))
    (lev:ev-async-init dequeue-async 'worker-dequeue)
    (lev:ev-async-init stop-async 'worker-stop)
    (setf (worker-thread worker)
          (bt:make-thread
           (lambda ()
             (tagbody
              begin
                (restart-case
                    (handler-bind ((t error-handler))
                      (bt:acquire-lock worker-lock)
                      (let ((*worker* worker))
                        (wev:with-sockaddr
                          (unwind-protect
                               (wev:with-event-loop ()
                                 (setf (worker-evloop worker) *evloop*)
                                 (bt:release-lock worker-lock)
                                 (lev:ev-async-start *evloop* dequeue-async)
                                 (lev:ev-async-start *evloop* stop-async))
                            (unless (eq (worker-status worker) :stopping)
                              (vom:debug "[~D] Worker has died" (worker-id worker))
                              (funcall when-died worker))
                            (finalize-worker worker)
                            (vom:debug "[~D] Bye." (worker-id worker))))))
                  (abort-worker-thread () :report "Abort the Woo worker")
                  (restart-worker () :report "Restart the worker"
                    (go begin)))))
           :initial-bindings (default-thread-bindings)
           :name "woo-worker"))
    (sleep 0.1)
    (bt:acquire-lock worker-lock)
    worker))
Then, the app's error handler could invoke the abort-worker-thread restart to abort the worker thread, or restart-worker to start the worker up again.
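To make the handler side concrete, here is a rough sketch of what such an app-supplied handler might look like. It is not Woo code: RUN-WITH-WORKER-RESTARTS is a hypothetical stand-in that mimics only the TAGBODY / RESTART-CASE / HANDLER-BIND shape of the proposed worker body (no threads, no event loop), and MAKE-LIMITED-RESTART-HANDLER and DEMO are likewise made-up names. The sketch also binds the handler to ERROR rather than T, which is an assumption on my part so that warnings and other non-error conditions pass through untouched.

```lisp
;; Hypothetical stand-in for the worker body in the proposed MAKE-WORKER.
;; Same control structure, but it runs in plain Common Lisp.
(defun run-with-worker-restarts (body-fn error-handler)
  "Call BODY-FN with ERROR-HANDLER installed and both restarts available.
Returns :FINISHED if BODY-FN returns normally, or :ABORTED if the handler
invokes the ABORT-WORKER-THREAD restart."
  (let ((result :finished))
    (tagbody
     begin
       (restart-case
           (handler-bind ((error error-handler))
             (funcall body-fn))
         (abort-worker-thread ()
           :report "Abort the Woo worker"
           (setf result :aborted))
         (restart-worker ()
           :report "Restart the worker"
           (go begin))))
    result))

;; Hypothetical app-side policy: restart a few times, then give up.
;; Note: with real worker threads this closure's counter would be shared
;; by every worker that uses the same handler and would need locking.
(defun make-limited-restart-handler (max-restarts)
  "Return a handler that invokes RESTART-WORKER up to MAX-RESTARTS times
and ABORT-WORKER-THREAD after that."
  (let ((restarts 0))
    (lambda (condition)
      (format *error-output* "~&Worker error: ~A~%" condition)
      (if (< restarts max-restarts)
          (progn
            (incf restarts)
            (invoke-restart 'restart-worker))
          (invoke-restart 'abort-worker-thread)))))

;; Simulate a worker body that fails on every attempt.
(defun demo ()
  "Return the final status and the number of times the body ran."
  (let ((attempts 0))
    (values
     (run-with-worker-restarts
      (lambda ()
        (incf attempts)
        (error "simulated worker failure"))
      (make-limited-restart-handler 2))
     attempts)))
```

Calling (demo) runs the failing body three times (the first attempt plus two restarts) and then aborts, returning :ABORTED and 3. Because the handler runs under HANDLER-BIND, it executes before the stack unwinds, so it could also log a backtrace before choosing a restart.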
It turns out that you don't have to be using :num-workers to hit this. Errors can also be uncatchable in single-threaded operation.