cl-async
cl-async copied to clipboard
Don't use T or CONDITION in HANDLER-CASE or non-local-exiting HANDLER-BIND clauses (also affects blackbird)
I've noticed there's repeating pattern of
(handler-case ... (t () ..)) and (handler-case ... (condition () ...))
in cl-async & blackbird code. Unfortunatelly this may lead to unwanted
non-local exits and some very hard-to-debug problems.
The problem is, not every condition in Common Lisp is necessarily causing stack unwinding. For instance, WARN is implemented using conditions, too. HANDLER-BIND can be used to handle conditions without causing the non-local exit:
CL-USER> (handler-bind ((warning
#'(lambda (w)
(format t "got warning: ~s~%" w))))
(progn
(warn "test")
(princ "zzz")))
got warning: #<SIMPLE-WARNING "test" {100C7B5883}>
WARNING: test
zzz
WARN here signals a SIMPLE-WARNING condition which still allows
(princ "zzz") form to be evaluated.
HANDLER-CASE is usually implemented internally using HANDLER-BIND or
some internal variant of it. It's important difference is that any
condition handled by HANDLER-CASE always causes a non-local exit
(stack unwinding).
SIGNAL function is the fundamental mechanism which is used to signal
conditions. ERROR function is implemented in terms of SIGNAL, but
invokes the debugger in case the condition wasn't handled with a
non-local exit.
There's a separate superclass for conditions that are designed to
cause non-local exit, SERIOUS-CONDITION. It's subclasses are ERROR and
some other conditions that shouldn't be considered plain errors that
are handled as usual, such as stack overflow errors. For instance, in
SBCL:
CL-USER> (mapcar #'class-name (sb-pcl:class-direct-subclasses (find-class 'serious-condition)))
(SB-DI:DEBUG-CONDITION SB-SYS:INTERACTIVE-INTERRUPT TIMEOUT STORAGE-CONDITION
ERROR)
CL-USER> (mapcar #'class-name (sb-pcl:class-direct-subclasses (find-class 'storage-condition)))
(SB-KERNEL::HEAP-EXHAUSTED-ERROR SB-KERNEL::ALIEN-STACK-EXHAUSTED
SB-KERNEL::BINDING-STACK-EXHAUSTED
SB-KERNEL::CONTROL-STACK-EXHAUSTED
SB-INT:SIMPLE-STORAGE-CONDITION)
(TIMEOUT here is related to SB-EXT:WITH-TIMEOUT).
Here lies the problem: handling every condition using HANDLER-CASE causes
non-local exit even in unwanted cases. For example:
CL-USER> (handler-case
(progn
(warn "test")
(princ "zzz"))
(condition ()
(princ "oops")))
oops
Note that zzz isn't printed, although WARN isn't normally supposed
to alter the normal program flow. Same goes for T:
CL-USER> (handler-case
(progn
(warn "test")
(princ "zzz"))
(t ()
(princ "oops")))
oops
On the other hand:
CL-USER> (handler-case
(progn
(warn "test")
(princ "zzz"))
(error ()
(princ "oops")))
WARNING: test
zzz
Specifying SERIOUS-CONDITION instead of ERROR may not always be a
right option, as you probably don't want to wrap things like stack
overflows in normal cl-async or blackbird error handling mechanisms.
This problem affects both blackbird and cl-async. As of
cl-async, I'm currently readying some important set of changes (see
#107) that would be nice to have in cl-async so fixing the problem
right now may probably cause some merge conflicts, so probably I should fix it myself as the part of these changes (if you're willing to integrate them).
See also (besides CLHS): http://www.nhplace.com/kent/Papers/Condition-Handling-2001.html http://www.gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html
Thanks, you are not the first person to bring this up, and it points to a lack of understanding in condition handling on my part (see https://github.com/orthecreedence/blackbird/issues/8). I have read about condition handling and also specifically handler-bind and its differences from handler-case and how to use it more effectively, but my general understanding is still fairly limited. This is something I am more than willing to fix in the libraries so that they are more correct/useful for other people's typical workflows.
My error handling background typically consists of try {} catch () {} so some of the more nuanced aspects of condition handling in CL have been hard for me to grasp (but mainly I have not really done the research yet because I did not know it was a problem in the first place).
I'll review the links you sent over and also would just appreciate any feedback (or pull requests) you have on how to make the condition handling more appropriate.
Ok, biggish updates with regard to error handling here. cl-async no longer catches errors and passes them to the nearest event-cb. The only conditions passed to event-cb for any operation are actual cl-async conditions (streamish-eof, socket-reset, ...) spawned by libuv.
When using :catch-app-errors t, all non-cl-async errors are trapped and sent to the callback given under :caught-errors:
(as:with-event-loop (:catch-app-errors t :caught-errors (lambda (err) ...))
...)
This allows for a production mode of sorts where a rogue error won't crash the event loop entirely, just the "thread" that the error occurred on.
When using :catch-app-errors nil (the default), all non-cl-async conditions are allowed to pass, unfettered, to the top-level (and enter the debugger in the case of errors). This allows the use of restarts as well as interactive debugging of errors, with full backtraces.
Also, I added a continue-event-loop restart to cl-async-util::call-with-callback-restarts that passes the error to :caught-errors if invoked. This lets you achieve :catch-app-errors t behavior on an error-by-error basis.
Also, exporting abort-callback, continue-event-loop, exit-event-loop restarts from cl-async-util (but not cl-async) for easier auto restart invocation.