plz.el
plz.el copied to clipboard
Emacs process sentinel-related issues
See also:
- https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50166
- https://github.com/alphapapa/ement.el/issues/38
@niklaseklund If you are encountering this problem or one related to it, please provide more information to improve the chances of solving it.
FYI, to anyone reading this, the issues mentioned here don't seem to be problematic in practice. Ement.el, plz
's primary user so far, is now on ELPA and used by at least a few hundred people without any apparent problems.
Small update: In looking at CI results lately, it seems that the current Emacs snapshot version may have improved on this problem slightly in that sometimes all of the tests pass (however, note that the linting pass fails due to changes in the cl-labels
form's indentation, and that makes the whole CI job appear to fail even when the test phase passes). But it does not appear to be completely solved, because sometimes some of the tests still appear to randomly fail. So maybe some kind of progress has been made in the mysterious process.c
...
Have just had a possible insight into this problem: In the following source code:
(defun hyperdrive-entry-at (version entry)
"Return ENTRY at its hyperdrive's VERSION, or nil if not found."
;; Use `hyperdrive-copy-tree', because `copy-tree' doesn't work on
;; records/structs, and `copy-hyperdrive-entry' doesn't copy deeply,
;; and we need to be able to modify the `etc' alist of the copied
;; entry separately.
(let ((entry (hyperdrive-copy-tree entry t)))
(setf (alist-get 'with-version-p (hyperdrive-entry-etc entry)) (and version t))
(when version
(setf (hyperdrive-entry-version entry) version))
(condition-case err
(hyperdrive-fill entry :then 'sync)
(plz-http-error
(pcase (plz-response-status (plz-error-response (caddr err)))
(404 nil)
(_ (signal (car err) (cdr err))))))))
(cl-defun hyperdrive-fill
(entry &key queue then
(else (lambda (plz-error)
;; FIXME: Use a message instead of a warning for
;; now, because the 404 errors for filenames with
;; spaces are annoying as warnings.
(hyperdrive-message (format "hyperdrive-fill: error: %S" plz-error) )
;; (display-warning 'hyperdrive
;; (format "hyperdrive-fill: error: %S" plz-error))
)))
"Fill ENTRY's metadata and call THEN.
If THEN is `sync', return the filled entry and ignore ELSE.
Otherwise, make request asynchronously and call THEN with the
filled entry; or if request fails, call ELSE (which is passed to
`hyperdrive-api', which see. If QUEUE, make the fill request in
the given `plz-queue'"
(declare (indent defun))
(pcase then
('sync (hyperdrive--fill entry
(plz-response-headers
(with-local-quit
(hyperdrive-api 'head (hyperdrive-entry-url entry)
:as 'response
:then 'sync
:noquery t)))))
(_ (hyperdrive-api 'head (hyperdrive-entry-url entry)
:queue queue
:as 'response
:then (lambda (response)
(funcall then (hyperdrive--fill entry (plz-response-headers response))))
:else else
:noquery t))))
When stepping through hyperdrive-entry-at
with Edebug, the code works correctly: the condition-case
catches the error signaled from the plz--sentinel
sentinel function. But after re-evaluating the definition of hyperdrive-entry-at
, the error signaled from the sentinel is not caught by the condition-case
, and the signal propagates all the way up, resulting in this error:
error in process sentinel: plz: HTTP error: "plz--sentinel: HTTP error", #s(plz-error nil #s(plz-response 1.1 404 ((content-type . "text/plain;charset=UTF-8") (date . "Tue, 18 Apr 2023 21:31:07 GMT") (connection . "keep-alive") (keep-alive . "timeout=5")) "
Process plz-request-curl stderr finished
") nil)
(Note the Process plz-request-curl stderr finished
junk, for which a solution is being attempted in this branch: https://github.com/alphapapa/plz.el/tree/wip/no-mix-stderr
Today, thanks to @josephmturner's help, I finally put together this puzzle piece from the Info page (elisp)Sentinels
:
If an error happens during execution of a sentinel, it is caught automatically, so that it doesn’t stop the execution of whatever programs was running when the sentinel was started. However, if ‘debug-on-error’ is non-‘nil’, errors are not caught.
I wrote a post to emacs-devel about the issue, asking for guidance and whether Emacs should be patched to change this behavior: https://lists.gnu.org/archive/html/emacs-devel/2023-05/msg00251.html In that message I also noted how request.el handles this issue, and a potential workaround.
There is absolutely some sort of issue with process handling in Emacs, at least when in batch mode. I've been developing some tools with the heavy use of make-process
but always had problems, especially with the use of many processes launched at once. I observed that sometimes when a process finishes and that process has output right at the end of execution, that output does not go through the process filter and into the process buffer before the process sentinel is called. In other words, sometimes the sentinel with the "finished\n" event is called before the final process filter call. The filter is essential for receiving process output and inserting the output into the process buffer. This happens even when no filter or sentinel is set on a process.
In my case, I found that the following code I hacked together helps:
(defun advice--make-sentinel-waiter (sentinel)
(lambda (process event)
(with-local-quit
(accept-process-output process 0.1))
(when sentinel (funcall sentinel process event))))
(defun advice--make-process (oldfun &rest args)
(let ((sentinel (plist-get args :sentinel)))
(let ((args (copy-sequence args)))
(setf args (plist-put args :sentinel
(advice--make-sentinel-waiter sentinel)))
(let (process actually-live)
(while (not actually-live)
(when process
(message "(%s) Failed to start process, retrying" (process-name process)))
(setf process (apply oldfun args)
actually-live (process-live-p process)))
process))))
(defun advice--set-process-sentinel (oldfun process sentinel)
(funcall oldfun process (advice--make-sentinel-waiter sentinel)))
(advice-add 'make-process :around #'advice--make-process)
(advice-add 'set-process-sentinel :around #'advice--set-process-sentinel)
What this does is modify make-process
and the sentinel mechanisms of it. It ensures that any sentinel that is set on the process will allow an extra 'cycle' of accept-process-output
first before being called. This allows the filter to be called on the tail output of the process and have the output handled as intended. Also, the make-process
advice will make sure that a process is actually live after a call to make-process
, though the message is never called so I don't think this is a problem, just one of the debugging methods I have employed to figure out a workaround.
This one misbehavior I have found with the solution completely solved my issues in my projects. If this helps get you at least a bit further that would be great.
@piknik Wow, thanks for sharing that. Did you report an Emacs bug with that info and code? Also, what versions of Emacs did you see that behavior on? I don't think I've seen that exact problem in my projects, so I wonder if it's been fixed since then. AFAIK Emacs 27 and/or 28 had some minor fixes in process.c
.
Although, I'm not completely sure if I haven't seen that problem. My current intuition is that, when I update plz
to no longer signal errors from the sentinel, it will solve the problem that manifests as plz
sometimes returning nil (or passing nil to the callback) instead of an intended value; but if I'm wrong about that, it might be the same problem you've described.
@alphapapa I haven't run into any issue specifically when errors are invoked or caught in a sentinel. In fact, my projects have a nested sentinel call stack, where sentinels can make other processes and handle errors from them. I believe that the C code hands off execution to elisp in the accept-process-output
call such that a call to accept-process-output
is like a call to another sentinel or filter, but that's my intuition from my observation.
I'm using GNU Emacs 29.0.91 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.37, cairo version 1.17.8) of 2023-05-23
, git version, so this is definitely a persistent issue, perhaps decades old. I know that for me, I sometimes had output that was nil
where output was expected, but the workaround seems to have fixed that.
I am not familiar with the Emacs bug reporting process and have never submitted Emacs bugs. Would I be right to follow the EmacsWiki entry on Emacs Bug Tracker?
@alphapapa I haven't run into any issue specifically when errors are invoked or caught in a sentinel. In fact, my projects have a nested sentinel call stack, where sentinels can make other processes and handle errors from them.
Hm, well, IIUC that's maybe not a good idea. See my recent post to emacs-devel and Eli Zaretskii's reply: https://lists.gnu.org/archive/html/emacs-devel/2023-05/msg00251.html He says that one should definitely not signal errors from inside a sentinel, which I would presume would apply to anything called from a sentinel, i.e. the whole call stack.
I believe that the C code hands off execution to elisp in the
accept-process-output
call such that a call toaccept-process-output
is like a call to another sentinel or filter, but that's my intuition from my observation.
This is basically described in the Elisp manual section 39.10 on process sentinels. But even that has a limit on how much detail it goes into.
I'm using
GNU Emacs 29.0.91 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.37, cairo version 1.17.8) of 2023-05-23
, git version, so this is definitely a persistent issue, perhaps decades old. I know that for me, I sometimes had output that wasnil
where output was expected, but the workaround seems to have fixed that.
Hm, that does sound similar to the problems I've experienced.
I am not familiar with the Emacs bug reporting process and have never submitted Emacs bugs. Would I be right to follow the EmacsWiki entry on Emacs Bug Tracker?
Maybe, but the EmacsWiki is not an official resource and may be outdated or incorrect. Basically, the thing to do is M-x report-emacs-bug RET
. Write as clear and complete a report as you can, then email it as instructed there, and wait for a reply from the maintainers. If you do, you should probably also mention the report I filed, https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50166, and also provide a link to this GitHub issue where we've discussed it. And then please link your bug report here as well. :)
Hm, well, IIUC that's maybe not a good idea. See my recent post to emacs-devel and Eli Zaretskii's reply: https://lists.gnu.org/archive/html/emacs-devel/2023-05/msg00251.html He says that one should definitely not signal errors from inside a sentinel, which I would presume would apply to anything called from a sentinel, i.e. the whole call stack.
Interesting. With my use case of Emacs primarily in batch mode, I have a custom debugger that requires debug-on-error
to be non-nil for the custom debugger to be invoked during execution of normal-top-level
. I suppose inadvertently I've made my code dependent on this behavior and would cause me problems if the sentinels were ran in a GUI, so thanks for bringing this up to me to correct my assumptions. This would mean that when we use accept-process-output
we must check for an error variable of sorts set by a sentinel after this call and signal the error when appropriate.
This behavior still doesn't explain why the sentinel sometimes runs before the final filter call. At some point I'll look into plz
and run the tests, do some poking around.
Maybe, but the EmacsWiki is not an official resource and may be outdated or incorrect. Basically, the thing to do is
M-x report-emacs-bug RET
. Write as clear and complete a report as you can, then email it as instructed there, and wait for a reply from the maintainers. If you do, you should probably also mention the report I filed, https://debbugs.gnu.org/cgi/bugreport.cgi?bug=50166, and also provide a link to this GitHub issue where we've discussed it. And then please link your bug report here as well. :)
Thanks for the info
Hey guys,
Using GNU Emacs 28.2 (build 2, x86_64-w64-mingw32) of 2022-09-13
on Windows 10
I wasn't too sure where to discuss this issue, but here seemed to be the right place as it seems to be a process sentinel issue. The error I'm getting first arose when I tried to use ement.el
When running either ement-connect
or (plz 'get "https://httpbin.org/user-agent")
I get the error:
error in process sentinel: plz: Curl error: "plz--sentinel: Curl error", #s(plz-error (2 . "Failed to initialize.") nil nil)
@kaibagley Hello,
That error means that curl didn't recognize the arguments given to it. Windows's version of "curl" is not the real curl, it's some ersatz Microsoft version. Please see: https://github.com/alphapapa/ement.el/issues/38#issuecomment-1525586128
@piknik FWIW, I've made some changes in a branch that seem to solve the problem of sometimes getting a nil return value from a plz
request. https://github.com/alphapapa/plz.el/tree/wip/no-signal-from-sentinel I didn't copy your workarounds exactly, but I did add a simple check to ensure that the sentinel had run and returned a result, and to call the sentinel again if not, and that seems to have made a big difference: https://github.com/alphapapa/plz.el/compare/wip/no-signal-from-sentinel#diff-86628899659777b5eeb393a8c5c3471f4bdbe10ed16a653a12705cee1225d07dR487-R491 Even the tests all pass now, reliably (as long as the httpbin.org server isn't overloaded; I sometimes use a local Docker container for that part).
I intend for that branch to be a future version of plz; after it's merged I'll let it mellow on master
for a while.
This seems to be fixed in v0.7 and on master
now. I'll leave this issue open, though, because of the longstanding Emacs process-handling issues, and in case they crop up again...
AFAIK this is solved now, so closing.