Debug & Restart a Crashed Zed Lake

Open jameskerr opened this issue 1 year ago • 3 comments

This issue was originally reported by a community user in a Slack thread. They were running Zui Insiders release 0.30.1-21.

When the app triggers a bug in the Zed lake backend, the backend sometimes crashes. It would be nice to show the crash in the interface and provide a way to restart the backend without restarting the app.

The app could also detect the crash automatically and restart the backend for you.

When the backend crashes, we'll need to show the error in the app, and maybe also link to the log file on the file system so it can be opened in another app.
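To make "show what the error was" concrete, here is a rough sketch of one way the app could hold on to the last few lines the backend printed before it died. This is only an illustration: `OutputTail` and its methods are hypothetical names, not anything in the Zui codebase, and it assumes the app receives the backend's output as string chunks (for example from a Node child process's stderr stream).

```typescript
// Hypothetical helper: keeps the last `maxLines` lines of process output
// so they can be shown in the UI after a crash. Assumes maxLines >= 1.
class OutputTail {
  private lines: string[] = [];
  private partial = ""; // trailing text that has not ended in a newline yet
  private readonly maxLines: number;

  constructor(maxLines: number) {
    this.maxLines = maxLines;
  }

  // Feed a raw chunk of output; chunks may split a line in the middle.
  push(chunk: string): void {
    const parts = (this.partial + chunk).split("\n");
    this.partial = parts.pop() ?? "";
    this.lines.push(...parts);
    if (this.lines.length > this.maxLines) {
      // Drop the oldest lines so memory use stays bounded.
      this.lines = this.lines.slice(-this.maxLines);
    }
  }

  // The retained tail, including any unterminated final line.
  text(): string {
    const all = this.partial ? [...this.lines, this.partial] : this.lines;
    return all.slice(-this.maxLines).join("\n");
  }
}
```

If the backend is started with Node's `child_process.spawn`, something like `child.stderr.on("data", (d) => tail.push(d.toString()))` could feed it, and `tail.text()` could be displayed when the `exit` event fires. Whether the panic text actually reaches stderr is a separate question.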

jameskerr avatar Jul 19 '22 17:07 jameskerr

Automatic restart would need a circuit breaker that counts how many times it has tried to start up, so that we don't retry forever.
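A minimal sketch of what such a circuit breaker might look like, assuming we allow at most a fixed number of restarts inside a sliding time window. `RestartBreaker`, `maxRestarts`, and `windowMs` are hypothetical names chosen for illustration, and the sliding window is one possible policy, not a decision made in this issue.

```typescript
// Hypothetical helper: decides whether a crashed backend may be restarted.
class RestartBreaker {
  private crashTimes: number[] = [];
  private readonly maxRestarts: number;
  private readonly windowMs: number;

  constructor(maxRestarts: number, windowMs: number) {
    this.maxRestarts = maxRestarts;
    this.windowMs = windowMs;
  }

  // Record a crash at time `now` (in ms) and report whether a restart is allowed.
  shouldRestart(now: number = Date.now()): boolean {
    // Forget crashes that have aged out of the sliding window.
    this.crashTimes = this.crashTimes.filter((t) => now - t < this.windowMs);
    this.crashTimes.push(now);
    // Once there are more crashes in the window than allowed restarts, stop.
    return this.crashTimes.length <= this.maxRestarts;
  }
}
```

With `new RestartBreaker(3, 60_000)`, the first three crashes within a minute would each allow a restart and the fourth would not, at which point the app could show the error and a manual restart button instead. Once the earlier crashes age out of the window, automatic restarts are allowed again.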

jameskerr avatar Jul 19 '22 17:07 jameskerr

Note that even before we start presenting the crash/log info nicely in the app, there is a more basic gap: even if the user knows where to find the zlake.log for the zed serve process launched by the app, the panic details currently don't show up there.

This is shown in the attached video, in which I reproduced the crash with:

  • The app the reporting user was running (Zui Insiders 0.30.1-21),
  • The ZNG zed-sample-data, and
  • The user's query that triggered the crash: summarize allClientIPCount := dcount(id.orig_h).

https://user-images.githubusercontent.com/5934157/179841930-3f337651-1701-4f83-84cb-2f17893b1a83.mp4

philrz avatar Jul 19 '22 20:07 philrz

Note that https://github.com/brimdata/zed/issues/4080 was another incident where an unexpected zed serve failure made for a bad debug UX due to lack of detail in the logs.

philrz avatar Sep 09 '22 22:09 philrz