kiwix-tools

kiwix-serve crashes on mount/remount

Open benjaoming opened this issue 6 years ago • 7 comments

Ways to reproduce:

Point kiwix-serve to an external device. Unplug and plug back in. kiwix-serve is now dead.

Expected / better behavior:

kiwix-serve tells the end-user (in the browser) that it's unavailable until the library location is available again and then resumes.

Cases for this:

  • Someone unplugs a cable by mistake
  • A power outage leaves a battery-powered device (laptop) running but cuts power to the external USB storage
  • System starts kiwix-serve on boot before the served location is ready.

benjaoming, Oct 30 '19 15:10

Here is the crash dump I had.

Sep 26 11:04:16 fair-server systemd[1]: kiwix.service: Control process exited, code=exited status=1
Sep 26 11:04:16 fair-server systemd[1]: kiwix.service: Failed with result 'exit-code'.
Sep 26 11:15:37 fair-server kiwix[12453]: terminate called after throwing an instance of 'std::ios_base::failure'
Sep 26 11:15:37 fair-server kiwix[12453]:   what():  Cannot read char.

benjaoming, Jan 06 '20 19:01

@benjaoming I guess you understand what is going on behind the scenes. I'm not sure handling this use case properly is trivial.

kelson42, Jan 06 '20 19:01

@kelson42 I'm sure it's not trivial, nothing like that implied.

Suggesting to try handling appropriate I/O errors (as much as possible, maybe just start with the most common errors) and displaying a "library not available" error message in kiwix-serve.

benjaoming, Jan 06 '20 19:01

@benjaoming I tried with the latest 3.2.0-1 on GNU/Linux and, for me, it now does not crash (anymore?). Instead, kiwix-serve gets stuck and the page keeps loading without end.

kelson42, Feb 03 '22 16:02

@mgautierfr @veloman-yunkan To me, this is still not the proper behaviour. Proper behaviour would be to return a 404 and remove the book from the internal library.

kelson42, Jan 31 '23 06:01

I don't think there's a single valid answer for this, but we have to make a choice. Here's my opinion:

  • I/O errors can be temporary and/or can affect only part of a ZIM file. Therefore, removing a book from the library on an I/O error seems too strict. It doesn't give the book any chance of coming back.
  • An expected book that fails to read should raise a 500 error, not a 404. If the book gets removed from the library, subsequent responses will obviously be 404.
  • I wonder how --monitorLibrary behaves in this case (assuming the library file itself is on the disappearing fs), but in kiwix-serve it might be interesting for weak mount points:
    • library disappears, kiwix-serve is notified and empties the library (might be interesting to have the library datetime in the home UI btw)
    • library file comes back, kiwix-serve is notified and re-reads the library
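
The disappear/come-back cycle described above can be illustrated with a small polling sketch. This is not how --monitorLibrary is implemented; `LibraryWatcher`, `poll` and the `load` callback are hypothetical names, and the only assumption is that a missing or unreadable library file makes `os.stat` raise `OSError`.

```python
import os


class LibraryWatcher:
    """Hypothetical poller for a library file on a mount point that may vanish."""

    def __init__(self, path, load):
        self.path = path
        self.load = load    # assumed callable: path -> list of books
        self.books = []
        self.mtime = None   # None means "library currently unavailable"

    def poll(self):
        """Check the library file once; return 'unchanged', 'emptied' or 'reloaded'."""
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except OSError:
            # The file (or the whole filesystem) is gone: serve an empty library.
            if self.mtime is None:
                return "unchanged"
            self.books, self.mtime = [], None
            return "emptied"
        if mtime == self.mtime:
            return "unchanged"
        # The file is new, modified, or back after an outage: re-read it.
        self.books = self.load(self.path)
        self.mtime = mtime
        return "reloaded"
```

The stored modification time is also the value that could be shown as the "library datetime" in the home UI.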

rgaudin, Jan 31 '23 10:01

Recovering from such a system error is pretty complex. All file descriptors are by definition invalidated and return I/O errors. Plugging the USB drive back in will not magically revalidate them; kiwix-serve (and any other application) will have to close the file and reopen it.

  • The first step would be to correctly detect the io error
  • In case of io error, return a 500 and remove the book from the internal library cache (as if the book was never opened)
  • On request, if the file is not in the cache, we try to open it anyway, so if the usb drive is plugged back we should correctly open the file.
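
The three steps above can be sketched as follows. This is a minimal illustration, not kiwix-serve or libkiwix code: `BookCache`, `serve` and `evict` are hypothetical names, and plain file reads stand in for ZIM access. The library mapping (`paths`) is never modified; only the cache of open handles is.

```python
class BookCache:
    """Hypothetical cache of open book files, keyed by book id."""

    def __init__(self, paths):
        self.paths = paths    # book id -> path on disk (the library, left untouched)
        self.handles = {}     # book id -> open file object (the cache)

    def _open(self, book_id):
        # Open on demand if the handle was evicted or never opened.
        if book_id not in self.handles:
            self.handles[book_id] = open(self.paths[book_id], "rb")
        return self.handles[book_id]

    def evict(self, book_id):
        """Forget the cached handle, as if the book was never opened."""
        handle = self.handles.pop(book_id, None)
        if handle is not None:
            try:
                handle.close()
            except OSError:
                pass  # the descriptor is already unusable

    def serve(self, book_id, offset, size):
        """Return (status, body) for a read request."""
        if book_id not in self.paths:
            return 404, b""   # the book is not in the library at all
        try:
            handle = self._open(book_id)
            handle.seek(offset)
            return 200, handle.read(size)
        except OSError:
            # Step 1: the I/O error is detected. Step 2: answer 500 and evict.
            self.evict(book_id)
            return 500, b""
```

Because the next request goes through `_open` again (step 3), the book becomes readable as soon as the drive is plugged back in, without restarting the server.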

I don't think we should remove the book from the library at all. It is not up to kiwix-serve to modify the input library. And we already have filtering at kiwix-serve startup to filter out books with an invalid path.

mgautierfr, Mar 08 '23 15:03