uhyve

Prevent hypercall handling from panicking

Open jounathaen opened this issue 1 year ago • 4 comments

A panic during hypercall handling causes uhyve to hang indefinitely. Only the affected CPU thread panics, while all other threads keep waiting for the barrier in https://github.com/hermit-os/uhyve/blob/360b613ee7498d0a1b651399f27fb6f5460497ad/src/linux/mod.rs#L137 to pass. That never happens, because the thread that could activate it has panicked. We should remove any unwrap() or expect() in the code that is called by run.
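As a rough illustration of what removing an unwrap() could look like, here is a minimal sketch that decodes a 4-byte value from a guest-supplied buffer and returns an error instead of panicking. The names `HypercallError` and `exit_code_checked` and the buffer layout are assumptions made for this example; they are not taken from uhyve's code.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; uhyve's real error types may differ.
#[derive(Debug, PartialEq)]
enum HypercallError {
    /// The guest passed a buffer with an unexpected length.
    BadLength { expected: usize, got: usize },
}

impl fmt::Display for HypercallError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            HypercallError::BadLength { expected, got } => {
                write!(f, "expected {} bytes, got {}", expected, got)
            }
        }
    }
}

impl std::error::Error for HypercallError {}

/// Decodes a little-endian i32 from guest data (assumed layout).
/// A buffer of the wrong length yields an `Err` rather than a panic,
/// so the calling vCPU thread stays alive and can decide what to do.
fn exit_code_checked(data: &[u8]) -> Result<i32, HypercallError> {
    match data {
        [a, b, c, d] => Ok(i32::from_le_bytes([*a, *b, *c, *d])),
        _ => Err(HypercallError::BadLength {
            expected: 4,
            got: data.len(),
        }),
    }
}
```

A caller inside the run loop could then propagate the error with `?` or turn it into an orderly shutdown, instead of unwinding a single thread while the others wait at the barrier.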

jounathaen, Jan 31 '24 16:01

Hello!

My partners and I are a team of three students currently studying Virtualization at the University of Texas at Austin. Part of our coursework is contributing to open-source repos like this. Is this issue still open and available for contribution? If so, we would love the chance to work on it, because it seems very interesting to us! If approved, we may contribute to some other issues as well.

Thank you so much!

charlottestinson, Apr 05 '24 18:04

Sure, this would be much appreciated. If you need any help/advice, just post here. :slightly_smiling_face:

jounathaen, Apr 07 '24 09:04

> We should remove any unwrap() or expect() in the code that is called by run.

While that would be nice, I don't think we can or should comprehensively remove all possible panics from the vCPUs. To resolve this issue, I think it would be better to catch_unwind at a high level and then exit the application. Better error handling is always good, of course, but in my opinion that does not always mean less panicking.
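A minimal sketch of that idea, assuming each vCPU thread runs a closure: `std::panic::catch_unwind` wraps the closure, and the thread's entry point exits the whole process on a panic, so no other thread is left waiting at the barrier. `run_guarded` and `vcpu_thread_main` are hypothetical names, not uhyve functions.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Runs one vCPU loop and converts a panic into an `Err` carrying the panic message.
fn run_guarded<F: FnOnce()>(vcpu_loop: F) -> Result<(), String> {
    // AssertUnwindSafe: this sketch assumes no state is reused after a panic,
    // because the caller shuts the process down.
    panic::catch_unwind(AssertUnwindSafe(vcpu_loop)).map_err(|payload| {
        if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

/// Hypothetical thread entry point: on a panic, terminate the whole
/// application instead of letting the remaining threads wait forever.
fn vcpu_thread_main<F: FnOnce()>(vcpu_loop: F) {
    if let Err(msg) = run_guarded(vcpu_loop) {
        eprintln!("vCPU thread panicked: {}", msg);
        std::process::exit(1);
    }
}
```

Two caveats: catch_unwind only catches unwinding panics, so it has no effect when the binary is built with panic = "abort", and std::process::exit does not run destructors on other threads.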

mkroening, Apr 07 '24 09:04

True, but I think (without having looked into it more deeply) we actually have three cases here:

  • Simple-to-fix unwraps that can be mitigated
  • Errors in the hypervisor, which would probably be best handled by catch_unwind
  • Errors resulting from erroneous hypercalls, which should be reported back to the kernel and handled there (this likely requires changes in uhyve-interface and the kernel)

So I think if any of these cases is at least partially tackled by @charlottestinson, it would be an improvement.
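For the third case, one possible shape is a result slot in the hypercall's parameter block that the guest kernel checks after the call returns. The sketch below is purely illustrative: `ReadParams`, its fields, and `handle_read` are assumptions and do not reflect the actual uhyve-interface layout.

```rust
/// Hypothetical parameter block shared with the guest;
/// not the real uhyve-interface layout.
#[derive(Debug, Default)]
struct ReadParams {
    fd: i32,
    len: usize,
    /// Result slot the guest inspects after the hypercall:
    /// bytes read on success, or a negative errno-style code.
    ret: isize,
}

/// Linux errno value for "bad file descriptor".
const EBADF: isize = 9;

/// Handles a read hypercall without panicking on bad guest input.
/// `open_fds` stands in for whatever file table the hypervisor keeps.
fn handle_read(params: &mut ReadParams, open_fds: &[i32]) {
    if !open_fds.contains(&params.fd) {
        // Erroneous hypercall: report the error to the guest
        // instead of unwrapping and panicking the vCPU thread.
        params.ret = -EBADF;
        return;
    }
    // Placeholder for the real I/O: pretend the full read succeeded.
    params.ret = params.len as isize;
}
```

With a convention like this, an invalid file descriptor becomes an error the kernel can handle, and the vCPU thread keeps running.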

jounathaen, Apr 07 '24 13:04