vhost-device icon indicating copy to clipboard operation
vhost-device copied to clipboard

Added snapshot feature for vhost-device-vsock.

Open WeiChungHsu opened this issue 9 months ago • 7 comments

Added snapshot feature for vhost-device-vsock. There is a hang issue during snapshot disable vring, my solution is to check vring enable/disable status but it requires a new function get_enabled from vring. The vring new function get_enabled PR is here: https://github.com/rust-vmm/vhost/pull/284

More detailed explain the hang issue: When the Vring is disabled (and queue set to not ready), the vhu_vsock_thread.rs still try to iterate and read RX queue, therefore, it hang at this line: https://github.com/rust-vmm/vhost-device/blob/main/vhost-device-vsock/src/vhu_vsock_thread.rs#L591-L594

Furthermore, it hang because this Queue iter() operation return error at this line: https://github.com/rust-vmm/vm-virtio/blob/main/virtio-queue/src/queue.rs#L592-L597

Requirements

Before submitting your PR, please make sure you addressed the following requirements:

  • [ ] All commits in this PR have Signed-Off-By trailers (with git commit -s), and the commit message has max 60 characters for the summary and max 75 characters for each description line.
  • [ ] All added/changed functionality has a corresponding unit/integration test.
  • [ ] All added/changed public-facing functionality has entries in the "Upcoming Release" section of CHANGELOG.md (if no such section exists, please create one).
  • [ ] Any newly added unsafe code is properly documented.

WeiChungHsu avatar Mar 17 '25 17:03 WeiChungHsu

@WeiChungHsu CI is failing, please fix it.

stefano-garzarella avatar Mar 18 '25 09:03 stefano-garzarella

The vring new function get_enabled PR is here: https://github.com/rust-vmm/vhost/pull/284

We can't merge code that depends on ongoing work, so I mark this as draft. Please also temporary change the vhost crate dep to a git repo that contains the change, so the CI can test this code. When the changes to vhost will be merged, we can remove that and mark this as ready.

stefano-garzarella avatar Mar 18 '25 09:03 stefano-garzarella

@WeiChungHsu CI is failing, please fix it.

Sorry, I am fixing and working on it now.

WeiChungHsu avatar Mar 20 '25 20:03 WeiChungHsu

Also, the commit message is all on one line. Please put a short descriptive commit title and use the commit message body for longer text.

Fixed the commit message with sign-off. Thanks!

WeiChungHsu avatar Mar 24 '25 20:03 WeiChungHsu

I fixed the most of CI failures, but the this failure is based on another PR in vhost crate: "error[E0599]: no method named get_enabled found". Therefore, I have to wait that PR merged first. The pending PR in vhost crate is: https://github.com/rust-vmm/vhost/pull/291

WeiChungHsu avatar Mar 25 '25 19:03 WeiChungHsu

I fixed the most of CI failures, but the this failure is based on another PR in vhost crate: "error[E0599]: no method named get_enabled found". Therefore, I have to wait that PR merged first. The pending PR in vhost crate is: rust-vmm/vhost#284

As I mentioned here https://github.com/rust-vmm/vhost-device/pull/827#issuecomment-2732414521 can you change Cargo.toml to point to your branch used for https://github.com/rust-vmm/vhost/pull/284 ?

In this way we can test also this PR. When that PR will be merged and a new vhost-user-backend released, we can remove that change and mark this PR ready to be merged.

stefano-garzarella avatar Apr 01 '25 15:04 stefano-garzarella

@WeiChungHsu also, instead of clicking on "Update branch", you can go to the arrow and do "Update with rebase" (maybe I should see if I can enable it by default)

stefano-garzarella avatar Apr 01 '25 15:04 stefano-garzarella