Allow snapshot tap changes

Open andrewla opened this issue 1 year ago • 6 comments

Changes

Allow renaming of tap devices on snapshot restore

Reason

In some scenarios it is not possible to use the jailer, especially in limited-privilege environments where security is enforced outside of Firecracker itself. In these cases, a snapshot may need to be restored with a different tap device than the one it was using when the snapshot was taken.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.

PR Checklist

  • [X] If a specific issue led to this PR, this PR closes the issue.
  • [X] The description of changes is clear and encompassing.
  • [x] Any required documentation changes (code and docs) are included in this PR.
  • [X] API changes follow the Runbook for Firecracker API changes.
  • [x] User-facing changes are mentioned in CHANGELOG.md.
  • [X] All added/changed functionality is tested.
  • [X] New TODOs link to an issue.
  • [X] Commits meet contribution quality standards.

  • [X] This functionality cannot be added in rust-vmm.

andrewla avatar Aug 15 '24 21:08 andrewla

Hi @andrewla, thank you for your contribution! We would like to understand the use case better, in case it can be resolved through other means first. We recommend using a network namespace where you can create TAP devices with the same name, but that probably requires CAP_SYS_ADMIN, which I understand is what you mean by "limited privilege environments".

Could you elaborate on your use case? Is there a way you could create the namespace in a privileged setting and then use something like nsenter firecracker ...?

pb8o avatar Aug 19 '24 10:08 pb8o

That assessment is correct -- basically to run the jailer in a network namespace you need the setns syscall which requires CAP_SYS_ADMIN. So nsenter is not an option.

Our particular case is running in a containerized environment where our privileges are limited by the nature of the general environment. Once we're in our particular container we have lost all relevant privileges.

andrewla avatar Aug 19 '24 20:08 andrewla

Hi again @andrewla, we have been talking internally about this PR and we may need to spend some time to decide on the API aspects of it to make sure it doesn't conflict with other efforts.

In the meantime, we thought of another workaround. The snapshot-editor could be enhanced to rename the tap devices in a snapshot file. That would be an easier decision for us, but we want to make sure it would handle your use case.

For example we imagine the tool would work like this:

snapshot-editor edit-vmstate rename-network eth0 tap1

Would this work within your environment?

pb8o avatar Aug 29 '24 14:08 pb8o

This was our initial approach, as it required minimal changes. But we found that the performance cost of copying the snapshot file (as opposed to hardlinking it) during the operation, plus the serde costs, was higher than we were willing to tolerate in our environment.
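To illustrate the trade-off mentioned above, here is a minimal sketch (not code from this PR) contrasting the two ways of handing a snapshot file to a clone. The file names and the `stage_snapshot` helper are hypothetical. The point is only that a file rewritten by an editing tool has to be a private copy, while an unmodified file can be shared through a hard link:

```python
import os
import shutil
import tempfile


def stage_snapshot(src: str, dst: str, needs_edit: bool) -> None:
    """Make the snapshot file `src` available at `dst` for one clone (hypothetical helper)."""
    if needs_edit:
        # An edited file must be private to the clone, so every byte is copied.
        shutil.copyfile(src, dst)
    else:
        # An unmodified file can be shared: a hard link only adds a directory entry.
        os.link(src, dst)


def link_vs_copy_demo() -> tuple[bool, bool]:
    """Return (link shares an inode with the source, copy shares an inode with the source)."""
    with tempfile.TemporaryDirectory() as workdir:
        base = os.path.join(workdir, "vmstate")  # hypothetical file name
        with open(base, "wb") as f:
            f.write(b"\0" * 4096)  # stand-in for real snapshot contents
        linked = os.path.join(workdir, "clone-a.vmstate")
        copied = os.path.join(workdir, "clone-b.vmstate")
        stage_snapshot(base, linked, needs_edit=False)
        stage_snapshot(base, copied, needs_edit=True)
        return os.path.samefile(base, linked), os.path.samefile(base, copied)


if __name__ == "__main__":
    print(link_vs_copy_demo())
```

The copy path scales with the size of the file, and an editor would additionally have to deserialize and reserialize the state, which is the serde cost referred to above.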

andrewla avatar Sep 03 '24 17:09 andrewla

Hi @pb8o -- is there anything we can do to help move this forward?

andrewla avatar Oct 10 '24 19:10 andrewla

Hi @andrewla I haven't had time to look at this, but this is next on my list now. Thanks for your patience!

pb8o avatar Oct 16 '24 09:10 pb8o

On a related note, another reason why renaming the tap device is a better approach than namespaced NAT from the "Network for Clones" guide is that the namespaced NAT imposes measurable overhead onto the host kernel due to the addition of about 5 more iptables/nft rules, plus an RTNETLINK route for forwarding the guest IP out of the netns.

Even though I made an effort to support namespaced NAT in fcnet, it increased complexity by a factor of 4-5 compared to regular NAT, only to support one use case: two simultaneous microVM clones. So I'd be in favor of this change, or a snapshot-editor equivalent.

kanpov avatar Nov 01 '24 19:11 kanpov

Hello @andrewla! I apologize for the long time between updates; some other work came up. We have decided to go ahead with this. I did an initial review and only have some minor comments; it mostly looks good to me. I just have one question: does the network_overrides field also work when starting from a JSON config file?

pb8o avatar Jan 08 '25 17:01 pb8o

Re: config -- currently there is no config support for snapshots (https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/resources.rs) -- snapshot configuration and restore have to be done through a running firecracker instance
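As a rough sketch of what that looks like against a running instance, the snippet below builds a JSON body for the snapshot load API request (`PUT /snapshot/load` on the API socket). `snapshot_path` and `mem_backend` are existing fields of that request and `network_overrides` is the field named in this thread. The per-entry keys `iface_id` and `host_dev_name` are an assumption modeled on the network interface config and should be checked against the API specification. The helper name and the file paths are hypothetical.

```python
import json


def build_snapshot_load_body(snapshot_path, mem_file_path, tap_overrides):
    """Build a snapshot load request body.

    `tap_overrides` maps a guest interface ID to the host tap device
    that the restored microVM should use instead of the saved one.
    """
    return {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_type": "File", "backend_path": mem_file_path},
        # Assumed entry shape: one object per interface to be re-pointed.
        "network_overrides": [
            {"iface_id": iface_id, "host_dev_name": tap_name}
            for iface_id, tap_name in tap_overrides.items()
        ],
    }


if __name__ == "__main__":
    body = build_snapshot_load_body("./vmstate", "./mem", {"eth0": "tap1"})
    print(json.dumps(body, indent=2))
```

The snapshot files themselves stay untouched in this approach, so they can still be shared between clones.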

andrewla avatar Jan 09 '25 17:01 andrewla

Codecov Report

Attention: Patch coverage is 21.42857% with 11 lines in your changes missing coverage. Please review.

Project coverage is 83.14%. Comparing base (4e9b215) to head (adf9d4a). Report is 3 commits behind head on main.

Files with missing lines    Patch %    Lines
src/vmm/src/persist.rs      15.38%     11 Missing
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4731      +/-   ##
==========================================
- Coverage   83.18%   83.14%   -0.04%     
==========================================
  Files         248      248              
  Lines       26910    26923      +13     
==========================================
+ Hits        22384    22386       +2     
- Misses       4526     4537      +11     
Flag              Coverage Δ
5.10-c5n.metal    83.53% <21.42%> (-0.04%)
5.10-m5n.metal    83.51% <21.42%> (-0.04%)
5.10-m6a.metal    82.71% <21.42%> (-0.04%)
5.10-m6g.metal    79.56% <21.42%> (-0.04%)
5.10-m6i.metal    83.51% <21.42%> (-0.04%)
5.10-m7g.metal    79.56% <21.42%> (-0.04%)
6.1-c5n.metal     83.58% <21.42%> (-0.03%)
6.1-m5n.metal     83.56% <21.42%> (-0.03%)
6.1-m6a.metal     82.75% <21.42%> (-0.04%)
6.1-m6g.metal     79.56% <21.42%> (-0.04%)
6.1-m6i.metal     83.55% <21.42%> (-0.05%)
6.1-m7g.metal     79.56% <21.42%> (-0.04%)

codecov[bot] avatar Jan 15 '25 09:01 codecov[bot]

It turns out that the test for renaming devices was failing when run with other tests that used network devices. After some experimentation, it seems that we are not cleaning up network devices from other tests, and modifying a network device results in an incompatible network configuration, rendering the VM unreachable.

For now I've patched this by having the new test use an unallocated network device, but I'm not sure if we're comfortable with this or if we want to try to figure out why the test passes when run alone but not when run in tandem with other tests.

andrewla avatar Jan 21 '25 22:01 andrewla

I have applied the changes suggested by @pb8o. I also squashed all test commits into a single commit and moved some code into the appropriate commits.

bchalios avatar Mar 13 '25 11:03 bchalios