Checkpoint & Restore between machines with different CPU features

Open tianyuzhou95 opened this issue 9 months ago • 2 comments

Description

Currently, a checkpoint image created on a physical machine with a newer CPU may encounter restoration failures due to missing CPU flags when restored on a physical machine with an older CPU[1].

This complicates using Checkpoint/Restore to accelerate container startup (one image, multiple containers). We either have to create the checkpoint image on a machine (or VM) whose feature set is the largest subset common to all target machines, or create a separate checkpoint image for each machine type and distribute the images accordingly.

Modal has encountered similar issues, which has led them to create multiple images[2]:

"Additionally, you may observe in application logs your Function being memory snapshotted multiple times during its first few invocations. This happens because memory snapshots are compatible with the underlying worker type that created them, and Modal Functions run across a handful of worker types."

  1. https://github.com/google/gvisor/blob/release-20250217.0/pkg/sentry/kernel/kernel.go#L800
  2. https://modal.com/docs/guide/memory-snapshot

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

Thanks to gVisor's cpuid emulation, we can control the CPU features exposed to the user application (i.e., restrict them to the largest feature subset common to all CPUs in the cluster), which allows us to create only one checkpoint image. This has been widely used internally, and we hope to merge this feature into the mainline.

Currently, we use an annotation dev.gvisor.internal.cpufeatures inside config.json to pass the CPU features exposed to the user application. We would also welcome input from the gVisor community on whether a more general approach would be preferable.
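
For illustration only, here is a minimal Go sketch of how such an annotation might be written into the annotations section of config.json. Only the key dev.gvisor.internal.cpufeatures comes from this issue. The comma-separated value encoding, the feature names, and the helper withCPUFeatures are assumptions made for the example and may not match what the PR actually accepts.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// cpuFeaturesAnnotation is the annotation key quoted in this issue.
const cpuFeaturesAnnotation = "dev.gvisor.internal.cpufeatures"

// ociConfig models only the part of config.json that this sketch touches.
type ociConfig struct {
	Annotations map[string]string `json:"annotations"`
}

// withCPUFeatures (hypothetical helper) builds a config fragment carrying
// the feature list. ASSUMPTION: the value is a comma-separated list of
// feature names; the real encoding is not stated in the issue.
func withCPUFeatures(features []string) ociConfig {
	return ociConfig{Annotations: map[string]string{
		cpuFeaturesAnnotation: strings.Join(features, ","),
	}}
}

func main() {
	// Feature names here are placeholders chosen for the example.
	cfg := withCPUFeatures([]string{"sse4_2", "avx", "avx2"})
	out, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In a real deployment the annotation would be merged into the existing config.json of the bundle rather than generated as a standalone fragment.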

tianyuzhou95 avatar Feb 21 '25 05:02 tianyuzhou95

Yeah, we need to add support for CPU feature leveling, i.e. allow users to specify the CPU feature set the application can use. If runsc users want to checkpoint/restore across hosts, then they would define that as the lowest common denominator between the two CPUs.

PRs are much appreciated! What you have described seems acceptable. The annotation approach is good.
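
The "lowest common denominator" mentioned above is the intersection of the feature sets of every host involved. A minimal sketch in Go, assuming each host's features are already available as a list of names (the host names, feature names, and the function commonFeatures are hypothetical and not part of gVisor):

```go
package main

import (
	"fmt"
	"sort"
)

// commonFeatures returns, in sorted order, the features present on every
// host. It is a hypothetical helper for illustration, not a gVisor API.
func commonFeatures(hosts map[string][]string) []string {
	counts := make(map[string]int)
	for _, feats := range hosts {
		// Count each feature at most once per host, even if listed twice.
		seen := make(map[string]bool)
		for _, f := range feats {
			if !seen[f] {
				seen[f] = true
				counts[f]++
			}
		}
	}
	var common []string
	for f, n := range counts {
		if n == len(hosts) {
			common = append(common, f)
		}
	}
	sort.Strings(common)
	return common
}

func main() {
	// Placeholder inventory: a newer host with one extra feature.
	hosts := map[string][]string{
		"newer-host": {"sse4_2", "avx", "avx2", "avx512f"},
		"older-host": {"sse4_2", "avx", "avx2"},
	}
	fmt.Println(commonFeatures(hosts))
}
```

A checkpoint taken with only the resulting set exposed should not depend on any feature that the older host lacks, which is the point of leveling.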

ayushr2 avatar Feb 25 '25 17:02 ayushr2

PR https://github.com/google/gvisor/pull/11498 has been submitted; please review it when you have time :)

tianyuzhou95 avatar Feb 28 '25 08:02 tianyuzhou95

@tianyuzhou95 is this good to close?

ayushr2 avatar Oct 08 '25 19:10 ayushr2

@tianyuzhou95 is this good to close?

Sure :)

tianyuzhou95 avatar Oct 08 '25 23:10 tianyuzhou95