Restoring PACG on older ARM64 CPU hangs
I see occasional failures restoring a set of processes in podman.
The symptom is a timeout. Some debugging shows that it is hanging here:
(gdb) bt
#0 0x0000ffff95607be4 in syscall () from /lib/aarch64-linux-gnu/libc.so.6
#1 0x0000aaaae8a1fc64 in sys_futex (addr2=0x0, val3=0, timeout=0xffffc4786258, val1=<optimized out>, op=0,
addr1=0xffff9595b00c) at include/common/lock.h:29
#2 __restore_wait_inprogress_tasks (participants=participants@entry=0) at criu/cr-restore.c:182
#3 0x0000aaaae8a21078 in restore_wait_inprogress_tasks () at criu/cr-restore.c:194
#4 restore_switch_stage (next_stage=5) at criu/cr-restore.c:224
#5 restore_root_task (init=<optimized out>) at criu/cr-restore.c:2213
#6 0x0000aaaae8a220fc in cr_restore_tasks () at criu/cr-restore.c:2417
#7 0x0000aaaae8a27554 in restore_using_req (req=<optimized out>, sk=3) at criu/cr-service.c:889
#8 cr_service_work (sk=3) at criu/cr-service.c:1365
#9 0x0000aaaae89f5f3c in main (argc=3, argv=0xffffc4786758, envp=<optimized out>) at criu/crtools.c:191
(gdb) up
#1 0x0000aaaae8a1fc64 in sys_futex (addr2=0x0, val3=0, timeout=0xffffc4786258, val1=<optimized out>, op=0,
addr1=0xffff9595b00c) at include/common/lock.h:29
29 include/common/lock.h: No such file or directory.
(gdb)
#2 __restore_wait_inprogress_tasks (participants=participants@entry=0) at criu/cr-restore.c:182
182 criu/cr-restore.c: No such file or directory.
(gdb) p task_entries->nr_in_progress
Cannot access memory at address 0xaaaae8b5d1b0
(gdb) p &task_entries->nr_in_progress
Cannot access memory at address 0xaaaae8b5d1b0
The last lines in restore.log are:
(05.342893) pie: 134: restoring lsm profile (current) changeprofile containers-default-engflow
(05.343043) pie: 132: seccomp: Restored mode 2 on tid 132
(05.343086) pie: 132: restoring lsm profile (current) changeprofile containers-default-engflow
(I changed the profile name from its default.)
This happens occasionally on AWS ARM64 machines. We run several machine types; the machine with the above hang was a c6gd.2xlarge. Its cpuinfo:
processor : 0
BogoMIPS : 243.75
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x3
CPU part : 0xd0c
CPU revision : 1
The problem is machine-specific: the exact same snapshot restores correctly on a different machine, but the hang reproduces on the affected machine.
I am using a locally modified version of
commit c61329b30387aa50634e794a4781dde64cb2a6c3
Author: Radostin Stoyanov <[email protected]>
Date: Sun May 11 11:33:29 2025 +0100
seize: fix pause devices for frozen containers
(The modification is a minor tweak to symlink the lazy pages socket and should be unrelated.) The same version has been working reliably on x64.
It seems related to the machine type: c6gd ("AWS Graviton2") is broken, while c7gd ("AWS Graviton3") seems to work.
We have another similar issue: https://github.com/checkpoint-restore/criu/issues/2720
@hanwen-flow could you attach the full log? If you see restore_wait_inprogress_tasks in the backtrace, it means one of the restored tasks hasn't completed the restore process. Could you try to look at child processes?
We have another similar issue: #2720
Actually, issue #2720 is not like this one. However, since commit c61329b30387aa50634e794a4781dde64cb2a6c3, there have been a few ARM fixes that might be related to this issue:
64276874d89825452baee6c756046e1277a41c48 restore: flush caches during restore
95d5e2e59b1b83ba5400e7eea6db57f77424fb80 compel: flush caches after parasite injection
dcee5bd6ff2d632bd4e1d4d09d2ffb2bf683d6a2 make: Disable branch-protection for PIE code on ARM64
I looked at the changes, but they looked like they had different symptoms. But yes, I can upgrade and see if it helps.
Could you try to look at child processes?
What should I be looking for?
(05.176328) Error (criu/arch/aarch64/crtools.c:285): PACG support is required from the source system.
The issue is that a dumped process used the Pointer Authentication Code (PAC) CPU extension (specifically PACG), which was enabled on the source platform. The target platform lacks support for this extension. This should be a fatal error, but CRIU did not abort the restore process.
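For illustration, this kind of mismatch can be spotted before a restore by comparing the Features line of /proc/cpuinfo on the source and target machines. The following is a minimal Python sketch, not CRIU's own check: the function names are hypothetical, the target line is the c6gd.2xlarge one quoted earlier in this issue, and the source line is an assumed example that simply adds paca and pacg.

```python
def parse_features(cpuinfo_text):
    """Return the set of flags from the first 'Features' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def missing_on_target(source_cpuinfo, target_cpuinfo):
    """Flags advertised by the source CPU that the target CPU lacks."""
    return parse_features(source_cpuinfo) - parse_features(target_cpuinfo)


def check_restorable(source_cpuinfo, target_cpuinfo):
    """Raise instead of proceeding when the target lacks source features."""
    missing = missing_on_target(source_cpuinfo, target_cpuinfo)
    if missing:
        raise RuntimeError("target CPU lacks: " + " ".join(sorted(missing)))


# Target: the Features line from the c6gd.2xlarge reported in this issue.
TARGET = ("Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics "
          "fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp")
# Source: ASSUMED example (same flags plus paca/pacg), not real data.
SOURCE = TARGET + " paca pacg"
```

Calling check_restorable(SOURCE, TARGET) raises a RuntimeError naming paca and pacg, which is the fail-fast behaviour one would want in place of a hang. Note that this compares everything the source CPU advertises, not what the dumped process actually used, so it is stricter than necessary.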
Interesting. So I guess we created the snapshot on Graviton3, and restoring on Graviton2 failed. That makes sense, because we saw other cases where Graviton2 worked (those must have been snapshots created on the same platform).
Can similar problems occur on x86?
The binaries involved are identical across platforms, so this is a runtime decision. Do you know of a generic way to restrict features that affect CRIU operation?
Can similar problems occur on x86?
@hanwen-flow PAC is an AArch64 architecture feature. The error "PACG support is required" was introduced with https://github.com/checkpoint-restore/criu/pull/2609 and indicates that PAC was used during checkpointing.
The binaries involved are identical across platforms, so this is a runtime decision. Do you know of a generic way to restrict features that affect CRIU operation?
There are some compiler options that can be used to disable branch protection: https://developer.arm.com/documentation/109576/0100/Tools-and-software-support/Compiler-options
Can similar problems occur on x86?
I would say yes, if you have code that detects certain features at runtime and then uses instructions that the destination CPU does not have. You cannot really migrate to an older CPU on any architecture if features the code uses are missing. The same problem also exists with VMs: if you limit your VM to not use all features of the host CPU, it can be migrated to older CPUs. I am not sure how to disable newer CPU features in a process. Depending on the application, there might be a setting to not use all of the latest CPU features.
Can similar problems occur on x86? The binaries involved are identical across platforms, so this is a runtime decision. Do you know of a generic way to restrict features that affect CRIU operation?
@hanwen-flow Yes, this can happen on x86, but it has become less critical in the last few years because no new features of this type have been added. We are very close to the moment when the shadow stack will be enabled by default, and this question will be raised again on x86 as well.
As for solutions, we've discussed this problem many times, and some of our users have out-of-tree solutions. OpenVZ had custom changes in their kernel. Google solved this problem in their libraries. However, no one has yet suggested a viable upstream solution. We always considered filtering CPUID and adjusting all related kernel mechanisms, but that approach looks too intrusive.
Yesterday, I started thinking that we could introduce the ability to mask some features from the AT_HWCAP vectors. This is a much simpler feature and should work for most users. Here is my draft implementation: https://github.com/avagin/linux-task-diag/commit/ca32ef4c5edee82f4f06f98d6760d1a58c0af345
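To make the AT_HWCAP idea concrete, here is a small Python sketch of the bit arithmetic only. It is not the draft kernel patch linked above and says nothing about its interface; the helper names are hypothetical and the example host value is made up. The two constants are the arm64 HWCAP_PACA and HWCAP_PACG bit values from the kernel's uapi hwcap.h.

```python
# arm64 AT_HWCAP bits as defined in the kernel's uapi asm/hwcap.h.
HWCAP_PACA = 1 << 30
HWCAP_PACG = 1 << 31


def masked_hwcap(host_hwcap, hidden):
    """HWCAP value a process would see if the bits in `hidden` were masked out."""
    return host_hwcap & ~hidden


def missing_bits(required, available):
    """Bits a checkpointed process relied on that the restore host lacks."""
    return required & ~available


# ASSUMED example values: a newer host with PAC, an older host without it.
newer_host = 0b11 | HWCAP_PACA | HWCAP_PACG
older_host = 0b11

# With PAC hidden, the process only ever sees bits the older host also has.
seen = masked_hwcap(newer_host, HWCAP_PACA | HWCAP_PACG)
```

Here missing_bits(seen, older_host) is 0, so a process started under the mask would have nothing to lose on the older host, while missing_bits(newer_host, older_host) still reports both PAC bits. This only helps code that consults AT_HWCAP before using a feature.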