Checkpoint of chrome fails: pagemap-cache: Can't read 9397's pagemap file: No such file or directory
Description
Taking a checkpoint of Chrome fails with the following error:
(01.256562) pagemap-cache: 9397: filling VMA 738000c4000-83d00000000 (1094712560K) [l:73800000000 h:73800200000]
(02.325156) Error (criu/pagemap-cache.c:209): pagemap-cache: Can't read 9397's pagemap file: No such file or directory
(02.325194) Error (criu/pagemap-cache.c:225): pagemap-cache: Failed to fill cache for 9397 (738000c4000-83d00000000)
(02.325245) page-pipe: Killing page pipe
(02.416491) ----------------------------------------
(02.416520) Error (criu/mem.c:672): Can't dump page with parasite
In the line `filling VMA 738000c4000-83d00000000 (1094712560K)`, the size 1094712560K (roughly 1 TiB) sounds suspiciously large.
Steps to reproduce the issue:
The dump happens inside a Kubernetes pod and the image is proprietary. I can create a new image if the problem turns out not to be straightforward and requires a full reproduction to debug.
I used the following command to dump:
$ criu dump -t 9289 -D /checkpoint --tcp-established --tcp-close -v4 --log-file dump.log --root / --manage-cgroups=ignore --ghost-limit=500M
The process tree was like the following:
$ ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1233680 8040 ? Ssl 20:32 0:00 procman run --image-dir /checkpoint --port 27898 -- /opt/scripts/entrypoint.sh start
root 9289 0.0 0.0 4784 3440 ? Ss 20:32 0:00 /bin/bash /opt/scripts/entrypoint.sh start
root 9295 0.3 0.1 53700 34560 ? S 20:32 0:00 /usr/bin/python3 /usr/bin/websockify --web /usr/share/novnc 5800 localhost:5900
root 9297 0.5 0.2 235212 84596 ? S 20:32 0:00 Xtigervnc -geometry 1288x804x24 -SendPrimary=0 -SecurityTypes None -rfbport 5900 -alwaysshared -useblacklist=0 -Log
root 9298 41.0 1.8 1262088 612648 ? Sl 20:32 0:13 node ./src/index.js
root 9311 0.2 0.0 54196 27188 ? S 20:32 0:00 /usr/bin/python3 /usr/bin/websockify --web /usr/share/novnc 5800 localhost:5900
root 9312 8.7 0.7 34369604 261672 ? S<sl 20:32 0:02 /opt/google/chrome/chrome --disable-field-trial-config --disable-background-networking --enable-features=NetworkSer
root 9314 0.0 0.0 33575872 3448 ? Sl 20:32 0:00 /opt/google/chrome/chrome_crashpad_handler --monitor-self --monitor-self-annotation=ptype=crashpad-handler --databa
root 9316 0.0 0.0 33567660 1708 ? Sl 20:32 0:00 /opt/google/chrome/chrome_crashpad_handler --no-periodic-tasks --monitor-self-annotation=ptype=crashpad-handler --d
root 9321 0.0 0.1 33916516 55164 ? S 20:32 0:00 /opt/google/chrome/chrome --type=zygote --no-zygote-sandbox --no-sandbox --crashpad-handler-pid=9314 --enable-crash
root 9322 0.0 0.1 33916516 55944 ? S 20:32 0:00 /opt/google/chrome/chrome --type=zygote --no-sandbox --crashpad-handler-pid=9314 --enable-crash-reporter=, --user-d
root 9341 2.0 0.4 34241296 134356 ? Sl 20:32 0:00 /opt/google/chrome/chrome --type=gpu-process --no-sandbox --disable-dev-shm-usage --disable-breakpad --crashpad-han
root 9343 3.0 0.3 33938240 112528 ? Sl 20:32 0:00 /opt/google/chrome/chrome --type=utility --utility-sub-type=network.mojom.NetworkService --lang=en-US --service-san
root 9344 0.0 0.1 33966992 47864 ? Sl 20:32 0:00 /opt/google/chrome/chrome --type=utility --utility-sub-type=storage.mojom.StorageService --lang=en-US --service-san
root 9397 21.4 0.9 1186275764 327236 ? R<l 20:32 0:05 /opt/google/chrome/chrome --type=renderer --crashpad-handler-pid=9314 --enable-crash-reporter=, --user-data-dir=/tm
root 9420 8.2 0.7 1186268920 232180 ? S<l 20:32 0:01 /opt/google/chrome/chrome --type=renderer --crashpad-handler-pid=9314 --enable-crash-reporter=, --user-data-dir=/tm
root 9431 2.5 0.5 1186242008 179920 ? S<l 20:32 0:00 /opt/google/chrome/chrome --type=renderer --crashpad-handler-pid=9314 --enable-crash-reporter=, --user-data-dir=/tm
root 9447 0.1 0.2 33891548 68764 ? Sl 20:32 0:00 /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.mojom.AudioService --lang=en-US --service-sandbox
root 9455 0.0 0.1 1186193528 55604 ? S<l 20:32 0:00 /opt/google/chrome/chrome --type=renderer --crashpad-handler-pid=9314 --enable-crash-reporter=, --user-data-dir=/tm
crit commands such as `crit x . fds` also fail because the dump is incomplete.
Describe the results you received:
Got error during dump:
(02.325156) Error (criu/pagemap-cache.c:209): pagemap-cache: Can't read 9397's pagemap file: No such file or directory
Describe the results you expected:
Expected the dump to complete successfully.
Additional information you deem important (e.g. issue happens only occasionally):
Happens consistently.
CRIU logs and information:
Output of `criu --version`:
Version: 3.19 (gitid 0)
Output of `criu check --all`:
Warn (criu/kerndat.c:1285): Can't keep kdat cache on non-tempfs
Error (criu/cr-check.c:1223): UFFD is not supported
Error (criu/cr-check.c:1223): UFFD is not supported
Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.
Additional environment details:
It's running inside a Kubernetes pod where the container runtime is containerd and the node architecture is amd64.
@muvaf Any luck here? Running into the same issue
@muvaf could you show /proc/pid/maps for the target process?
I think it hits the MAX_RW_COUNT (0x7ffff000) limit. The length of the target VMA is 0x104fff3c000 bytes. CRIU reads 8 bytes of pagemap data per page, so a single read of the whole VMA's pagemap is 0x827ff9e0 bytes, which exceeds the limit.
@muvaf could you try out https://github.com/avagin/criu/commit/9405da090c93ef100e3fb0c7da2646cdb9e27fc1? It should fix the problem.
By the way, for such large dummy mappings, the pagemap file interface works slowly. Recently, the new PAGEMAP_SCAN ioctl was merged into the mainline kernel, and support for it was implemented in CRIU (https://github.com/checkpoint-restore/criu/pull/2292). With these changes, CRIU handles huge dummy mappings much faster.
@avagin That patch did make the error go away and I was able to take the checkpoint. Thank you! However, I wasn't able to validate that the checkpoint was taken correctly: the restore command fails with the following even though --tcp-close is used:
criu restore --images-dir /checkpoint --tcp-established --file-locks --evasive-devices --tcp-close --manage-cgroups=ignore -v4 --log-file restore.log --inherit-fd fd[1]:pipe:[1687037] --inherit-fd fd[2]:pipe:[1687038] --external mnt[zoneinfo]:/usr/share/zoneinfo --external mnt[null]:/dev/null --external mnt[random]:/dev/random --external mnt[urandom]:/dev/urandom --external mnt[tty]:/dev/tty --external mnt[zero]:/dev/zero --external mnt[full]:/dev/full
Error (criu/sk-inet.c:1029): inet: Can't bind inet socket (id 778): Cannot assign requested address
@lukejmann I needed to add --file-locks and make sure some of the folders Chrome creates under /tmp are available on the target as well. To build @avagin 's patch, clone the repo, run make docker-build, and copy the /criu/criu/criu binary from that image, provided the dependencies listed here are in place where you run the commands.
@muvaf Well, they are udp sockets :)
(00.289251) 9344: inet: Restore: family AF_INET type SOCK_DGRAM proto IPPROTO_UDP port 56915 state TCP_ESTABLISHED src_addr 10.140.3.135
It is a good question what we should do with them... @muvaf What behavior do you expect?
@avagin Huh, I didn't realize that. I think, for starters, having an --udp-close flag similar to the TCP one would unlock the use cases where both sides of the socket are designed to be resilient to reconnections.
Going further may not be feasible due to the same issues TCP has with changing IP addresses, so at least we'd give users an escape hatch if they really have to change the IP address.
@avagin FWIW, if you can give me a pointer, I can try to get a PR going to add the --udp-close flag.
@muvaf I am skeptical about the idea of "--udp-close." There is a significant difference between TCP and UDP. TCP is connection-oriented, and the situation where a connection is interrupted is entirely normal and must be handled in the code. UDP, on the other hand, is connectionless. Therefore, applications may be caught off guard if "send" or "recv" return errors.
You can try out the next patch to see how your workload will handle closed udp sockets after restore:
```diff
diff --git a/criu/sk-inet.c b/criu/sk-inet.c
index a6a767c73..eda08f971 100644
--- a/criu/sk-inet.c
+++ b/criu/sk-inet.c
@@ -901,6 +901,13 @@ static int open_inet_sk(struct file_desc *d, int *new_fd)
 		goto done;
 	}
 
+	if (ie->proto == IPPROTO_UDP) {
+		if (shutdown(sk, SHUT_RDWR) && errno != ENOTCONN) {
+			pr_perror("Unable to shutdown the socket id %x ino %x", ii->ie->id, ii->ie->ino);
+		}
+		goto done2;
+	}
+
 	if (ie->src_port) {
 		if (inet_bind(sk, ii))
 			goto err;
@@ -952,7 +959,7 @@ done:
 			}
 		}
 	}
-
+done2:
 	*new_fd = sk;
 	return 1;
```
For connected UDP sockets, it might be a good idea to skip binding to the local address. When CRIU calls "connect" to restore the destination address and port, the socket will be bound to the source address and a "random" port. I believe this should work in many cases. Could you please try the next patch, which implements this behavior?
```diff
diff --git a/criu/sk-inet.c b/criu/sk-inet.c
index a6a767c73..9bb1d04d4 100644
--- a/criu/sk-inet.c
+++ b/criu/sk-inet.c
@@ -900,8 +900,7 @@ static int open_inet_sk(struct file_desc *d, int *new_fd)
 		goto done;
 	}
-
-	if (ie->src_port) {
+	if (ie->proto != IPPROTO_UDP && ie->src_port) {
 		if (inet_bind(sk, ii))
 			goto err;
 	}
 
```
A friendly reminder that this issue had no activity for 30 days.