criu
Dump a process without running first
criu dump -- <command>: my process crashes immediately, so it's not feasible to attach to it by PID without potentially losing vital information from startup.
Can you elaborate on your problem in more detail?
criu can dump a running process given its PID: criu dump -t <PID>, which is similar to gdb -p <PID>.
I am looking for the other option as well, where criu would run the user command itself and wait until it crashes before collecting the dump, so the dump doesn't miss anything from app startup, akin to gdb --args <command and its args>.
My app crashes in the first few milliseconds, so it is not very convenient to pause the app, get its PID and feed it to criu.
Hi! This issue sounds interesting, and I'd love to work on it. Could you assign this issue to me? Thanks!
I think the reason CRIU can’t behave exactly like gdb -p <PID> --arg1 --arg2 etc is because CRIU isn’t a debugger — it’s designed to checkpoint and restore running processes, not control or monitor their execution state in real time. gdb attaches and intercepts signals like segfaults and exceptions, but CRIU works by taking snapshots of a live process and its memory, files, and state. That’s why the current behavior requires a running process and a PID — it needs an active target to dump.
That said, we can solve the problem of capturing early crashes without missing startup behavior by adding a new feature! Here’s an idea:
Proposed Solution: We add a feature like criu run, which would:
Start the user’s program directly.
We can provide the args if needed
Monitor its execution.
We could do incremental dumps by setting an interval, so we always have the most recent state before a crash. That way, if the program dies, we don’t miss important info even if the crash happens early.
Example Command:
criu run ./myapp --args -D /tmp/checkpoints
Starts ./myapp with its arguments.
Dumps its state from start till it crashes (segfault, non-zero exit).
So we always capture the latest state before failure.
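A minimal sketch of what such a wrapper could look like from the outside, assuming it simply shells out to the existing criu dump command at a fixed interval. The function names, the interval parameter and the numbered sub-directory layout are hypothetical and not part of CRIU; the only CRIU options used are -t, -D, --shell-job and --leave-running. Each round is a full dump, and the loop cannot capture anything that happens between two dumps.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Build: <criu_bin> dump -t <pid> -D <dir> --shell-job --leave-running */
static void build_dump_argv(const char *criu_bin, const char *pid_str,
                            const char *dir, const char *argv[9])
{
    argv[0] = criu_bin;
    argv[1] = "dump";
    argv[2] = "-t";
    argv[3] = pid_str;
    argv[4] = "-D";
    argv[5] = dir;
    argv[6] = "--shell-job";
    argv[7] = "--leave-running";
    argv[8] = NULL;
}

/* Run one dump and wait for it; 0 if the dump command exited with status 0. */
static int dump_once(const char *criu_bin, pid_t target, const char *dir)
{
    char pid_str[32];
    const char *argv[9];

    snprintf(pid_str, sizeof pid_str, "%d", (int)target);
    build_dump_argv(criu_bin, pid_str, dir, argv);

    pid_t helper = fork();
    if (helper < 0)
        return -1;
    if (helper == 0) {
        execvp(argv[0], (char *const *)argv);
        _exit(127); /* exec failed */
    }
    int status;
    if (waitpid(helper, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Hypothetical "criu run" loop: start cmd, then dump it every interval_ms
 * into base_dir/<n> until it terminates. Returns the number of successful
 * dumps (or -1 on error) and stores the wait status of cmd in *wstatus.
 */
static int run_with_periodic_dumps(const char *criu_bin, char *const cmd[],
                                   const char *base_dir, long interval_ms,
                                   int *wstatus)
{
    assert(cmd && cmd[0] && interval_ms > 0);
    mkdir(base_dir, 0700); /* an already existing directory is fine */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(cmd[0], cmd);
        _exit(127);
    }

    struct timespec ts = { interval_ms / 1000, (interval_ms % 1000) * 1000000L };
    int dumps = 0;
    for (int n = 0;; n++) {
        nanosleep(&ts, NULL);
        pid_t r = waitpid(pid, wstatus, WNOHANG);
        if (r < 0)
            return -1;
        if (r == pid)
            return dumps; /* program exited or was killed by a signal */

        char dir[4096];
        snprintf(dir, sizeof dir, "%s/%d", base_dir, n);
        mkdir(dir, 0700);
        if (dump_once(criu_bin, pid, dir) == 0)
            dumps++;
    }
}
```

A caller would pass "criu" as criu_bin, the program's argv as cmd, and then inspect *wstatus with WIFSIGNALED to see whether the program crashed. Running criu dump normally requires sufficient privileges, and a program that dies before the first interval elapses leaves no checkpoint at all.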
If someone could assign this issue to me, I could build this or if anyone has any other idea, we can discuss that. I am planning to contribute to CRIU through GSOC 2025 also, it would be helpful if I could start contributing with this issue.
Another approach would be:
Start the process through CRIU: CRIU runs the program so it can monitor it from the start.
Intercept signals: Use something like ptrace() (like gdb does) or signalfd() to catch crash signals like SIGSEGV, SIGBUS, SIGILL, etc.
Pause the process on signal: As soon as the signal arrives, freeze the process before any cleanup happens.
Dump state immediately: Use CRIU’s dump functionality to capture memory, open files, and register state.
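The first three steps above can be sketched with plain ptrace(), outside of CRIU. This is only an illustration under stated assumptions: run_until_crash and is_crash_signal are hypothetical names, the list of crash signals is my own choice, and the final dump step is left as a comment. CRIU uses ptrace itself when dumping, so a real implementation would still have to work out how to hand the stopped process over to the dump code.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Signals treated as crashes in this sketch (an assumption, not a CRIU list). */
static bool is_crash_signal(int sig)
{
    return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL ||
           sig == SIGFPE || sig == SIGABRT;
}

/*
 * Hypothetical helper: start argv under ptrace and watch its signals.
 *
 * Returns the child's PID if a crash signal was intercepted; the child is
 * then still alive, frozen in a ptrace signal-delivery stop, and *crash_sig
 * holds the signal. Returns 0 if the child terminated (if a signal killed
 * it, *crash_sig holds that signal, otherwise 0). Returns -1 on error.
 */
static pid_t run_until_crash(char *const argv[], int *crash_sig)
{
    assert(argv && argv[0] && crash_sig);
    *crash_sig = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Ask to be traced by the parent, then become the target program. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    for (;;) {
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status))
            return 0;
        if (WIFSIGNALED(status)) {
            *crash_sig = WTERMSIG(status);
            return 0;
        }
        if (!WIFSTOPPED(status))
            continue;

        int sig = WSTOPSIG(status);
        if (is_crash_signal(sig)) {
            /* The signal has not been delivered yet: no handler or cleanup
             * has run. This is the point where a dump would be taken. */
            *crash_sig = sig;
            return pid;
        }
        /* Swallow the SIGTRAP generated by exec; forward everything else. */
        int forward = (sig == SIGTRAP) ? 0 : sig;
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)forward) < 0)
            return -1;
    }
}
```

When the function returns a PID, the caller is responsible for the frozen process: after collecting whatever state is needed it can either let the signal through with PTRACE_CONT or end the process with kill(pid, SIGKILL) followed by waitpid(). The sketch follows only the initial process, not any children it forks.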
@kasperk81 If your process creates a core file on crash you should have all the information you need. What are you looking for outside of the information of the core file?
hey, i used this as an alternative to gdb’s missing time travel debugging for multi-threaded processes. when my process crashes early, i can walk the stack backwards to pinpoint the issue.
But this means you want multiple checkpoints, right? At which interval? For every instruction executed? To me it sounds much more complicated than how you describe it. One single checkpoint does not really help you if I understand you correctly.
My app crashes in the first few milliseconds, so it is not very convenient to pause the app, get its PID and feed it to criu.
i used this as an alternative to gdb’s missing time travel debugging for multi-threaded processes.
I agree with Adrian; CRIU is not intended to be used as a drop-in replacement for gdb.
Using tools like strace can help you to understand why your application crashes.
A friendly reminder that this issue had no activity for 30 days.