Dump a process without running first

Open kasperk81 opened this issue 8 months ago • 10 comments

I'd like something like criu dump -- <command>. My process crashes immediately, so it's not feasible to trace it by PID without potentially losing vital information at startup.

kasperk81 avatar Feb 28 '25 16:02 kasperk81

Can you elaborate on your problem in more detail?

ankushT369 avatar Mar 01 '25 07:03 ankushT369

criu can dump a running process using its PID: criu dump <PID>, which is similar to gdb -p <PID>.

I am looking for another option as well, where criu would run the user command itself and wait until it crashes before collecting the dump, so that the dump doesn't miss anything from the app's startup, akin to gdb -- <command and its args>.

My app crashes in the first few milliseconds, so it is not very convenient to pause the app, get its PID, and feed it to criu.

kasperk81 avatar Mar 01 '25 08:03 kasperk81

Hi! This issue sounds interesting, and I'd love to work on it. Could you assign this issue to me? Thanks!

bhavishya72005 avatar Mar 05 '25 13:03 bhavishya72005

I think the reason CRIU can’t behave exactly like gdb -p <PID> --arg1 --arg2 etc. is that CRIU isn’t a debugger. It’s designed to checkpoint and restore running processes, not to control or monitor their execution state in real time. gdb attaches and intercepts signals such as segfaults, whereas CRIU works by taking a snapshot of a live process: its memory, files, and other state. That’s why the current behavior requires a running process and a PID; it needs an active target to dump.

That said, we can solve the problem of capturing early crashes without missing startup behavior by adding a new feature! Here’s an idea:

Proposed Solution: We add a feature like criu run, which would:

Start the user’s program directly, passing along any arguments it needs.
Monitor its execution, taking incremental dumps at a set interval so that we always have the most recent state before a crash. That way, if the program dies, we don’t miss important information even if the crash happens early.

Example Command:

criu run ./myapp --args -D /tmp/checkpoints

Starts ./myapp with its arguments.
Dumps its state from start till it crashes (segfault, non-zero exit).
So we always capture the latest state before failure.
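
As a rough illustration of the idea above (criu run does not exist today), here is a minimal Python sketch of a wrapper that starts a command and repeatedly calls the existing criu dump on it until it ends. The function names (criu_dump_command, describe_exit, run_with_checkpoints), the ckpt-NNNN directory layout, and the 50 ms default interval are all hypothetical. It assumes criu is installed and run with enough privileges, and that --shell-job is appropriate because the program is started from a terminal session.

```python
import signal
import subprocess
import time
from pathlib import Path


def criu_dump_command(pid, images_dir):
    """Build a `criu dump` call that checkpoints `pid` and lets it keep running."""
    return ["criu", "dump", "-t", str(pid), "-D", str(images_dir),
            "--shell-job", "--leave-running"]


def describe_exit(returncode):
    """Classify a subprocess return code: clean exit, error exit, or signal."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as returncode -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    if returncode != 0:
        return f"exited with status {returncode}"
    return "exited cleanly"


def run_with_checkpoints(argv, images_root, interval=0.05, dump=subprocess.run):
    """Start argv, checkpoint it every `interval` seconds, report how it ended.

    `dump` receives the criu command line; it is a parameter so the loop can
    be exercised without criu installed.
    """
    child = subprocess.Popen(argv)
    count = 0
    while child.poll() is None:
        images_dir = Path(images_root) / f"ckpt-{count:04d}"
        images_dir.mkdir(parents=True, exist_ok=True)
        dump(criu_dump_command(child.pid, images_dir))
        count += 1
        time.sleep(interval)
    return describe_exit(child.returncode), count
```

Something like run_with_checkpoints(["./myapp", "--args"], "/tmp/checkpoints") would then leave one image directory per checkpoint. Each full dump probably takes far longer than a few milliseconds, so a program that crashes right after startup may die before the first checkpoint completes; this sketch does not solve that.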

If someone could assign this issue to me, I could build this; if anyone has another idea, we can discuss that too. I am also planning to contribute to CRIU through GSoC 2025, so it would be helpful if I could start contributing with this issue.

murshidkc avatar Mar 09 '25 11:03 murshidkc

Another approach would be:

Start the process through CRIU: CRIU runs the program so it can monitor it from the start.

Intercept signals: Use something like ptrace() (like gdb does) or signalfd() to catch crash signals like SIGSEGV, SIGBUS, SIGILL, etc.

Pause the process on signal: As soon as the signal arrives, freeze the process before any cleanup happens.

Dump state immediately: Use CRIU’s dump functionality to capture memory, open files, and register state.
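
The first three steps could look roughly like the Python sketch below, which calls ptrace() through ctypes on Linux. The names run_until_crash and CRASH_SIGNALS are hypothetical, and the set of signals treated as crashes is my own choice. It only covers launching the program and freezing it when a crash signal arrives; the dump step is left out. Handing the frozen process over to CRIU is an open question, because CRIU itself attaches to its targets with ptrace and a task can only have one tracer at a time.

```python
import ctypes
import os
import signal

# Request numbers from <sys/ptrace.h> on Linux.
PTRACE_TRACEME = 0
PTRACE_CONT = 7

# Signals treated as "crashes" in this sketch (an assumption, adjust as needed).
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGBUS, signal.SIGILL,
                 signal.SIGFPE, signal.SIGABRT}

_libc = ctypes.CDLL(None, use_errno=True)
_libc.ptrace.restype = ctypes.c_long
_libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_int,
                         ctypes.c_void_p, ctypes.c_void_p]


def _ptrace(request, pid, data=0):
    """Thin wrapper around ptrace(2) that raises OSError on failure."""
    if _libc.ptrace(request, pid, None, int(data)) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def run_until_crash(argv):
    """Run argv as a traced child.

    Returns (pid, signal) with the child frozen just before a crash signal
    is delivered, or (pid, None) if it ended without one.
    """
    pid = os.fork()
    if pid == 0:
        # Child: ask to be traced by the parent, then become the program.
        try:
            _ptrace(PTRACE_TRACEME, 0)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if ptrace or exec failed
    while True:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) or os.WIFSIGNALED(status):
            return pid, None
        sig = os.WSTOPSIG(status)
        if sig in CRASH_SIGNALS:
            # The child is stopped; the signal has not been delivered yet.
            return pid, signal.Signals(sig)
        # The stop after exec shows up as SIGTRAP and is swallowed here
        # (so is a SIGTRAP raised by the program); other signals are forwarded.
        _ptrace(PTRACE_CONT, pid, 0 if sig == signal.SIGTRAP else sig)
```

After run_until_crash returns a signal, the child stays stopped until the caller resumes, detaches from, or kills it, which is the window in which a dump would have to happen.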

murshidkc avatar Mar 09 '25 11:03 murshidkc

@kasperk81 If your process creates a core file on crash you should have all the information you need. What are you looking for outside of the information of the core file?

adrianreber avatar Mar 09 '25 12:03 adrianreber

hey, i used this as an alternative to gdb’s missing time travel debugging for multi-threaded processes. when my process crashes early, i can walk the stack backwards to pinpoint the issue.

kasperk81 avatar Mar 09 '25 13:03 kasperk81

But this means you want multiple checkpoints, right? At which interval? For every instruction executed? To me it sounds much more complicated than how you describe it. One single checkpoint does not really help you if I understand you correctly.

adrianreber avatar Mar 09 '25 13:03 adrianreber

I agree with Adrian; CRIU is not intended to be used as a drop-in replacement for gdb. Using tools like strace can help you to understand why your application crashes.

rst0git avatar Mar 09 '25 15:03 rst0git

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Apr 09 '25 00:04 github-actions[bot]