like-dbg
WIP: support pwndbg as a GDB extension
NOT MERGEABLE YET
I'm currently running into the following error:
pwndbg: loaded 196 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)
The target architecture is set to "i386:x86-64:intel".
Reading symbols from /io/vmlinux...
add symbol table from file "/io/vmlinux"
Reading symbols from /io/vmlinux...
Breakpoint 1 at 0xffffffff82cff9b4: file init/main.c, line 849.
The program is not being run.
loading vmlinux
Remote debugging using :1234
0x000000000000fff0 in exception_stacks ()
------- tip of the day (disable with set show-tips off) -------
GDB's follow-fork-mode parameter can be used to set whether to trace parent or child after fork() calls
Exception occurred: Error: invalid literal for int() with base 10: '' (<class 'ValueError'>)
For more info invoke `set exception-verbose on` and rerun the command
or debug it by yourself with `set exception-debugger on`
Python Exception <class 'ValueError'> invalid literal for int() with base 10: '':
pwndbg> set exception-verbose on
Set whether to print a full stacktrace for exceptions raised in Pwndbg commands to True
Traceback (most recent call last):
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 164, in caller
func()
File "/home/user/pwndbg/pwndbg/symbol.py", line 96, in autofetch
for mapping in pwndbg.vmmap.get():
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 62, in get
pages.extend(kernel_vmmap_via_page_tables())
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 358, in kernel_vmmap_via_page_tables
p.lazy_init()
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 162, in lazy_init
pid = int(proc.read().strip(), 10)
ValueError: invalid literal for int() with base 10: ''
If that is an issue, you can report it on https://github.com/pwndbg/pwndbg/issues
(Please don't forget to search if it hasn't been reported before)
To generate the report and open a browser, you may run `bugreport --run-browser`
PS: Pull requests are welcome
Traceback (most recent call last):
File "/home/user/pwndbg/pwndbg/gdblib/prompt.py", line 47, in initial_hook
prompt_hook(*a)
File "/home/user/pwndbg/pwndbg/gdblib/prompt.py", line 57, in prompt_hook
pwndbg.gdblib.events.after_reload(start=cur is None)
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 234, in after_reload
f()
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 169, in caller
raise e
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 164, in caller
func()
File "/home/user/pwndbg/pwndbg/symbol.py", line 96, in autofetch
for mapping in pwndbg.vmmap.get():
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 62, in get
pages.extend(kernel_vmmap_via_page_tables())
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 358, in kernel_vmmap_via_page_tables
p.lazy_init()
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 162, in lazy_init
pid = int(proc.read().strip(), 10)
ValueError: invalid literal for int() with base 10: ''
The latest change (which made sense in the first place) shifts the error in a different direction:
pwndbg: loaded 196 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)
The target architecture is set to "i386:x86-64:intel".
Reading symbols from /io/vmlinux...
add symbol table from file "/io/vmlinux"
Reading symbols from /io/vmlinux...
Remote debugging using :1234
0x000000000000fff0 in exception_stacks ()
Breakpoint 1 at 0xffffffff82cff9b4: file init/main.c, line 849.
Continuing.
Exception occurred: Error: invalid literal for int() with base 10: '' (<class 'ValueError'>)
For more info invoke `set exception-verbose on` and rerun the command
or debug it by yourself with `set exception-debugger on`
Python Exception <class 'ValueError'> invalid literal for int() with base 10: '':
Breakpoint 1, start_kernel () at init/main.c:849
849 {
loading vmlinux
------- tip of the day (disable with set show-tips off) -------
Use GDB's pi command to run an interactive Python console where you can use Pwndbg APIs like pwndbg.gdblib.memory.read(addr, len), pwndbg.gdblib.memory.write(addr, data), pwndbg.gdb.vmmap.get() and so on!
Exception occurred: Error: invalid literal for int() with base 10: '' (<class 'ValueError'>)
For more info invoke `set exception-verbose on` and rerun the command
or debug it by yourself with `set exception-debugger on`
Python Exception <class 'ValueError'> invalid literal for int() with base 10: '':
After hitting continue in pwndbg, the kernel boots, but gdb is still broken:
^C
Program received signal SIGINT, Interrupt.
default_idle () at arch/x86/kernel/process.c:689
689 }
Exception occurred: Error: invalid literal for int() with base 10: '' (<class 'ValueError'>)
For more info invoke `set exception-verbose on` and rerun the command
or debug it by yourself with `set exception-debugger on`
Python Exception <class 'ValueError'> invalid literal for int() with base 10: '':
pwndbg> set exception-verbose on
Set whether to print a full stacktrace for exceptions raised in Pwndbg commands to True
Traceback (most recent call last):
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 164, in caller
func()
File "/home/user/pwndbg/pwndbg/symbol.py", line 96, in autofetch
for mapping in pwndbg.vmmap.get():
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 62, in get
pages.extend(kernel_vmmap_via_page_tables())
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 358, in kernel_vmmap_via_page_tables
p.lazy_init()
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 162, in lazy_init
pid = int(proc.read().strip(), 10)
ValueError: invalid literal for int() with base 10: ''
If that is an issue, you can report it on https://github.com/pwndbg/pwndbg/issues
(Please don't forget to search if it hasn't been reported before)
To generate the report and open a browser, you may run `bugreport --run-browser`
PS: Pull requests are welcome
Traceback (most recent call last):
File "/home/user/pwndbg/pwndbg/gdblib/prompt.py", line 47, in initial_hook
prompt_hook(*a)
File "/home/user/pwndbg/pwndbg/gdblib/prompt.py", line 57, in prompt_hook
pwndbg.gdblib.events.after_reload(start=cur is None)
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 234, in after_reload
f()
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 169, in caller
raise e
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 164, in caller
func()
File "/home/user/pwndbg/pwndbg/symbol.py", line 96, in autofetch
for mapping in pwndbg.vmmap.get():
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 62, in get
pages.extend(kernel_vmmap_via_page_tables())
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 358, in kernel_vmmap_via_page_tables
p.lazy_init()
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 162, in lazy_init
pid = int(proc.read().strip(), 10)
ValueError: invalid literal for int() with base 10: ''
We spoke about this on Discord, but I will document this here as well.
This fails because gdb-pt-dump, which Pwndbg relies on, looks for the qemu-system process PID here: https://github.com/martinradev/gdb-pt-dump/blob/f25898adc61d60e5f30c6452b15700bbf1bd630c/pt.py#L161-L162
You launch the two, qemu-system (the Linux VM) and GDB, in two separate containers. As a result, they end up in two different Linux PID namespaces and cannot see each other's PIDs/processes.
Running the containers with --pid=host would mitigate this issue, but then we end up with permission errors:
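To illustrate the failure mode, here is a minimal sketch of a PID lookup that tolerates an empty result. It assumes the lookup is done via `pgrep qemu-system` (as suggested later in this thread); `parse_pid` and `find_qemu_pid` are hypothetical helper names and not part of gdb-pt-dump.

```python
import subprocess
from typing import Optional


def parse_pid(pgrep_output: str) -> Optional[int]:
    """Return the first PID in pgrep-style output, or None if it is empty."""
    fields = pgrep_output.split()
    if not fields:
        # This is the case from the traceback: int('', 10) raises ValueError
        # because no qemu-system process is visible in this PID namespace.
        return None
    return int(fields[0], 10)


def find_qemu_pid() -> Optional[int]:
    """Look up a qemu-system PID visible from the current PID namespace."""
    try:
        result = subprocess.run(
            ["pgrep", "qemu-system"],
            capture_output=True,
            text=True,
            check=False,  # pgrep exits non-zero when nothing matches
        )
    except FileNotFoundError:
        # pgrep itself is not installed in this environment.
        return None
    return parse_pid(result.stdout)
```

Inside the debugger container, `find_qemu_pid()` would return `None` as long as QEMU runs in a different PID namespace, which is the situation that surfaces as the `ValueError` in the logs.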
pwndbg> set exception-verbose on
Set whether to print a full stacktrace for exceptions raised in Pwndbg commands to True
Traceback (most recent call last):
File "/home/user/pwndbg/pwndbg/gdblib/events.py", line 164, in caller
func()
File "/home/user/pwndbg/pwndbg/symbol.py", line 96, in autofetch
for mapping in pwndbg.vmmap.get():
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 62, in get
pages.extend(kernel_vmmap_via_page_tables())
File "/home/user/pwndbg/pwndbg/lib/memoize.py", line 49, in __call__
value = self.func(*args, **kwargs)
File "/home/user/pwndbg/pwndbg/vmmap.py", line 358, in kernel_vmmap_via_page_tables
p.lazy_init()
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 164, in lazy_init
self.phys_mem = VMPhysMem(pid)
File "/home/user/pwndbg/gdb-pt-dump/pt.py", line 18, in __init__
self.file = os.open(f"/proc/{pid}/mem", os.O_RDONLY)
PermissionError: [Errno 13] Permission denied: '/proc/72600/mem'
That's likely because of the AppArmor profile blocking write access to /proc/$pid/* files. FWIW, this policy can be seen here: https://github.com/moby/moby/blob/924edb948c2731df3b77697a8fcc85da3f6eef57/profiles/apparmor/template.go#L37-L38
So now running the container additionally with --security-opt apparmor=unconfined (basically: disabling AppArmor) should probably fix this.
But this isn't really a great solution. We don't want people to do all this.
A potential solution could be using set kernel-vmmap-via-page-tables off, but this will make Pwndbg fetch memory map information from QEMU's GDB stub and its monitor info mem command. This should work, but may be painfully slow, as QEMU renders LOTS of memory map entries (I think it does not merge them). It is also not super accurate, as IIRC the output doesn't show whether a page is writable or executable (I don't remember which one, I think exec).
Added -ex 'set kernel-vmmap-via-page-tables off' -ex 'set exception-verbose on' for debugging purposes to the GDB invocation, leaving us with the following command:
gdb-multiarch -q /io/vmlinux -iex 'set architecture i386:x86-64:intel' -ex 'add-symbol-file /io/vmlinux' -ex 'set kernel-vmmap-via-page-tables off' -ex 'set exception-verbose on' -ex 'target remote :1234' -ex 'break start_kernel' -ex continue -ex lx-symbols
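As a rough sketch of how such an invocation could be assembled programmatically, the snippet below builds the argument list and renders it as a shell string. `build_gdb_command` and its parameters are hypothetical and not taken from like-dbg's code; only the GDB arguments themselves come from the command above.

```python
import shlex
from typing import List


def build_gdb_command(vmlinux: str, port: int, extra_settings: List[str]) -> List[str]:
    """Assemble a gdb-multiarch argv for attaching to QEMU's GDB stub."""
    argv = [
        "gdb-multiarch", "-q", vmlinux,
        # -iex runs before the inferior file is loaded.
        "-iex", "set architecture i386:x86-64:intel",
        "-ex", f"add-symbol-file {vmlinux}",
    ]
    # Optional pwndbg/GDB settings, applied before connecting to the target.
    for setting in extra_settings:
        argv += ["-ex", setting]
    argv += [
        "-ex", f"target remote :{port}",
        "-ex", "break start_kernel",
        "-ex", "continue",
        "-ex", "lx-symbols",
    ]
    return argv


if __name__ == "__main__":
    cmd = build_gdb_command(
        "/io/vmlinux",
        1234,
        ["set kernel-vmmap-via-page-tables off", "set exception-verbose on"],
    )
    # shlex.join quotes every argument that contains whitespace.
    print(shlex.join(cmd))
```

Keeping the settings in a list makes it easy to toggle debugging options such as `set exception-verbose on` without editing a long quoted string.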
This seems to run "fine", as in it does not fully crash. However, there's still the problem that add-symbol-file fails for some reason, and pwndbg itself is still causing exceptions, as seen here:

Single-stepping through the instructions was not painfully slow for me.
Neither solution, whether hacking more flags into the docker run command or bypassing an intended pwndbg feature, seems like a good idea just to make this work.
Holding off a merge until a clean solution is found.
Codecov Report
Base: 89.96% // Head: 89.48% // Decreases project coverage by -0.48% :warning:
Coverage data is based on head (add315b) compared to base (aef7e64). Patch coverage: 65.78% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## main #96 +/- ##
==========================================
- Coverage 89.96% 89.48% -0.49%
==========================================
Files 18 18
Lines 1954 1968 +14
==========================================
+ Hits 1758 1761 +3
- Misses 196 207 +11
| Impacted Files | Coverage Δ | |
|---|---|---|
| start_kgdb.py | 0.00% <0.00%> (ø) | |
| src/debuggee.py | 95.60% <42.85%> (-4.40%) | :arrow_down: |
| src/docker_runner.py | 96.80% <71.42%> (ø) | |
| src/debugger.py | 94.91% <100.00%> (+0.08%) | :arrow_up: |
| src/kernel_builder.py | 98.77% <100.00%> (+1.12%) | :arrow_up: |
| src/misc.py | 98.96% <100.00%> (+0.04%) | :arrow_up: |
| src/tests/test_debuggee.py | 98.26% <100.00%> (+0.03%) | :arrow_up: |
| src/tests/test_debugger.py | 100.00% <100.00%> (ø) | |
| src/tests/test_docker_runner.py | 99.35% <100.00%> (-0.03%) | :arrow_down: |
| src/tests/test_kernel_builder.py | 99.61% <100.00%> (ø) | |
| ... and 1 more | | |
For completeness, pinging @gsingh93 here as well.
With the current state of pwndbg that I pulled just now (fef5077), the situation is de facto unchanged.
I added the following lines to my io/scripts/debugger.sh:
-ex \"set kernel-vmmap-via-page-tables off\" \
-ex \"set exception-verbose on\" \
Afterward, I ran:
./start_kgdb.py -p5 -y -v
docker run -it --rm --security-opt seccomp=unconfined --cap-add=SYS_PTRACE -v /tmp/kernel_root/linux-5.15_x86_64_:/io --net="host" like_debugger /bin/bash -c "set -e; . /home/user/debugger.sh -a x86_64 -p /io -c 0 -g /home/user/gdb_script -e pwndbg
Output:

Loading the vmlinux seems to work just fine now, or at least the earlier error above about "Cannot execute this command when target is running" is gone.
The "ValueError: invalid literal for int() with base 10: ''" still persists.
Both GDB and the QEMU kernel process need to live in the same PID namespace so that gdb-pt-dump can connect to it. If you run those as Docker containers, you need to run them with the --pid=host flag (I am not aware if there is a way to run two containers in a single but new pid namespace).
However, the fact that gdb-pt-dump is used at all when set kernel-vmmap-via-page-tables off is set seems like a bug that we need to fix in Pwndbg.
EDIT: I just realized I repeated myself from a previous comment in regards to --pid=host; sorry! :)
@disconnect3d No worries, I just had a short little chat with @gsingh93 on the pwndbg discord the other day as he was asking whether there was a problem with remote debugging. So, I just double-checked the current situation. My apologies if this pinged you as well.
@0xricksanchez If you want to disable the vmmap page table functionality, you should do this (it's mentioned in the error message in the screenshot):
set kernel-vmmap none
But set kernel-vmmap page-tables does work fine for me on aarch64 and x86-64. On x86-64, set kernel-vmmap monitor should also work (but not on aarch64). If you're not seeing this, let me know.
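The three `kernel-vmmap` values mentioned in this thread could be selected with a small helper like the one below. This is only a sketch: the selection policy (prefer `page-tables` when the QEMU process is visible, fall back to `monitor` on x86-64, otherwise `none`) is an assumption derived from the comments here, and `choose_kernel_vmmap` is a hypothetical name that exists in neither pwndbg nor like-dbg.

```python
def choose_kernel_vmmap(arch: str, qemu_pid_visible: bool) -> str:
    """Return a GDB command selecting a pwndbg kernel-vmmap mode.

    arch: architecture string as passed to like-dbg, e.g. "x86_64" or "aarch64".
    qemu_pid_visible: whether the qemu-system process can be found from the
        debugger's PID namespace (needed by gdb-pt-dump).
    """
    if qemu_pid_visible:
        # gdb-pt-dump can locate QEMU; reading /proc/<pid>/mem may still
        # need extra permissions, which this helper does not check.
        mode = "page-tables"
    elif arch == "x86_64":
        # Reported in this thread to work on x86-64 but not on aarch64.
        mode = "monitor"
    else:
        # Disables the kernel memory map entirely.
        mode = "none"
    return f"set kernel-vmmap {mode}"
```

The returned string could then be passed to GDB as an additional `-ex` argument.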
set kernel-vmmap none
Seems to work for me:

set kernel-vmmap page-tables
Actually, does not work for me as I run into the same problems as before?

@gsingh93 Is there an overview of what 'features' one misses out on, depending on the set options?
For QEMU kernel, we use gdb-pt-dump that parses page tables from the guest by reading /proc/$pid/mem of QEMU process. If this does not work for you, use `set kernel-vmmap-via-page-tables off` to refer to our old method of reading vmmap info from `monitor info mem` command exposed by QEMU. Note that the latter may be slower and will not give full vmmaps permission information.
I guess that boils down to the same behavior GEF currently has, as in vmmap just reports the whole address space back:

Actually, does not work for me as I run into the same problems as before?
Ok, I'll try to reproduce this using like-dbg, thanks for the report.
Is there an overview of what 'features' one misses out on, depending on the set options?
Mainly it's vmmap you miss out on like you mentioned, but also anything that requires information about mappings. One example is color coding of addresses based on their permissions. But I've been using pwndbg with that config set to none for a few weeks, and there's been nothing major missing.
Actually, does not work for me as I run into the same problems as before?
Ok, I'll try to reproduce this using like-dbg, thanks for the report.
Then I recommend using a CTF challenge with a given kernel and file system, so you don't have to deal with building a full kernel and file system. On this branch here you'd want to execute
./start_kgdb.py -y -v --ctf <kernel> <filesystem>
The issue is what @disconnect3d mentioned about PID namespaces. gdb-pt-dump parses page tables by reading the memory of the qemu-system process through /proc/<pid>/mem. The exception is occurring when it tries to find the PID of qemu-system but it doesn't find it, because it's running in another container.
--pid=host will work (you can confirm that pgrep qemu-system now finds the process), but I see there is also a --pid=container:<name|id> which you can use instead: https://docs.docker.com/engine/reference/run/#pid-settings---pid
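A minimal sketch of how the two PID namespace options could be turned into `docker run` arguments. `pid_namespace_args` and the container name `like_debuggee` are hypothetical and only serve as an illustration; the `--pid=host` and `--pid=container:<name|id>` flags are the ones from the Docker documentation linked above.

```python
from typing import List, Optional


def pid_namespace_args(debuggee_container: Optional[str] = None) -> List[str]:
    """Return docker run flags that let GDB see the qemu-system process.

    If a container name or id is given, join that container's PID namespace;
    otherwise fall back to sharing the host's PID namespace.
    """
    if debuggee_container:
        return [f"--pid=container:{debuggee_container}"]
    return ["--pid=host"]


if __name__ == "__main__":
    base = ["docker", "run", "-it", "--rm", "--cap-add=SYS_PTRACE"]
    # "like_debuggee" is a made-up container name for this example.
    print(base + pid_namespace_args("like_debuggee"))
    print(base + pid_namespace_args())
```

Joining the debuggee's namespace keeps the debugger from seeing unrelated host processes, which may be preferable to `--pid=host`.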
After doing that, however, I was getting a permission error when trying to read /proc/<pid>/mem. To fix this, I had to both add --privileged to the command invoking docker (just CAP_SYS_PTRACE isn't enough) and run gdb with sudo from inside the container. I'm not sure why this is necessary: when just connecting from GDB on a host to QEMU on the same host, either using sudo or setting yama.ptrace_scope to zero is enough, but that doesn't seem to be the case here.
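To check from inside the container whether the permission problem is resolved, a small probe along these lines could be used. It mirrors the `os.open(f"/proc/{pid}/mem", os.O_RDONLY)` call from the traceback earlier in this thread; `can_read_proc_mem` is a hypothetical helper name.

```python
import os


def can_read_proc_mem(pid: int) -> bool:
    """Return True if /proc/<pid>/mem can be opened read-only."""
    try:
        fd = os.open(f"/proc/{pid}/mem", os.O_RDONLY)
    except OSError:
        # Covers PermissionError (as in the traceback) as well as a missing
        # /proc entry when the process is not visible in this namespace.
        return False
    os.close(fd)
    return True
```

Calling it with the PID reported by `pgrep qemu-system` separates the two failure modes discussed here: no PID at all points to the namespace problem, while a PID with a `False` result points to the permission problem.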
But in any case, I think that should at least get you unblocked on this issue.