Pipe interactive
This PR includes the following updates:
- Significant improvements to the `pipe.interactive()` mode. No more unusual keybindings or strange scrollbars: just pure interactivity.
- A new flag that lets users choose which action to take when an event is triggered during interactive mode (see the sketch after this list).
- An event management system integrated with the PipeManager to handle events that occur during a pipe operation.
- A new buffer system for the PipeManager, which results in a much-improved logging system for pipes (no more one-byte logs; see the image below for reference, @danmaam will be very happy) and speeds up interaction with the process.
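Purely as a usage sketch: the PR text does not name the new flag, so `auto_quit` below is an assumed name for the "choose the action on an event" option, and `handler` is a hypothetical symbol in the target binary.

```python
from libdebug import debugger

d = debugger("./target")
r = d.run()

# A breakpoint that may be hit while the user is in interactive mode.
# "handler" is a hypothetical symbol, used only for illustration.
bp = d.breakpoint("handler")
d.cont()

# Hand the terminal over to the process. `auto_quit` is an assumed name
# for the new flag: it would choose what happens (e.g. leaving
# interactive mode) when an event such as a breakpoint hit occurs.
r.interactive(auto_quit=True)

d.kill()
```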
Note: given the substantial changes introduced, stress tests for pipes are kindly requested to ensure that everything continues to function correctly.
Even though it’s not the main focus of this PR, the new buffering system introduces a noticeable speed improvement. Further optimizations are possible, but the goal here was to implement a functional buffering system without extreme optimization efforts.
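To give an idea of the approach (a minimal sketch, not libdebug's actual implementation): lines are served from an in-memory buffer that is refilled with large chunked reads, instead of issuing one single-byte read per character.

```python
import os

class BufferedPipe:
    """Minimal sketch: serve lines from an in-memory buffer refilled
    with large os.read() chunks, instead of one syscall per byte."""

    def __init__(self, fd: int, chunk_size: int = 65536):
        self.fd = fd
        self.chunk_size = chunk_size
        self.buffer = b""

    def recvline(self) -> bytes:
        # Refill the buffer in big chunks until a newline shows up.
        while b"\n" not in self.buffer:
            chunk = os.read(self.fd, self.chunk_size)
            if not chunk:  # EOF: return whatever is left
                line, self.buffer = self.buffer, b""
                return line
            self.buffer += chunk
        line, _, self.buffer = self.buffer.partition(b"\n")
        return line + b"\n"
```

Each syscall now returns up to 64 KiB instead of a single byte, which is where most of the speedup measured below comes from.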
Still, to assess the performance, I conducted benchmarks using the following C binary:
```c
#include <stdio.h>

int main() {
    // Disable buffering on stdout
    setvbuf(stdout, NULL, _IONBF, 0);

    // Output 500000 lines to stdout
    for (int i = 1; i <= 500000; i++) {
        printf("This is line number %d\n", i);
    }

    return 0;
}
```
I executed 500k recvline() calls using libdebug dev, libdebug pipe-interactive (this PR), and pwntools. Each version/tool was tested 100 times, and the average time to perform the 500k recvline was calculated. Below is the Python code used for the benchmarks.
libdebug
```python
from libdebug import debugger
from time import perf_counter
from tqdm import tqdm

d = debugger("./500k")

results = []
for _ in tqdm(range(100)):
    # Start a fresh process for each run and let it execute freely
    r = d.run()
    d.cont()

    start = perf_counter()
    for __ in range(500000):
        r.recvline()
    end = perf_counter()

    results.append(end - start)
    d.kill()

print("Average time: ", sum(results) / len(results))
```
pwntools
```python
from pwn import *
from time import perf_counter
from tqdm import tqdm

results = []
for _ in tqdm(range(100)):
    # Spawn a fresh process for each run
    r = process("./500k")

    start = perf_counter()
    for __ in range(500000):
        r.recvline()
    end = perf_counter()

    results.append(end - start)
    r.kill()

print("Average time: ", sum(results) / len(results))
```
Results

| Version | Average time (500k `recvline()`) | Total test time |
| --- | --- | --- |
| libdebug dev | ~46 s | 1:16:54 |
| pwntools | ~8.89 s | 14:49 |
| libdebug (this PR) | ~0.835 s | 1:23 |
I think we need to revert this, @io-no. Slow test suite on my PC before the merge:
```
➜ test git:(dev) git checkout 52d5dcf3991fb3c4705f68b82118968c9a8619d0
Note: switching to '52d5dcf3991fb3c4705f68b82118968c9a8619d0'.
[...]
Slowest test durations
----------------------------------------------------------------------
15.183s test_deep_dive_division (scripts.deep_dive_division_test.DeepDiveDivisionTest.test_deep_dive_division)
14.407s test_vmwhere1 (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1)
7.534s test_vmwhere1_callback (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1_callback)
----------------------------------------------------------------------
Ran 217 tests in 60.892s
```
After merge:
```
Slowest test durations
----------------------------------------------------------------------
66.529s test_deep_dive_division (scripts.deep_dive_division_test.DeepDiveDivisionTest.test_deep_dive_division)
42.189s test_vmwhere1_callback (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1_callback)
12.598s test_vmwhere1 (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1)
----------------------------------------------------------------------
Ran 217 tests in 146.638s
```
Okay, I pushed a hotfix on the same branch. My tests did not account for the scenario where a callback is triggered repeatedly during recvline; the tests you provided, however, were exactly of that type. Moreover, on my computer the execution times for the slow test suite are much shorter, which made the difference less noticeable.
The problem was caused by contention between the CPU-intensive callback and the CPU-intensive memory reads performed by the busy loop introduced to optimize performance (compared to the old selector, which introduced delays in some cases). This is most likely a GIL issue.
I have now reintroduced a small selector in a specific case, which should resolve the issue with recvline and the callback occurring simultaneously.
On my computer, I achieved slightly better performance than with the commit you provided. Although the difference is not significant, these are tests where the bottleneck is the callbacks, so it's still a positive result.
This change does not affect the performance improvements highlighted in previous messages about the other benchmarks but merely avoids the GIL problem.
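For illustration only (an assumed shape of the fix, not the exact libdebug code): waiting on the pipe with a short select() instead of spinning lets the reader thread release the GIL, so a CPU-bound callback running in parallel is no longer starved.

```python
import select

def wait_for_data(fd: int, timeout: float = 0.001) -> bool:
    # select() blocks in C code and releases the GIL while waiting,
    # so a concurrent CPU-intensive callback can make progress.
    ready, _, _ = select.select([fd], [], [], timeout)
    return bool(ready)
```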
Could you also test this on your modern, high-end PC to confirm the results? @MrIndeciso
Thank you
```
Slowest test durations
----------------------------------------------------------------------
13.857s test_deep_dive_division (scripts.deep_dive_division_test.DeepDiveDivisionTest.test_deep_dive_division)
11.801s test_vmwhere1 (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1)
6.794s test_vmwhere1_callback (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1_callback)
----------------------------------------------------------------------
Ran 216 tests in 56.606s
```
Issue solved, @io-no.
@MrIndeciso, could you run a couple more tests?
I added a test with recv + callback in commit 6b2938a86465bd0bd7911a159724351ca960278c. If you run it, you might encounter some issues due to the GIL. In commit edf8eb43f460ad3767fb2b4a395c301a5d480a5a, I applied the same fix for recv as I did for recvline. This should improve performance.
You should see the timing for the new test in line with the recvline test. Thank you.
On the latest commit I get:
```
Slowest test durations
----------------------------------------------------------------------
14.879s test_deep_dive_division_recvline (scripts.deep_dive_division_test.DeepDiveDivisionTest.test_deep_dive_division_recvline)
14.644s test_deep_dive_division_recv (scripts.deep_dive_division_test.DeepDiveDivisionTest.test_deep_dive_division_recv)
13.143s test_vmwhere1 (scripts.vmwhere1_test.Vmwhere1Test.test_vmwhere1)
```
Looks fine to me now.