feat(r2): human-friendly disassembler
Checklist
Which kind of PR do you create?
- [ ] This PR only contains minor fixes.
- [x] This PR contains major feature update.
- [x] This PR introduces a new function/api for Qiling Framework.
Coding convention?
- [x] The new code conforms to Qiling Framework naming convention.
- [x] The imports are arranged properly.
- [x] Essential comments are added.
- [] The reference of the new code is pointed out.
Extra tests?
- [ ] No extra tests are needed for this PR.
- [ ] I have added enough tests for this PR.
- [x] Tests will be added after some discussion and review.
Changelog?
- [ ] This PR doesn't need to update Changelog.
- [x] Changelog will be updated after some proper review.
- [ ] Changelog has been updated in my PR.
Target branch?
- [x] The target branch is dev branch.
One last thing
- [x] I have read the contribution guide
This is part of the further work discussed in #1172
r2 has pD to disassemble N bytes and pd to disassemble N instructions directly. Though the latter may be used more often, counting bytes is more compatible with the callback type ql.hook_code() expects, like QlArchUtils.disassembler.
Now there is no change in qiling core, so the code looks a bit ugly. If we can integrate r2 into qiling, the feature can be implemented with only a few lines and no attention of users.
I am not sure why do we need disassembling capabilities based on radare2. Given that radare2 itself relies on Capstone for disassembly, I don't see a clear advantage here (unless we strive to base as many things as we can on top of radare2..?)
@elicn Yes, r2 also relies on capstone for disassembly. But it can provide results with addresses resolved as flags directly.
For example, now we can only see disasm call 0x004019, but r2 can know it means call main.
Note that now we get disasm directly by pd instead of resolving address as flag manually. We strive to base as many things as we can on top of radare2, not limited to symbol resolving.
We strive to base as many things as we can on top of radare2, not limited to symbol resolving
Where did that come from..? Not sure that I agree with this.
Anyway, this is indeed a nice disassembly feature. The only concern I can think of here is how a user may tweak radare2 analysis in order to get more accurate info where radare2 guesses wrong.
We strive to base as many things as we can on top of radare2, not limited to symbol resolving
Where did that come from..? Not sure that I agree with this.
Anyway, this is indeed a nice disassembly feature. The only concern I can think of here is how a user may tweak radare2 analysis in order to get more accurate info where radare2 guesses wrong.
Mostly from cross-reference analysis.
Mostly from cross-reference analysis.
I am not following.
In the latest commits, I integrate r2 extension into qiling core again, but as cached_property. To demonstrate the advantage of introducing r2, I modify the breakpoint part of qdb, now it can set breakpoints using main+3 instead of integer addresses only.

The tests fail since libr is sitll optional dependency.
It seems I ignore the fact that r2 can only provide disasm for binary itself, but not the libraries, so it should fallback to the existing disassembler approach when disasm is not valid.
You can try to create multiple r2 instances for libraries.
From: chinggg @.> Sent: Thursday, July 28, 2022 1:47:18 PM To: qilingframework/qiling @.> Cc: lazymio @.>; Comment @.> Subject: Re: [qilingframework/qiling] feat(r2): enable resolved disassembly (PR #1196)
It seems I ignore the fact that r2 can only provide disasm for binary itself, but not the libraries, so it should fallback to the existing disassembler approach when disasm is not valid.
― Reply to this email directly, view it on GitHubhttps://github.com/qilingframework/qiling/pull/1196#issuecomment-1197692360, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AHJULOYGRV6QQKHDNDEI7O3VWINGNANCNFSM54DSYRVQ. You are receiving this because you commented.Message ID: @.***>
@wtdcode I have tried to use r2 on msvcrt.dll, it's extremely slow. I must wait for 10s+ for r2 to analyze a single library. Moreover, since the library does not provide debug info, it's also a bit useless to see things like fcn.0x40abcd. So I make r2 return the address for disasembler to continue disasm manually.
Anyway, multiple r2 instances for libraries can be supported later if it is useful.
@wtdcode I have tried to use r2 on
msvcrt.dll, it's extremely slow. I must wait for 10s+ for r2 to analyze a single library. Moreover, since the library does not provide debug info, it's also a bit useless to see things likefcn.0x40abcd. So I make r2 return the address for disasembler to continue disasm manually.
fcn.0x40abcd is also useful in many cases instead of a numeric address, say you are moving it as a function pointer:
mov rax, fcn.0x40abcd ; we know it's a function
instead of
mov rax, 0x40abcd ; what's this magic value??
also this would be helpful if we would like a CFG.
For the load time, we could
- analyze less on Windows, say
aainstead ofaaaa - I'm not sure how much time it takes to load
msvcrt.dllinto unicorn. If that already takes a long time, I don't think we bother a bit longer. - If the time is indeed too long for users, we can disable this feature on Windows emulation by default but enable it on other systems.
Note that you cannot analyze the binaries and libraries from disk, because they might have been patched by Qiling (ql.patch). You should communicate a patched version to radare2.
@wtdcode @elicn Thanks for mentioning multiple r2 instances and patched binaries. I think we can solve it later but not here, since we really have to take many things into consideration, such as code organization and consistency, handling of extremely slow corner cases (aa libc.so takes me 10s+ on my PC, crafted binaries can even hang for hours) and showcase examples.
This PR focuses on enhanced disassembly, and I want to get it merged quickly. Since it contains not only feature enhancement, but also bug fixes. For example, I didn't realize the misused@staticmethod can cause exception since it works well in Python 3.10, see fix in https://github.com/qilingframework/qiling/pull/1196/commits/6130dd37baad9f0808b20d88401733f5d44b5222
Now QlArchUtils.disassembler is smoothly enhanced by r2, you can test it with hello_r2 example, it should print same amount of disassembly. So I think time is ripe to make r2 required dependency, since core.py will always import r2 now.
If you think these small commits are confusing, I can squash them into some major commits.
Wait. Let's not jump the gun here. There are few considerations I would like us all to take into account:
- Implementing an extension is one thing, and integrate that into core and make it a dependency is totally different. Qiling does not widely rely on radare2 to perform analysis (yet). Moving toward this direction would require a lot of thinking and design decisions. Until we do, there is no justification to make radare2 an integral dependency for Qiling.
- Though the enhanced disassembly is a cool feature, I don't think it is ready. There are too many "corner cases" which fail the analysis. Patched binaries are one, but self-modifying code and running code from stack (like every other exploit out there does) would also fail. This is exactly where static meets dynamic, and we have to come up with the right design to work around that.
- Qiling has too many "half baked" features that somebody introduced to the project once, and then left them there. Most of my efforts are dedicated to fixing [half-]broken implementations rather than introducing new features, so I must say I am very much against introducing another half-working feature (no matter how cool it is). Taking the time to think and design a robust solution is crucial.
All that being said, my opinion is that the radare2-based analysis should appear as an extension until we figure out how to leverage it properly. People would still be able to develop features on top of that extension (e.g. the enhanced disassembly) without turning it into a core feature. Once we figure out how to harness radare2 analysis and work around the issues, we would make it a core feature.
Wait. Let's not jump the gun here. There are few considerations I would like us all to take into account:
- Implementing an extension is one thing, and integrate that into core and make it a dependency is totally different. Qiling does not widely rely on radare2 to perform analysis (yet). Moving toward this direction would require a lot of thinking and design decisions. Until we do, there is no justification to make radare2 an integral dependency for Qiling.
The final goal of the project is indeed to make r2 being used fluently in Qiling. Say, we no long rely on various magic addresses or offsets, instead we will have a good reference system (provided by r2).
- Though the enhanced disassembly is a cool feature, I don't think it is ready. There are too many "corner cases" which fail the analysis. Patched binaries are one, but self-modifying code and running code from stack (like every other exploit out there does) would also fail. This is exactly where static meets dynamic, and we have to come up with the right design to work around that.
I agree with this point. @chinggg Last time I talked about this, maybe we really have to be able to sync r2 memory with Qiling (Unicorn) memory somehow. Pancake also once talked about it with me, a clean way is to write a new r_io_plugin for unicorn using the uc_mem_map_ptr. But for now, I think manually sync the memory is an acceptable choice.
- Qiling has too many "half baked" features that somebody introduced to the project once, and then left them there. Most of my efforts are dedicated to fixing [half-]broken implementations rather than introducing new features, so I must say I am very much against introducing another half-working feature (no matter how cool it is). Taking the time to think and design a robust solution is crucial.
All that being said, my opinion is that the radare2-based analysis should appear as an extension until we figure out how to leverage it properly. People would still be able to develop features on top of that extension (e.g. the enhanced disassembly) without turning it into a core feature. Once we figure out how to harness radare2 analysis and work around the issues, we would make it a core feature.
I agreed with this statement, but not fully. We should be careful about new features indeed but there will be tradeoff. Being an extension means not every user could notice and use it and then provide enough feedback. While introducing r2 means a very big conceptional change (remember we now have CFG, we can use flags instead of magic addresses), this needs more feedback and being an extension make it hard to understand and use.
I agree that we really need well-design decisions when integrating r2 into qiling. As @wtdcode mentioned, syncing r2 memory with Qiling (Unicorn) memory is important for building our flag-address system, maybe we need to manipulate r2 binary manually.
Qiling has too many "half baked" features that somebody introduced to the project once, and then left them there. Most of my efforts are dedicated to fixing [half-]broken implementations rather than introducing new features
Thanks for your efforts to keep the project robust and healthy! Maybe qdb is just one of the "half baked" features. I even wonder why it is not an external extension :joy_cat: Now I have made some commits to enhance qdb by utilizing flag-address ability provided by r2. Can you imagine that qdb only supports b 0x400120 but not b main+3? But with r2 integrated, the latter can be implemented with just a few lines of code! (I admit that's not always the case, we will pay for the lack of caution). This can be viewed as an example of resurrection that happened on other part of qiling after introducing r2. It's not easy to do so if r2 is not part of qiling.
agreed with this statement, but not fully. We should be careful about new features indeed but there will be tradeoff. Being an extension means not every user could notice and use it and then provide enough feedback. While introducing r2 means a very big conceptional change (remember we now have CFG, we can use flags instead of magic addresses), this needs more feedback and being an extension make it hard to understand and use.
radare2 support is still an experimental feature, and as such it should stay out of core. Making it an extension would be a great solution: it's there for everyone to experiment with, but also can be avoided if someone doesn't feel experimental and prefer stability.
You two try to convience me that radare2 analysis is needed and has great potential, and I'm not sure why: I already said it myself. I don't think we shouldn't have it, all I'm saying is that it is curently immature and needs to go through more thinking and deveopment before we can integrate radare2 analysis into Qiling.
For now the radare2-based analysis is a nice PoC; not something we can integrate to core.
Atfer https://github.com/qilingframework/qiling/pull/1196/commits/bc516a9d8a330cb176e0f5c8e09901f4d1bdd062, r2 is sill optional dependency and partially integrated into qiling core only if r2libr is installed. Users can use qiling as usual if they don't install r2libr. Please let me know if your concern is code organization or try...except performance when r2 is not installed.
agreed with this statement, but not fully. We should be careful about new features indeed but there will be tradeoff. Being an extension means not every user could notice and use it and then provide enough feedback. While introducing r2 means a very big conceptional change (remember we now have CFG, we can use flags instead of magic addresses), this needs more feedback and being an extension make it hard to understand and use.
radare2 support is still an experimental feature, and as such it should stay out of core. Making it an extension would be a great solution: it's there for everyone to experiment with, but also can be avoided if someone doesn't feel experimental and prefer stability.
You two try to convience me that radare2 analysis is needed and has great potential, and I'm not sure why: I already said it myself. I don't think we shouldn't have it, all I'm saying is that it is curently immature and needs to go through more thinking and deveopment before we can integrate radare2 analysis into Qiling.
For now the radare2-based analysis is a nice PoC; not something we can integrate to core.
Okay, make sense and I think we are sync-ed here. The latest commit will detect the ImportError which I think is good to go.
To keep a clean code, this feature needs to be completely detached from core - just like other experimental features, that are not plugged-in to core at all. To let it take over the disassemble method, the module can monkeypatch the utils module once it is imported, instead of try / catch each and every assembly instruction.
Already detach r2 extension from qiling core, r2.disassembler now monkey patch QlArchUtils.disassembler
That looks better, thanks.
My suggestion to monkeypatch the disassemble method meant to replace the default disasembly hook that is set when the user sets the verbosity level to QL_VERBOSE.DISASM. That could have take place when the new r2 module imported, and provide a transparent way to go with the new feature.
Anyway, the current method is fine too.
Since we have approved the current change, can we merge this PR and move fast for further work like backtrace, xref, CFG and memory consistency?
@wtdcode, do you approve it with the latest chances?
@wtdcode, do you approve it with the latest chances?
Sorry too busy these days, will check today later.
@wtdcode You approve the changes and close this PR, so you want me to work on a dedicated branch and submit a big PR?
What happened..? I am wondering the same thing..
@chinggg @elicn My bad, I intended to merge but close it wrongly. Sorry to busy these days.