Prototype for an analysis interpreter.
Your checklist for this pull request
- [ ] I've read the guidelines for contributing to this repository.
- [ ] I made sure to follow the project's coding style.
- [ ] I've documented every `RZ_API` function and struct this PR changes.
- [ ] I've added tests that prove my changes are effective (required for changes to `RZ_API`).
- [ ] I've updated the Rizin book with the relevant information (if needed).
- [ ] I've used AI tools to generate fully or partially these code changes and I'm sure the changes are not copyrighted by somebody else.
For review start
- The main interpretation loop is in `rz_interpreter_run`. `interpreter_prototype.c` is the plugin. `rz_inquiry_interpreter` gives an idea of what the whole setup and API usage of an interpreter looks like.
Notes
Based on: https://github.com/rizinorg/rizin/pull/5508
Current design
This PR is only there to discuss design decisions for now.
There is no cache or anything, just two threads exchanging messages as follows:
┌────────────┐                    ┌────────────┐
│            │   Request IL op    │            │
│            │◄───────────────────┤            │
│            │                    │            │
│ RzInquiry  │     Send IL op     │ Prototype  │
│            ├───────────────────►│            │
│            │     Send Yield     │            │
│            │◄───────────────────┤            │
└────────────┘                    └────────────┘
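To make the three arrows concrete, here is a minimal single-threaded sketch of the message flow. It is not the code in this PR: `MsgKind`, `Msg`, `Mailbox` and all function names are hypothetical, the one-slot mailboxes stand in for the real thread-safe queues, and "lifting" and "evaluation" are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds mirroring the three arrows in the diagram. */
typedef enum {
	MSG_REQUEST_IL_OP, /* Prototype -> RzInquiry: ask for the IL op at addr */
	MSG_SEND_IL_OP,    /* RzInquiry -> Prototype: the requested IL op */
	MSG_SEND_YIELD,    /* Prototype -> RzInquiry: a result of interpretation */
} MsgKind;

typedef struct {
	MsgKind kind;
	uint64_t addr;    /* address the message refers to */
	uint64_t payload; /* stand-in for an IL op or a yield value */
} Msg;

/* One-slot mailbox per direction (assumption: real code uses thread-safe queues). */
typedef struct {
	Msg slot;
	bool full;
} Mailbox;

bool mailbox_put(Mailbox *mb, Msg m) {
	assert(mb);
	if (mb->full) {
		return false;
	}
	mb->slot = m;
	mb->full = true;
	return true;
}

bool mailbox_take(Mailbox *mb, Msg *out) {
	assert(mb && out);
	if (!mb->full) {
		return false;
	}
	*out = mb->slot;
	mb->full = false;
	return true;
}

/* Interpreter side: request the IL op at addr. */
bool interp_request(Mailbox *to_inquiry, uint64_t addr) {
	Msg req = { MSG_REQUEST_IL_OP, addr, 0 };
	return mailbox_put(to_inquiry, req);
}

/* RzInquiry side: answer one pending request or record one yield. */
bool inquiry_step(Mailbox *to_inquiry, Mailbox *to_interp, uint64_t *last_yield) {
	Msg m;
	if (!mailbox_take(to_inquiry, &m)) {
		return false;
	}
	if (m.kind == MSG_REQUEST_IL_OP) {
		/* Placeholder "lifting": the payload just echoes the address. */
		Msg reply = { MSG_SEND_IL_OP, m.addr, m.addr };
		return mailbox_put(to_interp, reply);
	}
	if (m.kind == MSG_SEND_YIELD) {
		*last_yield = m.payload;
		return true;
	}
	return false;
}

/* Interpreter side: turn a received IL op into a yield. */
bool interp_step(Mailbox *to_interp, Mailbox *to_inquiry) {
	Msg m;
	if (!mailbox_take(to_interp, &m) || m.kind != MSG_SEND_IL_OP) {
		return false;
	}
	/* Placeholder "evaluation": yield payload + 1. */
	Msg yield = { MSG_SEND_YIELD, m.addr, m.payload + 1 };
	return mailbox_put(to_inquiry, yield);
}
```

One round trip is `interp_request`, `inquiry_step`, `interp_step`, `inquiry_step`. In the threaded design each side would loop over its own step function and block on its queue.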
Questions to discuss
- Use abbreviation `intrpr` or `interp` instead of `interpreter` to shorten name prefixes?
- Where to implement the "get_il_bb"? In the Arch plugins? Or should it be a helper function in `RzInquiry`? Or the IL cache?
- How to design the filter of yield queues?
  - Pass function + const filter data to interpreters?
  - Filter on queue receiving end? (Probably super bad for performance).
- Configuration of interpreter.
  - Via plugin specific config? Like the hexagon module.
  - Passed during initialization?
- Should all the interpreters be their own module? `RzInquiry` is then only the binding. But can they be a sub-module of `RzInquiry`?
  - Then what kind of plugins does `RzInquiry` expose?
  - This PR has the interpreters as sub-modules of `RzInquiry`.
- ~~The `eval` could only produce the delta between the old and new state.~~ ~~This would save a lot of unnecessary copies of registers which are never touched.~~ This would mean we can never throw away any of the earlier states.
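For the "pass function + const filter data" option, a rough sketch of what the interpreter could receive is shown below. Everything in it is hypothetical (`Yield`, `YieldFilterFn`, `YieldFilter`, `ValueRange` and both functions); the PR does not define these yet. The idea is that the filter runs on the sending side, so rejected yields never enter the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical yield record; the real yield type is still open. */
typedef struct {
	uint64_t addr;  /* address of the instruction that produced the yield */
	uint64_t value; /* yielded value, e.g. a computed memory address */
} Yield;

/* Returns true if the yield should be sent. The data is const, so it can
 * be shared with the interpreter thread without locking. */
typedef bool (*YieldFilterFn)(const Yield *y, const void *filter_data);

typedef struct {
	YieldFilterFn fn;        /* NULL means "accept everything" */
	const void *filter_data; /* owned by the caller, never modified */
} YieldFilter;

/* Example filter data: only yields with value in [lo, hi) pass. */
typedef struct {
	uint64_t lo;
	uint64_t hi;
} ValueRange;

bool yield_filter_value_range(const Yield *y, const void *filter_data) {
	const ValueRange *r = filter_data;
	return y->value >= r->lo && y->value < r->hi;
}

/* Sender-side filtering: copies accepted yields to out, returns their count. */
size_t yield_filter_apply(const YieldFilter *f, const Yield *in, size_t n, Yield *out) {
	assert(f && in && out);
	size_t kept = 0;
	for (size_t i = 0; i < n; i++) {
		if (!f->fn || f->fn(&in[i], f->filter_data)) {
			out[kept++] = in[i];
		}
	}
	return kept;
}
```

Filtering on the receiving end would call the same callback after dequeuing, which pays the queue cost for every yield that is later dropped.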
TODO
- [ ] Send/receive queue sizes have to be configurable.
- [ ] Add a queue for simple `ut64` values. Currently it only takes `void *` and we can't cast if we ever want to support 32-bit systems.
- [ ] We need a way to distinguish between an unimplemented IL op and a failure. Currently both return `NULL`. This is a problem because for interpretation we might just want to stop if there is an error, but `NOP` all unimplemented ones.
  - It would be nice to decide how many NOPs are introduced, since it can screw the results. Maybe the chance of sampling them can be set? And a seed for the sampling can be provided as well? So users have control.
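One possible shape for the last item, as a sketch only: a status code instead of `NULL`, plus a seeded policy that decides whether an unimplemented op becomes a NOP. All names (`EvalStatus`, `EvalAction`, `NopPolicy`, `eval_decide`) are hypothetical. It also assumes that an unimplemented op which is not sampled as a NOP stops interpretation, which is a design choice this PR has not made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes replacing the ambiguous NULL return. */
typedef enum {
	EVAL_OK,
	EVAL_UNIMPLEMENTED, /* op is valid but has no handler yet */
	EVAL_ERROR,         /* evaluation failed */
} EvalStatus;

typedef enum {
	ACTION_CONTINUE, /* op evaluated normally */
	ACTION_NOP,      /* skip the op and carry on */
	ACTION_STOP,     /* abort interpretation */
} EvalAction;

/* User-controlled policy for unimplemented ops. */
typedef struct {
	uint32_t nop_chance_percent; /* 0..100 */
	uint64_t rng_state;          /* seed; must be non-zero for xorshift */
} NopPolicy;

/* xorshift64: small deterministic PRNG, so a given seed reproduces a run. */
uint64_t nop_policy_next(NopPolicy *p) {
	uint64_t x = p->rng_state;
	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	p->rng_state = x;
	return x;
}

EvalAction eval_decide(EvalStatus status, NopPolicy *p) {
	assert(p && p->rng_state != 0 && p->nop_chance_percent <= 100);
	switch (status) {
	case EVAL_OK:
		return ACTION_CONTINUE;
	case EVAL_UNIMPLEMENTED:
		/* Assumption: ops not sampled as NOP stop the run. */
		return (nop_policy_next(p) % 100) < p->nop_chance_percent ? ACTION_NOP : ACTION_STOP;
	case EVAL_ERROR:
	default:
		return ACTION_STOP;
	}
}
```

With `nop_chance_percent` set to 100 every unimplemented op is skipped, with 0 none are, and two policies with the same seed make the same sequence of decisions.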
Performance
- The register file in an interpreter should have O(1) access, must not allocate, and should not run any hash functions. If we aim for performance, our hash map implementation is too slow IMO, just by the number of instructions executed (it should still be O(1)). Instead it would be nice to have all registers concatenated as one flat array, indexed by a register number.
- We could use a performance-friendly bit vector extension. Currently each operation on two bitvectors allocates a new one. To save the allocations we could have operations like these:

      // Does an in-place addition of the form a += b.
      void rz_bv_add_inplace(RzBitVector *a, RzBitVector *b);

- Add a queue implementation which works with shared memory instead of lists. This would save allocations as well.
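The first two points could look roughly like the sketch below. It is not the Rizin API: `RegFile`, `regfile_get` and `words_add_inplace` are hypothetical names, and it assumes every register has the same fixed width (real register files would need a per-register offset table). Lookup is a single multiplication, nothing is allocated, and the addition writes its result into the destination register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_COUNT 4
#define REG_WORDS 2 /* assumption: every register is 128 bits wide */

/* All registers concatenated in one flat array; no allocation, no hashing. */
typedef struct {
	uint64_t words[REG_COUNT * REG_WORDS];
} RegFile;

/* O(1) lookup: the register number is the index.
 * Words are little-endian (word 0 is the least significant). */
uint64_t *regfile_get(RegFile *rf, size_t reg_num) {
	assert(rf && reg_num < REG_COUNT);
	return &rf->words[reg_num * REG_WORDS];
}

/* In-place a += b over n 64-bit words, wrapping modulo 2^(64*n). */
void words_add_inplace(uint64_t *a, const uint64_t *b, size_t n) {
	uint64_t carry = 0;
	for (size_t i = 0; i < n; i++) {
		uint64_t sum = a[i] + b[i];
		uint64_t c1 = sum < a[i]; /* carry out of a[i] + b[i] */
		uint64_t res = sum + carry;
		uint64_t c2 = res < sum; /* carry out of adding the previous carry */
		a[i] = res;
		carry = c1 | c2;
	}
}
```

An `add r0, r1` then becomes `words_add_inplace(regfile_get(&rf, 0), regfile_get(&rf, 1), REG_WORDS)` with no temporary bit vector.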
To Cherry-Pick into other PR
- [ ] Pre-compute DJB2 hash of VAR name for internal hash map key usage: https://github.com/rizinorg/rizin/pull/5505/commits/282c3aae2113da3ad700d6b95ab069ca2e1026ec
- [ ] Bitvector in-place: https://github.com/rizinorg/rizin/pull/5569
Test plan
...
Closing issues
...