Prototype for an analysis interpreter.

Open Rot127 opened this issue 1 month ago • 1 comments

Your checklist for this pull request

  • [ ] I've read the guidelines for contributing to this repository.
  • [ ] I made sure to follow the project's coding style.
  • [ ] I've documented every RZ_API function and struct this PR changes.
  • [ ] I've added tests that prove my changes are effective (required for changes to RZ_API).
  • [ ] I've updated the Rizin book with the relevant information (if needed).
  • [ ] I've used AI tools to generate fully or partially these code changes and I'm sure the changes are not copyrighted by somebody else.

For review start

  • Main interpretation loop is in rz_interpreter_run.
  • interpreter_prototype.c is the plugin.
  • rz_inquiry_interpreter gives an idea of what the whole setup and API usage of an interpreter looks like.

Notes

Based on: https://github.com/rizinorg/rizin/pull/5508

Current design

This PR is only there to discuss design decisions for now.

There is no cache or anything. Just two threads exchanging these messages:

┌────────────┐                    ┌────────────┐
│            │  Request IL op     │            │
│            │◄───────────────────┤            │
│            │                    │            │
│ RzInquiry  │  Send IL op        │ Prototype  │
│            ├───────────────────►│            │
│            │  Send Yield        │            │
│            │◄───────────────────┤            │
└────────────┘                    └────────────┘
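
To make the three arrows concrete, here is a minimal single-threaded sketch of the exchange. Everything in it is an assumption for illustration: the names `MsgKind`, `Msg`, `inquiry_handle` and `run_exchange` are hypothetical and not part of this PR, direct function calls stand in for the two threads and their queues, and a fixed 4-byte instruction width is assumed only to generate addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds, one per arrow in the diagram. */
typedef enum {
	MSG_REQUEST_IL_OP, /* Prototype -> RzInquiry */
	MSG_SEND_IL_OP,    /* RzInquiry -> Prototype */
	MSG_SEND_YIELD,    /* Prototype -> RzInquiry */
} MsgKind;

typedef struct {
	MsgKind kind;
	uint64_t addr;       /* address the message refers to */
	const void *payload; /* IL op or yield data; opaque in this sketch */
} Msg;

/* RzInquiry side: answers a request with an IL op and counts yields.
 * Returns true if a reply was written to `reply`. */
static bool inquiry_handle(const Msg *in, Msg *reply, size_t *yields_seen) {
	switch (in->kind) {
	case MSG_REQUEST_IL_OP:
		reply->kind = MSG_SEND_IL_OP;
		reply->addr = in->addr;
		reply->payload = "il-op-placeholder"; /* stand-in for a lifted IL op */
		return true;
	case MSG_SEND_YIELD:
		(*yields_seen)++;
		return false;
	default:
		return false; /* RzInquiry never receives MSG_SEND_IL_OP */
	}
}

/* Prototype side: request an op, "evaluate" it, send a yield back.
 * Returns how many yields RzInquiry received. */
static size_t run_exchange(uint64_t start, size_t n_ops) {
	size_t yields_seen = 0;
	for (size_t i = 0; i < n_ops; i++) {
		Msg req = { MSG_REQUEST_IL_OP, start + 4 * i, NULL };
		Msg op;
		if (!inquiry_handle(&req, &op, &yields_seen)) {
			break;
		}
		assert(op.kind == MSG_SEND_IL_OP);
		Msg yield = { MSG_SEND_YIELD, op.addr, op.payload };
		Msg unused;
		inquiry_handle(&yield, &unused, &yields_seen);
	}
	return yields_seen;
}
```

In the actual design each call to `inquiry_handle` would be replaced by a push to one queue and a pop from the other, with each side running in its own thread.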

Questions to discuss

  • Use abbreviation intrpr or interp instead of interpreter to shorten name prefixes?
  • Where to implement the "get_il_bb"? In the Arch plugins? Or should it be a helper function in RzInquiry? Or the IL cache?
  • How to design the filter of yield queues?
    • Pass function + const filter data to interpreters?
    • Filter on queue receiving end? (Probably super bad for performance).
  • Configuration of interpreter.
    • Via plugin specific config? Like the hexagon module.
    • Passed during initialization?
  • Should all the interpreters be their own module? RzInquiry is then only the binding. But can they be a sub-module of RzInquiry?
    • Then what kind of plugins does RzInquiry expose?
    • This PR has the interpreters as sub-modules of RzInquiry.
  • ~~The eval could only produce the delta between the old and new state.~~ ~~This would save a lot of unnecessary copies of registers which are never touched.~~ This would mean we can never throw away any of the earlier states.
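
For the yield-filter question, the "pass function + const filter data to interpreters" option could look roughly like the sketch below. All names (`Yield`, `YieldFilter`, `AddrRange`, `emit_yields`) are hypothetical, the yield record layout is an assumption, and a plain output array stands in for the yield queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical yield record; the real layout is not defined in this PR. */
typedef struct {
	uint64_t addr; /* address the yield was produced at */
	uint32_t kind; /* yield type tag */
} Yield;

/* Filter callback: returns true if the yield should be enqueued.
 * The filter data is const so it can be shared read-only between threads. */
typedef bool (*YieldFilter)(const Yield *y, const void *filter_data);

typedef struct {
	uint64_t lo; /* inclusive */
	uint64_t hi; /* exclusive */
} AddrRange;

/* Example filter: keep only yields inside an address range. */
static bool filter_addr_range(const Yield *y, const void *filter_data) {
	const AddrRange *r = filter_data;
	return y->addr >= r->lo && y->addr < r->hi;
}

/* Sender side: apply the filter before anything reaches the queue.
 * `out` stands in for the yield queue. Returns the number of yields kept. */
static size_t emit_yields(const Yield *in, size_t n, YieldFilter filter,
	const void *filter_data, Yield *out, size_t out_cap) {
	assert(in && out);
	size_t n_out = 0;
	for (size_t i = 0; i < n && n_out < out_cap; i++) {
		if (!filter || filter(&in[i], filter_data)) {
			out[n_out++] = in[i];
		}
	}
	return n_out;
}
```

Filtering on the sending side means rejected yields never cost a queue slot or a wake-up of the receiver, which is the performance concern raised for the receiving-end alternative.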

TODO

  • [ ] Send/receive queue sizes have to be configurable.
  • [ ] Add a queue for simple ut64 values. Currently it only takes void *, and we can't cast a ut64 to a pointer if we ever want to support 32-bit systems.
  • [ ] We need a way to distinguish between an unimplemented IL op and a failure. Currently both return NULL. This is a problem because during interpretation we might want to stop on an error, but NOP all unimplemented ops.
    • It would be nice to control how many NOPs are introduced, since they can skew the results. Maybe the chance of sampling them can be set? And a seed for the sampling can be provided as well? So users have control.
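
One possible shape for the last TODO item is sketched below: an explicit status code instead of NULL, plus a seeded policy that decides whether an unimplemented op becomes a NOP. This is only one reading of the sampling idea. The names `EvalStatus`, `Action`, `NopPolicy`, `nop_policy_init` and `decide` are hypothetical, and the per-mille chance and the xorshift64 generator are assumptions chosen to keep the sketch deterministic and allocation-free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes replacing the ambiguous NULL return. */
typedef enum {
	EVAL_OK,
	EVAL_UNIMPLEMENTED, /* IL op has no handler yet */
	EVAL_FAILURE,       /* handler ran and failed */
} EvalStatus;

typedef enum { ACT_CONTINUE, ACT_NOP, ACT_STOP } Action;

typedef struct {
	uint64_t rng_state;     /* xorshift state, must be non-zero */
	uint32_t nop_per_mille; /* 0..1000: chance an unimplemented op is NOPed */
} NopPolicy;

/* Marsaglia's xorshift64: small, fast, fully determined by the seed. */
static uint64_t xorshift64(uint64_t *s) {
	uint64_t x = *s;
	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *s = x;
}

static void nop_policy_init(NopPolicy *p, uint64_t seed, uint32_t nop_per_mille) {
	assert(p);
	p->rng_state = seed ? seed : 0x9E3779B97F4A7C15ULL; /* avoid the all-zero state */
	p->nop_per_mille = nop_per_mille > 1000 ? 1000 : nop_per_mille;
}

/* Failures always stop; unimplemented ops are NOPed with the configured
 * chance and stop the run otherwise. */
static Action decide(NopPolicy *p, EvalStatus st) {
	switch (st) {
	case EVAL_OK:
		return ACT_CONTINUE;
	case EVAL_UNIMPLEMENTED:
		return (xorshift64(&p->rng_state) % 1000) < p->nop_per_mille ? ACT_NOP : ACT_STOP;
	case EVAL_FAILURE:
		break;
	}
	return ACT_STOP;
}
```

Because the generator state lives in the policy struct, two runs with the same seed and the same sequence of unimplemented ops make identical decisions, which keeps results reproducible.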

Performance

  • The register file in an interpreter should offer O(1) access without allocating or running any hash function. If we aim for performance, our hash map implementation is too slow IMO, judging just by the number of instructions executed (even though it is still O(1)). Instead, it would be nice to have all registers concatenated into one flat array, indexed by a register number.

  • We could use a performance-friendly bit vector extension. Currently, each operation on two bitvectors allocates a new one. To save the allocations we could have operations like this:

    // Does an in place addition of the form a += b.
    void rz_bv_add_inplace(RzBitVector *a, RzBitVector *b);
    
  • Add a queue implementation that works with shared memory instead of lists. This would save allocations as well.
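
As an illustration of the flat register file from the first point above, here is a minimal sketch. The register set, sizes and all names (`REG_*`, `reg_offset`, `RegFile`, `reg_read`, `reg_write`) are made up for the example; a real plugin would derive the offset table from the architecture's register profile.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbers for a toy architecture. */
enum { REG_R0, REG_R1, REG_PC, REG_FLAGS, REG_COUNT };

/* Byte offset of each register in the flat array.
 * The extra last entry is the total size, so
 * size(reg) = reg_offset[reg + 1] - reg_offset[reg]. */
static const uint16_t reg_offset[REG_COUNT + 1] = { 0, 8, 16, 24, 25 };

typedef struct {
	uint8_t bytes[25]; /* all registers concatenated, no pointers inside */
} RegFile;

/* O(1) write: one table lookup and one memcpy, no hashing, no allocation. */
static inline void reg_write(RegFile *rf, unsigned reg, const void *src) {
	assert(reg < REG_COUNT);
	memcpy(rf->bytes + reg_offset[reg], src,
		(size_t)(reg_offset[reg + 1] - reg_offset[reg]));
}

/* O(1) read into a caller-provided buffer of at least the register size. */
static inline void reg_read(const RegFile *rf, unsigned reg, void *dst) {
	assert(reg < REG_COUNT);
	memcpy(dst, rf->bytes + reg_offset[reg],
		(size_t)(reg_offset[reg + 1] - reg_offset[reg]));
}
```

Since the whole state is one contiguous block, snapshotting it for a new interpreter state is a single `memcpy` of the struct.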

To Cherry-Pick into other PR

  • [ ] Pre-compute DJB2 hash of VAR name for internal hash map key usage: https://github.com/rizinorg/rizin/pull/5505/commits/282c3aae2113da3ad700d6b95ab069ca2e1026ec
  • [ ] Bitvector in-place: https://github.com/rizinorg/rizin/pull/5569

Test plan

...

Closing issues

...

Rot127, Nov 05 '25 11:11