claripy
claripy copied to clipboard
Fix ast/base.py hash issues
This fixes multiple issues:
- Non-integer (and non-
str)._hashvalues for ASTs. Notice how https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L226 can propagate to https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L239 Fundamentally, if._hashis never used as a hash of the AST, this isn't an issue; i.e. if onlyhash(my_ast)is used it is ok because we do convert in that function on demand. The problem is, that._hashis used as the hash. - Differing behavior if annotations exist / do not. At this if statement https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L218 notice we have special cases for floats like fixing the problem that
hash(0.0) == hash(-0.0)here: https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L221 But this only happens ifnot annotations. - Fix anything that uses
_calc_hashto determine what key to enter into a hash cache for looking up existing ASTs; since for some cases our cache uses the tuples made via our fast-path here: https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L218 Ex: https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L253 - Reduce code duplication between
Base.__new__andBase.__init_with_annotations__. - Fix a bug in
Base.__init_with_annotations__where we didn't use the leaf cache at all and instead were looking for leaves in the non-leaf cache: https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L251 - Non-reproducible hashes; since
self._hashwas not always calculated via_calc_hashbut sometimes as a tuple (see point 1). - When deserializing via
base.py's_dfunction https://github.com/angr/claripy/blob/51d17532025b87fd7d6eafa6ca87fa23df4b800d/claripy/ast/base.py#L58 this bypasses the leaf cache and incorrectly stores the leaves in the non-leaf cache. - Serialization is incomplete; it looses information such as
+0.0vs-0.0; thus deserialization may be incorrect. - (Unknown if applicable) If
_dalways takes an integer (orstr) as a hash, deserialization will lead to ASTs with corrupt_hashs because of point 1.
Before merging:
- [ ] Add fastpath for annotation-free children (this was removed temporarily to get hashing work first)
Linked: https://github.com/angr/binaries/pull/101
Unit Test Results
94 files + 84 94 suites +84 1h 49m 17s :stopwatch: + 1h 48m 49s 1 436 tests +1 130 1 342 :heavy_check_mark: +1 096 90 :zzz: +30 0 :x: ±0 4 :fire: +4 1 442 runs +1 136 1 348 :heavy_check_mark: +1 102 90 :zzz: +30 0 :x: ±0 4 :fire: +4
For more details on these errors, see this check.
Results for commit 17f91514. ± Comparison against base commit adbf300b.
:recycle: This comment has been updated with latest results.
Ping @ltfish Regenerate all caches please
The reason why you couldn't regenerate caches locally: https://github.com/angr/claripy/blob/fix/hash-fail/claripy/ast/base.py#L250
Can you fix it (by removing the breakpoint() call) and then try generating caches across all projects again?
ROP caches updated. You will want to rebase this branch on master to re-trigger the CI.
@ltfish The same issues I was seeing before still seem to occur in the CI.
@zwimer what issue?
The four decompiler test cases fail because the fix/hash-fail branch in angr/binaries misses some new binaries that I pushed over the weekend. So they are totally expected.
You should only pay attention to the "push" tests since "PR" tests do not synchronize repos based on the same branch name.
@ltfish The push CI is failing too: https://github.com/angr/claripy/actions/runs/3841788720/jobs/6576714321
@ltfish The push CI is failing too: https://github.com/angr/claripy/actions/runs/3841788720/jobs/6576714321
That's exactly what my reply was about.