claripy
claripy copied to clipboard
`Bits`' `__init__` changes `.length` after hashes are calculated
When creating a Bits AST object, the __init__ function sets self.length; in some cases this does not change anything, but in some cases the newly set length is different from the old length.
This occurs after Base's __new__ runs, and thus it might change the internals of a base, after Base calculates the hash of the object to store it inside the lookup cache.
Note that .length is a HASHCONS variable, it affects the hash.
Actually, this looks like an interaction with strings.py's __init__.
Note that here strings edits kwargs:
https://github.com/angr/claripy/blob/696f1c08c2996c5350867685462d1245070a996d/claripy/ast/strings.py#L31
Bits' self.length = kwargs['length'] would be a no-op if kwargs were not edited between Bits __init__ and Base __new__.
Suggested solution:
- Refactor all String ops not to use bit lengths rather than byte lengths.
- Remove
kwargs['length'] *= 8fromString's__init__(really this is part of 1) - Remove
self.length = kwargs['length']fromBits__init__. If this is done without doing 1 & 2 then string support will break
This can be triggered from within Base's new function via the concrete backend constructing an object, it being returned from new, then edited after __init__ runs.
Note: fixing it might require refactoring Base's __new__ method as well. Specifically, Base can call the concrete backend internally to construct an object. But Backend.call only takes in op and args. For some ops, length is not stored in either. For example, a StringV might have args ('12', 2), but length 80000- such as the result of an simplified IntToStr operation (you can find this is test_backend_smt.py).
Since Backend.call cannot accept args beyond op and args, the StringV to be constructed will have the size of .args[1] * 8 (the bit-length of the python str held inside the StringV) rather than the StringV length of .length.
Suggested fix: Make StringV and any other ops that fall into this category hold both lengths or only the .length length. Alternatively call could take in kwargs, but that feels like it is inviting abuse.
Proposed fixes:
- In
def StringVinast/strings.py, change the following line from/to: https://github.com/angr/claripy/blob/acc6e380f22fdd74af6c4839cbf8fa738af2f0fc/claripy/ast/strings.py#L154
result = String("StringV", (value, length), length=8*length, **kwargs)
- Prevent
String.__init__from changingkwargs['length']. - Prevent
Bits.__init__from changingself.length.
How this works:
- The
Stringconstructor now takes in the byte length asarg[1], which means things like theBackend.callcan use the proper length rather than the length of the python string with implicit trailing zeros ignored. Additionally, thekwargs['length']passed toBase.__new__will be in bits from the start. - No longer interact to edit the length after it's been set in
Base.__new__and hashed - Same for point 3.
What this will change:
my_string.length will be in bits; for bytes, use my_string.str_length or my_string.args[1] (both should be byte length)
Additionally, StringS must also 8* length.