dissect.cstruct
Eagerly resolve types and set as attributes on the cstruct object
Currently, every type access (e.g. `cs.uint8`) goes through `__getattr__` and, unless the name is a constant, through `cstruct.resolve()`. This is slow compared to a plain instance attribute lookup. We should look into whether there is any reason we can't resolve types eagerly and set them as instance attributes on the `cstruct` object. This would make type accesses a lot faster.
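To make the difference concrete, here is a minimal sketch of the two lookup strategies. It is a toy model, not the actual dissect.cstruct code: the class names `LazyTypes` and `EagerTypes`, the `add_type` method, and the `UInt8` placeholder are all hypothetical, and the loop in `resolve()` is only an assumption about how alias chains are followed.

```python
class LazyTypes:
    """Toy model: every type access falls through to __getattr__."""

    def __init__(self):
        self.consts = {}
        self.typedefs = {}

    def add_type(self, name, target):
        # target is either a concrete type object or the name of another type
        self.typedefs[name] = target

    def resolve(self, name):
        # Follow string aliases until a concrete (non-string) type is found.
        type_name = name
        for _ in range(10):
            if type_name not in self.typedefs:
                raise KeyError(f"unknown type {type_name!r}")
            type_name = self.typedefs[type_name]
            if not isinstance(type_name, str):
                return type_name
        raise RecursionError(f"alias chain too deep for {name!r}")

    def __getattr__(self, name):
        # Python only calls this after normal attribute lookup has failed.
        if name in self.consts:
            return self.consts[name]
        if name in self.typedefs:
            return self.resolve(name)
        raise AttributeError(name)


class EagerTypes(LazyTypes):
    """Toy model: resolve once at definition time, store on the instance."""

    def add_type(self, name, target):
        super().add_type(name, target)
        # The resolved object lands in the instance __dict__, so later
        # accesses are plain attribute lookups and never reach __getattr__.
        setattr(self, name, self.resolve(name))


class UInt8:  # hypothetical placeholder for a real type object
    pass


lazy, eager = LazyTypes(), EagerTypes()
for registry in (lazy, eager):
    registry.add_type("uint8", UInt8)
    registry.add_type("BYTE", "uint8")  # alias that needs one extra hop
```

Both objects return the same type for `BYTE`, but only the eager one holds it in its instance `__dict__`. The cost of the eager variant is that the stored attribute will not follow a typedef that is changed later.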
Random thoughts:
- There may be a specific reason we do dynamic resolution with `.resolve()`, but how big is that use case? For example, I suppose it would allow changing the typedef of a field and having that resolve dynamically at read time, but this is already not possible with compiled structures (where we resolve types at compile time). To be fair, that use case is currently still possible if you opt your structure out of compilation.
- Maybe we should only do it for specific kinds of definitions, for example `enum`, `flag`, `struct` and constant definitions, and catch `typedef` with the existing `__getattr__`. That way we still allow dynamic type changes, but not for things that are supposed to be static. Since dynamic typedef'ing would be an advanced topic anyway, the intended "workaround" would be to use `typedef` and use the `typedef`'d name instead of the `struct` name. "Performance oriented code" could then use the raw `struct` and `enum` names for faster access (properly written code utilising cstruct already does this to keep the loop count in `.resolve()` as low as possible).
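The second idea could look roughly like the sketch below. Again this is a toy model under stated assumptions, not the library's implementation: `HybridTypes`, `add_static`, `add_typedef`, `StructA` and `StructB` are hypothetical names chosen for illustration.

```python
class HybridTypes:
    """Toy model: static definitions are pinned, typedefs stay dynamic."""

    def __init__(self):
        self.typedefs = {}

    def add_static(self, name, obj):
        # struct / enum / flag / constant: also set as an instance attribute,
        # so access is a plain attribute lookup.
        self.typedefs[name] = obj
        setattr(self, name, obj)

    def add_typedef(self, alias, target_name):
        # typedef: only recorded by name, resolved again on every access.
        self.typedefs[alias] = target_name

    def resolve(self, name):
        # Follow string aliases until a concrete (non-string) object is found.
        type_name = name
        for _ in range(10):
            if type_name not in self.typedefs:
                raise KeyError(f"unknown type {type_name!r}")
            type_name = self.typedefs[type_name]
            if not isinstance(type_name, str):
                return type_name
        raise RecursionError(f"alias chain too deep for {name!r}")

    def __getattr__(self, name):
        # Only reached for names that are not instance attributes (typedefs).
        if name in self.typedefs:
            return self.resolve(name)
        raise AttributeError(name)


class StructA:  # hypothetical placeholders for parsed struct types
    pass


class StructB:
    pass


types = HybridTypes()
types.add_static("test", StructA)
types.add_static("other", StructB)

types.add_typedef("alias", "test")
first = types.alias            # resolved dynamically to StructA

types.add_typedef("alias", "other")
second = types.alias           # the same name now resolves to StructB
```

Here `test` and `other` are fast, fixed attributes, while `alias` can still be re-pointed at runtime because it is never stored on the instance.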
Some micro benchmarks:
```python
from dissect.cstruct import cstruct

cdef = """
#define X 512

enum MyEnum {
    A,
    B,
    C
};

struct test {
    uint32 a;
};
"""

cs = cstruct()
cs.load(cdef)
```
Before:
```
In: %timeit getattr(t.cs, "X")
58.1 ns ± 1.04 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In: %timeit getattr(t.cs, "MyEnum")
227 ns ± 0.912 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In: %timeit getattr(t.cs, "test")
219 ns ± 1.47 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```
After:
```
In: %timeit getattr(t.cs, "X")
31.2 ns ± 0.0331 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In: %timeit getattr(t.cs, "MyEnum")
33.1 ns ± 0.0884 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In: %timeit getattr(t.cs, "test")
31.3 ns ± 0.0571 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
```
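For a quick read of the mean timings above, the speedup per name can be worked out as follows. The numbers are copied from the Before and After runs; the variable names are only illustrative.

```python
# (before_ns, after_ns) mean timings copied from the runs above
timings_ns = {
    "X": (58.1, 31.2),
    "MyEnum": (227.0, 33.1),
    "test": (219.0, 31.3),
}

# Speedup factor = time before / time after, rounded to two decimals
speedups = {
    name: round(before / after, 2)
    for name, (before, after) in timings_ns.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x faster")
# X: 1.86x faster
# MyEnum: 6.86x faster
# test: 7.00x faster
```

The constant was already relatively cheap, so it gains the least. The enum and struct accesses, which previously went through `.resolve()`, end up at roughly the same cost as the constant.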