
drsym_lookup_address should not return Arm mapping symbols

Open algr opened this issue 2 years ago • 3 comments

The Arm and AArch64 ABIs extend ELF with mapping symbols that distinguish code from data (and, on AArch32, Arm code from Thumb code). The symbols are generally named $a, $t, $x and $d, optionally followed by a suffix of the form .xxxx.
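To make the naming rule concrete, here is a minimal sketch of a name check, assuming only the four prefixes and the optional dot suffix described above. The helper name is_arm_mapping_symbol is hypothetical and is not part of the drsyms API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a drsyms function): returns true if NAME looks
 * like an Arm/AArch64 mapping symbol, i.e. "$a", "$t", "$x" or "$d",
 * either alone or followed by a ".suffix".
 */
static bool
is_arm_mapping_symbol(const char *name)
{
    if (name == NULL || name[0] != '$' || name[1] == '\0')
        return false;
    /* The second character selects the kind of region being marked. */
    if (strchr("atxd", name[1]) == NULL)
        return false;
    /* Either the name ends here or a '.' introduces an arbitrary suffix. */
    return name[2] == '\0' || name[2] == '.';
}
```

With this check, "$x" and "$d.12" count as mapping symbols, while an ordinary name such as "main" or a longer name such as "$abc" does not.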

What we are seeing is that diagnostics that call drsym_lookup_address() to find the function containing some address of interest sometimes find mapping symbols instead (on Arm/Arm64). The function should ignore mapping symbols, which would give consistent behavior across Arm and non-Arm architectures.

The mapping symbols might be useful for a query like "is this address Arm, Thumb, Arm64 or a literal pool?" but that's a different sort of query.
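As a rough illustration of that other sort of query, the sketch below maps a mapping symbol name to the kind of region it marks. The enum and function names are made up for this example and do not exist in DynamoRIO; a real query would also need to find the nearest preceding mapping symbol for the address.

```c
#include <stddef.h>

/* Hypothetical region kinds, for illustration only. */
typedef enum {
    REGION_UNKNOWN, /* not a mapping symbol */
    REGION_ARM,     /* "$a": A32 (Arm) code */
    REGION_THUMB,   /* "$t": T32 (Thumb) code */
    REGION_A64,     /* "$x": A64 code */
    REGION_DATA     /* "$d": data, e.g. a literal pool */
} region_kind_t;

/* Classifies a symbol name; anything that is not "$a", "$t", "$x" or "$d"
 * (optionally followed by ".suffix") yields REGION_UNKNOWN.
 */
static region_kind_t
mapping_symbol_kind(const char *name)
{
    if (name == NULL || name[0] != '$' || name[1] == '\0')
        return REGION_UNKNOWN;
    if (name[2] != '\0' && name[2] != '.')
        return REGION_UNKNOWN;
    switch (name[1]) {
    case 'a': return REGION_ARM;
    case 't': return REGION_THUMB;
    case 'x': return REGION_A64;
    case 'd': return REGION_DATA;
    default: return REGION_UNKNOWN;
    }
}
```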

Basically, the mapping symbols aren't proper symbols. They are there for technical reasons, because symbols were a convenient way to mark places in ELF; they could equally well have been relocations.

Seen on 9.0.

algr avatar Sep 10 '22 17:09 algr

Are these mapping symbols really rare? I don't see any in our libraries:

~/dr/build$ for i in $(find . -name \*.so); do readelf -s $i | grep '\$'; done

Do other tools skip them, like binutils readelf -s or nm? drsym_lookup_address for ELF is just looking through .symtab. Is there precedent for other tools that look at .symtab skipping these symbols?

derekbruening avatar Sep 12 '22 16:09 derekbruening

They are often stripped from executables and shared libraries using --strip-unneeded, but they don't have to be.

binutils has numerous places where it treats mapping symbols specially, e.g. see aarch64_symbol_is_valid() which returns false for mapping symbols. We wouldn't expect something like addr2line to resolve an address to a mapping symbol.
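For illustration, here is a minimal sketch of an address lookup that skips mapping symbols in the same spirit. It is not the binutils code and not the drsyms implementation: the sym_entry_t table, the helper names, and the linear scan are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of a symbol table entry. */
typedef struct {
    uint64_t addr;    /* symbol start address */
    const char *name; /* symbol name */
} sym_entry_t;

/* True for "$a", "$t", "$x", "$d", optionally followed by ".suffix". */
static bool
is_mapping_name(const char *name)
{
    if (name == NULL || name[0] != '$' || name[1] == '\0')
        return false;
    if (strchr("atxd", name[1]) == NULL)
        return false;
    return name[2] == '\0' || name[2] == '.';
}

/* Returns the name of the highest-addressed non-mapping symbol that starts
 * at or below PC, or NULL if there is none. Symbol sizes are ignored here.
 */
static const char *
lookup_skipping_mapping(const sym_entry_t *syms, size_t count, uint64_t pc)
{
    const sym_entry_t *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (syms[i].addr > pc || is_mapping_name(syms[i].name))
            continue; /* symbol starts after pc, or is only a marker */
        if (best == NULL || syms[i].addr >= best->addr)
            best = &syms[i];
    }
    return best == NULL ? NULL : best->name;
}
```

Given a table containing foo at 0x1000, $x at 0x1000, $d.1 at 0x1010 and bar at 0x1020, a lookup of 0x1014 returns foo rather than $d.1.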

RISC-V also has mapping symbols, though I haven't checked whether they are allowed in executables like the Arm ones are.

algr avatar Sep 13 '22 09:09 algr

If you have examples in front of you, or know how to produce them with toolchain flags, perhaps you could contribute a test case and possibly the fix. Without finding any such symbols in DR debug build binaries or the system binaries, it is not clear how someone else would implement and test this. Or is readelf -s already hiding them? I would expect it to dump the raw .symtab and not filter it.

derekbruening avatar Sep 13 '22 15:09 derekbruening