
Consider using the content of the .eh_frame section to assist in identifying function boundaries.

Open. JANlittle opened this issue 1 year ago • 1 comment

What is the feature you'd like to have? Most ELF files contain an .eh_frame section, which records accurate boundary information (start and end) for subroutines such as functions. Extracting this information could greatly improve the accuracy of function boundary identification for ELF files.

Is your feature request related to a problem? The existing function boundary recognition based on CFG recovery seems very comprehensive. However, it breaks down on some obfuscated code, especially code that uses indirect calls and indirect jumps. The .eh_frame section, on the other hand, still accurately records the true boundaries of the obfuscated function. If this information could be used to identify the boundaries of obfuscated functions, I believe it would help further manual or automated analysis.

As an example, here is an obfuscated sample tested on BN Personal 4.0.4958-Stable. The boundaries of function sub_41a7e0 are recognized incorrectly because of the indirect jumps it contains. (Screenshot: error)

However, the information in the ELF's .eh_frame section can be extracted with readelf -wf sample.elf. The output shows the actual boundary of function sub_41a7e0. (Screenshot: boundary)
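For anyone who wants to try this outside of BN, here is a minimal sketch that pulls the FDE address ranges out of the text printed by readelf -wf. It assumes the usual binutils layout, where each FDE summary line ends in `pc=<start>..<end>` and the end address is exclusive. The names `parse_fde_ranges` and `fde_ranges_for` and the addresses in `SAMPLE` are made up for illustration; they are not taken from the sample above.

```python
import re
import subprocess

# Matches FDE summary lines as printed by `readelf -wf`, for example:
#   00000018 0000000000000014 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401026
FDE_RE = re.compile(r"\bFDE\b.*\bpc=([0-9a-fA-F]+)\.\.([0-9a-fA-F]+)")


def parse_fde_ranges(readelf_output):
    """Return sorted, de-duplicated (start, end) pairs found in the dump."""
    ranges = set()
    for line in readelf_output.splitlines():
        match = FDE_RE.search(line)
        if match is None:
            continue  # CIE headers and DW_CFA_* instruction lines are skipped
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        if end > start:  # ignore empty or malformed ranges
            ranges.add((start, end))
    return sorted(ranges)


def fde_ranges_for(path):
    """Run readelf on an ELF file (binutils must be installed) and parse it."""
    result = subprocess.run(
        ["readelf", "-wf", path], check=True, capture_output=True, text=True
    )
    return parse_fde_ranges(result.stdout)


# Hypothetical excerpt of readelf output, used only to exercise the parser.
SAMPLE = """\
00000000 0000000000000014 00000000 CIE
  Version:               1
  Augmentation:          "zR"

00000018 0000000000000014 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401026
  DW_CFA_nop

00000030 000000000000001c 00000034 FDE cie=00000000 pc=0000000000401030..00000000004010f2
  DW_CFA_advance_loc: 1 to 0000000000401031
"""

if __name__ == "__main__":
    for start, end in parse_fde_ranges(SAMPLE):
        print(f"{start:#x}..{end:#x} ({end - start} bytes)")
```

Each range could then be compared against the extent that the analysis recovered for the function starting at the same address. Note that an FDE covers one contiguous range, so a function split into several parts by the compiler may show up as more than one entry.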

If we can accurately identify function boundaries, we can at least analyze all the basic blocks of a function for subsequent processing, just like manually setting function boundaries in IDA!

Are any alternative solutions acceptable? No. BN seems to lack the ability to manually modify function boundaries as IDA does, and function boundary recognition relies entirely on its automated analysis. So I haven't found an acceptable alternative solution.

Additional Information: It seems that the recognition of ELF functions in project-LIEF relies on the information in the .eh_frame section?

JANlittle, Apr 16 '24 15:04

We are definitely interested in incorporating the info from .eh_frame to improve function detection. We may follow up on this later.

Btw, just for the specific case you showed in the screenshot, I think you can set the type of the data variable at 0x67d530 to constant, and our data flow analysis will then be able to resolve the indirect jump.
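In case it helps, the same change could probably be scripted from the Python console. This is only a sketch: `mark_data_var_const` is a made-up helper name, and it assumes that `get_data_var_at`, `Type.mutable_copy()`, the `const` property on the type builder, and `define_user_data_var` behave the way they appear to in the Python API of your build, so please verify before relying on it.

```python
def mark_data_var_const(bv, addr):
    """Redefine the data variable at addr with a const-qualified copy of its type.

    bv is expected to be a Binary Ninja BinaryView (for example the `bv`
    object in the scripting console). Returns False when no data variable
    exists at addr, True otherwise.
    """
    var = bv.get_data_var_at(addr)
    if var is None:
        return False
    builder = var.type.mutable_copy()  # assumed: returns an editable type builder
    builder.const = True               # assumed: const qualifier is settable here
    bv.define_user_data_var(addr, builder.immutable_copy())
    return True


# Usage in the console, with the address from the case above:
# mark_data_var_const(bv, 0x67d530)
```

Analysis may need to finish updating before the indirect jump shows up as resolved.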

xusheng6, Apr 16 '24 16:04