ARM v4T does not understand undocumented, partially supported `blx lr` instruction.

Open alercah opened this issue 2 years ago • 2 comments

Describe the bug The ARM v4T ISA had a deficiency that made cross-ISA function calls to a far address difficult to perform. There were a number of workarounds, but one of them was clever and relied on processor-specific behaviour:

  1. Load the target address into the lr register.
  2. Execute the second half (only) of a blx #0x0 instruction pair. (See page A7-26 of https://documentation-service.arm.com/static/5f8dacc8f86e16515cdb865a?token=)

Per the spec, executing the second half of the instruction pair on its own is UNPREDICTABLE. However, I have seen it in real-world code, so it appears that some implementations of v4T, if not all, behave reliably here, with the instruction acting as a blx lr.
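For anyone following along, here is a minimal Python sketch of how the halfword splits into fields, going by the BL/BLX encoding on the manual page cited above (bits 15:13 are 111, bits 12:11 are the H field, bits 10:0 are the offset). The function name and the string labels are made up for illustration; they are not Ghidra or ARM identifiers.

```python
def decode_thumb_bl_half(halfword):
    """Classify a 16-bit Thumb halfword as one half of BL/BLX, or return None.

    Returns a (kind, offset11) tuple. The kind labels are hypothetical names
    chosen for this sketch.
    """
    top5 = (halfword >> 11) & 0x1F   # bits 15:11 = '111' followed by H
    offset11 = halfword & 0x7FF      # bits 10:0
    kinds = {
        0b11110: "prefix",      # H=10: first half, sets LR from PC + offset
        0b11111: "bl_suffix",   # H=11: second half of BL, branches via LR
        0b11101: "blx_suffix",  # H=01: second half of BLX (ARMv5T and later)
    }
    kind = kinds.get(top5)
    if kind is None:
        return None
    return kind, offset11


print(decode_thumb_bl_half(0xF800))  # ('bl_suffix', 0)
```

Under that encoding, 0xf800 has H=11 and a zero offset, so it is the second-half form that branches to whatever is in lr.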

Ghidra does not disassemble this undocumented instruction correctly.

To Reproduce

  1. Load a program containing 0xf800 at a halfword-aligned boundary, using the ARM v4T ISA.
  2. Attempt to disassemble the code as Thumb, starting at that byte sequence or before it.
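If it helps to locate candidates in a binary, the following is a rough Python sketch that scans a byte blob for 0xf800 halfwords that are not immediately preceded by a BL first half. It assumes little-endian Thumb code and a halfword-aligned base address; the function name and the `base` parameter are hypothetical. It is only a heuristic, since data mixed in with code can produce false hits.

```python
import struct


def find_lone_bl_suffixes(data, base=0):
    """Return addresses of 0xF800 halfwords not preceded by a BL prefix.

    Assumes `data` holds little-endian halfwords and `base` is the
    (halfword-aligned) address of data[0]. Both are assumptions of this sketch.
    """
    hits = []
    prev = None
    for off in range(0, len(data) - 1, 2):
        (hw,) = struct.unpack_from("<H", data, off)
        # A BL first half has bits 15:11 == 0b11110.
        prev_is_prefix = prev is not None and (prev >> 11) == 0b11110
        if hw == 0xF800 and not prev_is_prefix:
            hits.append(base + off)
        prev = hw
    return hits


# Halfwords: 0x4800, 0xF800 (lone), 0xF000 (prefix), 0xF800 (paired)
sample = bytes.fromhex("004800f800f000f8")
print([hex(a) for a in find_lone_bl_suffixes(sample, base=0x08000000)])
# ['0x8000002']
```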

Expected behavior The instruction is disassembled as blx lr and Ghidra correctly understands this to be a function call to the address manually loaded into lr.

Actual behavior The instruction is disassembled as bl #0x0, and disassembly stops at that instruction rather than continuing past it.

Workaround This can be worked around by manually overriding the instruction flow to CALL, and then continuing disassembly starting from the next instruction.

Screenshots Actual: image

After applying workaround: image

Attachments I cannot provide a copy of the program I am working on, unfortunately. My apologies.

Environment:

  • OS: Debian Testing
  • Java Version: OpenJDK 11.0.14.1
  • Ghidra Version: 10.1.6
  • Ghidra Origin: Ghidra website

Additional context I can provide more detail about why someone might have used this instruction if needed, but it doesn't seem relevant.

alercah avatar Jun 07 '22 17:06 alercah

I'm still trying to process exactly what's going on with the ARM v4t. That bl instruction should almost certainly be a call, but we have it as a goto. It's probable that it's used in some ARM code for switch statements.

For your example here, is r11 loaded with a constant or an offset from the current position?

GhidorahRex avatar Jun 07 '22 19:06 GhidorahRex

The bl instruction in ARMv4T is actually a macro for two instructions. They are not usually used individually, but as far as the spec (and the hardware implementing it) is concerned, this is totally fine.

The arm reference https://developer.arm.com/documentation/ddi0308/d/Thumb-Instructions/Alphabetical-list-of-Thumb-instructions/BL--BLX--immediate-?lang=en explains that in more detail:

The instructions could be executed as two separate 16-bit instructions, with the first instruction instr1 setting LR to PC + SignExtend(instr1<10:0>:'000000000000', 32) and the second instruction completing the operation. This is no longer possible in Thumb-2.

Edit: In the example above r11 contains a call destination address, and the instruction that follows (0xF800) can be read as bl lr. It should not be read as a goto, because the instruction specifically sets LR again after completing the jump sequence.
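To make the two-halves behaviour concrete, here is a small Python sketch of the semantics as described in the quoted reference: the first half sets LR from PC plus a sign-extended offset, and the second half branches to LR plus its own offset and then rewrites LR with the return address. The function names and the example addresses are invented for illustration, and this is not how Ghidra models the instruction.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def exec_bl_prefix(addr, halfword):
    """First half (H=10): LR = PC + SignExtend(offset11 << 12).

    Returns (next_pc, new_lr). In Thumb state PC reads as addr + 4.
    """
    pc = addr + 4
    offset = sign_extend((halfword & 0x7FF) << 12, 23)
    return addr + 2, (pc + offset) & MASK32


def exec_bl_suffix(addr, halfword, lr):
    """Second half (H=11): PC = LR + (offset11 << 1); LR = return address | 1.

    Returns (next_pc, new_lr).
    """
    target = (lr + ((halfword & 0x7FF) << 1)) & MASK32
    return target, (addr + 2) | 1


# Lone 0xF800 with a target preloaded into lr (addresses are made up):
pc, lr = exec_bl_suffix(0x08000100, 0xF800, lr=0x08001234)
print(hex(pc), hex(lr))  # 0x8001234 0x8000103

# Ordinary pair 0xF000 0xF87E at 0x08000100, i.e. a bl to 0x08000200:
pc, lr = exec_bl_prefix(0x08000100, 0xF000)
pc, lr = exec_bl_suffix(pc, 0xF87E, lr)
print(hex(pc), hex(lr))  # 0x8000200 0x8000105
```

In both cases LR ends up holding the address of the instruction after the second half (with bit 0 set), which is why the lone 0xF800 behaves like a call rather than a plain jump.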

SBird1337 avatar Jun 07 '22 21:06 SBird1337

@alercah Do you have a binary we could test a fix with? I want to make sure we're emulating the two pseudoinstructions correctly now.

GhidorahRex avatar Jan 19 '23 15:01 GhidorahRex

Not one that I can share, unfortunately.

alercah avatar Jan 19 '23 20:01 alercah