Apple M1 / AArch64 .data section not recognised as such
Discussed in https://github.com/NationalSecurityAgency/ghidra/discussions/3658
Originally posted by p-Wave November 20, 2021
Hi all,
I have the following "Hello World" code:
.global _start
.align 2
.text
_start: mov X0, 1
adrp X1, helloworld@PAGE
mov X2, 13
mov X16, 4
svc 0x80
mov X0, 0
mov X16, 1
svc 0x80
.data
helloworld: .ascii "Hello World!\n"
which I compile with
Apple clang version 13.0.0 (clang-1300.0.29.3)
Target: arm64-apple-darwin21.1.0
The CodeBrowser in Ghidra doesn't recognise the data section as data, but instead gives me the following interpretation (see the bytes starting at 0x20):
//
// __text
// __TEXT
// ram:00000000-ram:0000001f
//
**************************************************************
* *
* FUNCTION *
**************************************************************
undefined ltmp0()
undefined w0:1 <RETURN>
_start XREF[1]: Entry Point(*)
ltmp0
00000000 20 00 80 d2 mov x0,#0x1
00000004 01 00 00 90 adrp x1,0x0
00000008 a2 01 80 d2 mov x2,#0xd
0000000c 90 00 80 d2 mov x16,#0x4
00000010 01 10 00 d4 svc 0x80
00000014 00 00 80 d2 mov x0,#0x0
00000018 30 00 80 d2 mov x16,#0x1
0000001c 01 10 00 d4 svc 0x80
//
// __data
// __DATA
// ram:00000020-ram:0000002c
//
ltmp1
helloworld
00000020 48 65 6c 6c ldnp d8,d25,[x10, #-0x140]
00000024 6f 20 57 6f umlal2 v15.4S,v3.8H,v7.H[0x1]
00000028 72 ?? 72h r
00000029 6c ?? 6Ch l
0000002a 64 ?? 64h d
0000002b 21 ?? 21h !
0000002c 0a ?? 0Ah
What am I missing or doing wrong?
Thank you very much!
Looks to be an issue with the way we are handling the SVC instruction. I assume it is not meant to return in your example, right?
svc is treated like a supervisor-level call instruction which in many uses may return and continue. This specific case is like calling a non-returning function. Unfortunately, Ghidra's AARCH64 svc semantics use a fall-through pcodeop CallSupervisor, which prevents a flow-override from being applied. Ideally, the semantics for this instruction would be changed to give it a call flow, which would allow a non-returning flow-override to be applied in this case.
This same potential issue also applies to the instructions hvc and smc.
A similar issue also applies to ARM, where the svc and swi instructions use the software_interrupt fall-through pcodeop.
From your snippet, it looks like the .data section is being recognized by Ghidra. Ghidra will follow flow while disassembling, even if that takes it into a non-executable section. In this case, it's following flow that doesn't actually exist.
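As a quick sanity check that the bytes at 0x20-0x2c are the string rather than code, they can be decoded directly. This is only a small Python sketch; the byte values are copied from the listing above, and the variable names are made up for illustration.

```python
# Bytes at ram:00000020-ram:0000002c, copied from the __data listing above.
data = bytes.fromhex("48656c6c" "6f20576f" "726c6421" "0a")

# Interpreted as ASCII, the bytes are exactly the string from the source file.
text = data.decode("ascii")
print(repr(text))

# The first two 32-bit little-endian words are what the listing shows
# decoded as ldnp and umlal2 after flow fell through from 0x1c.
words = [int.from_bytes(data[i:i + 4], "little") for i in (0, 4)]
print([hex(w) for w in words])
```

The section boundaries in the listing (ram:00000020-ram:0000002c) match the 13 bytes of the string, which is consistent with the section itself being recognised and only the disassembly flow being wrong.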
The issue is that the first instance of the svc instruction is a call to write and the second is a call to exit. exit is a non-returning function, so bytes after calls to it shouldn't be disassembled in general. Basically, knowing what the svc instruction does involves more than the bytes of the instruction: it also depends on the "environment" of the program. We're working on adding a system call analyzer which would automatically figure all of this out during analysis. This is still very much a work in progress.
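To make the "environment" point concrete, here is a rough Python sketch of the idea. It is not Ghidra's analyzer or its API. It walks the eight instruction words from the listing, tracks only unshifted MOVZ into X registers, and looks up the value of x16 at each svc in a small table. The table and every function name are assumptions made for illustration; the two entries are the Darwin BSD syscall numbers for exit (1) and write (4).

```python
import struct

# Code bytes copied from the __text listing above (0x00-0x1f).
TEXT = bytes.fromhex(
    "200080d2" "01000090" "a20180d2" "900080d2"
    "011000d4" "000080d2" "300080d2" "011000d4"
)

# Hypothetical "environment" table: syscall number -> (name, returns?).
SYSCALLS = {1: ("exit", False), 4: ("write", True)}


def is_movz_x(word):
    """True for MOVZ Xd, #imm16 with no shift (hw == 0)."""
    return (word & 0xFFE00000) == 0xD2800000


def is_svc(word):
    """True for SVC #imm16."""
    return (word & 0xFFE0001F) == 0xD4000001


def resolve_syscalls(code):
    """Return (address, name, returns) for each svc reached by linear flow."""
    regs = {}    # register number -> last constant loaded by MOVZ
    found = []
    for addr in range(0, len(code), 4):
        (word,) = struct.unpack_from("<I", code, addr)
        if is_movz_x(word):
            regs[word & 0x1F] = (word >> 5) & 0xFFFF
        elif is_svc(word):
            name, returns = SYSCALLS.get(regs.get(16), ("unknown", True))
            found.append((addr, name, returns))
            if not returns:
                break    # flow ends here; do not continue past this svc
        # All other instructions (such as adrp) are ignored in this sketch.
    return found


for addr, name, returns in resolve_syscalls(TEXT):
    print(f"{addr:#04x}: svc -> {name} ({'returns' if returns else 'does not return'})")
```

With the exit call marked as non-returning, flow stops at 0x1c and never reaches the string at 0x20. A real analyzer would need proper constant propagation rather than this linear scan.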
At the moment, system calls can in some cases be handled manually or by a script; see ResolveX86orX64LinuxSystemCalls.java for an example. Unfortunately, this will only work nicely if the pcode for the relevant instruction is in a certain form. In this case it's not, but I'll try to get a fix in for that. Once that's in place, you could follow the script to handle the system calls.
There's some related discussion in #3936.