Decompiler doesn't honor x86-64 ABI for entry point

Open ubitux opened this issue 3 years ago • 2 comments

Describe the bug An x86-64 ELF binary expects the _start entry point to have 2 registers initialized (rdx, and rsp with argc/argv/envp), which conflicts with the x86-64 calling convention (rdi, rsi, rdx, rcx, r8, r9).

Since the entry point usually reads rdx (some function handler, as expected from the ABI), the Ghidra decompiler incorrectly assumes that rdi and rsi must also be set (as per the calling convention) and thus considers that there are 3 parameters.

Note: this is less problematic on 32-bit x86 because parameters are expected to be passed on the stack, so edx doesn't really matter. The 64-bit ABI naively changed edx into rdx, which created a conflict with the calling convention.
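To make the mis-inference concrete, here is a toy model in C of how reading only rdx can end up as a 3-parameter prototype. This is not Ghidra's actual algorithm. The function names and the rule "every register up to the last one read is a parameter" are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* System V x86-64 integer argument registers, in calling-convention order. */
enum { ARG_RDI, ARG_RSI, ARG_RDX, ARG_RCX, ARG_R8, ARG_R9, ARG_REG_COUNT };

/*
 * Hypothetical heuristic (an assumption, not Ghidra's code): if a function
 * reads an argument register before writing it, treat that register and
 * every register before it in the convention's order as parameters.
 */
size_t infer_param_count(const bool read_before_write[ARG_REG_COUNT])
{
    assert(read_before_write != NULL);
    size_t count = 0;
    for (size_t i = 0; i < ARG_REG_COUNT; i++) {
        if (read_before_write[i])
            count = i + 1;
    }
    return count;
}

/* Register usage of a typical _start: among these registers, only rdx is read. */
size_t params_inferred_for_start(void)
{
    const bool start_reads[ARG_REG_COUNT] = { [ARG_RDX] = true };
    return infer_param_count(start_reads);
}
```

Under this model, params_inferred_for_start() yields 3 because rdx is third in the order, even though rdi and rsi hold nothing meaningful at process entry.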

To Reproduce Steps to reproduce the behavior:

  1. Create a test.c with int main() { return 0; } and compile it with cc -s -O2 test.c -o test
  2. Load it, jump to entry
  3. Observe decompiler prototype

Expected behavior The entry point prototype should respect the expected calling ABI

Screenshots 2022-10-16-224701-Eif9OCho

Environment (please complete the following information):

  • OS: Archlinux
  • Java Version: 11.0.16.1
  • Ghidra Version: 10.1.5
  • Ghidra Origin: Archlinux official packages

ubitux, Oct 16 '22 20:10

Thanks for pointing this out. Should be an easy fix. I should check whether the System V ABI specifies special calling conventions for entry points on other architectures. If you happen to know of any offhand please let me know.

ghidracadabra, Oct 21 '22 15:10

> Thanks for pointing this out. Should be an easy fix. I should check whether the System V ABI specifies special calling conventions for entry points on other architectures. If you happen to know of any offhand please let me know.

I essentially used these 2 sources to figure this out:

  • glibc sysdeps/x86_64/start.S: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/start.S;h=a2aea58b0e0b64a2b1931c9ffb799aa7141dbe5f;hb=c804cd1c00adde061ca51711f63068c103e94eef#l36
  • kernel arch/x86/include/asm/elf.h: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/include/asm/elf.h?h=v6.0.3#n97

Both comments seem to refer to the same spec (the comment is mostly copied from 32-bit): "SVR4/i386 ABI (pages 3-31, 3-32)". That being said, I wasn't able to find the exact information they refer to in this doc (but I only looked quickly, and I might not have read the exact same document either).

Edit: I don't know about other architectures, sorry.

Edit2: this is what glibc says about aarch64 in sysdeps/aarch64/start.S (you can probably find all the others that way):

/* This is the canonical entry point, usually the first thing in the text
   segment.

   Note that the code in the .init section has already been run.
   This includes _init and _libc_init


   At this entry point, most registers' values are unspecified, except:

   x0/w0	Contains a function pointer to be registered with `atexit'.
		This is how the dynamic linker arranges to have DT_FINI
		functions called for shared libraries that have been loaded
		before this code runs.

   sp		The stack contains the arguments and environment:
		0(sp)			argc
		8(sp)			argv[0]
		...
		(8*argc)(sp)		NULL
		(8*(argc+1))(sp)	envp[0]
		...
					NULL
 */
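As a rough illustration of that stack layout, the C sketch below walks a simulated initial stack on a 64-bit target. The struct and function names are hypothetical, and the stack is a plain array standing in for what the kernel sets up. The sketch counts 8-byte slots starting from argv[0] at 8(sp), so the argv terminator sits in slot argc + 1 and envp[0] in slot argc + 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the data found at sp on process entry. */
struct entry_stack {
    uintptr_t argc;          /* slot 0 */
    const uintptr_t *argv;   /* argc pointers, then a NULL slot */
    const uintptr_t *envp;   /* environment pointers, then a NULL slot */
    size_t envc;             /* number of environment entries found */
};

/* Walk the slots starting at sp; one slot is one pointer-sized word. */
struct entry_stack parse_entry_stack(const uintptr_t *sp)
{
    assert(sp != NULL);
    struct entry_stack s;
    s.argc = sp[0];
    s.argv = &sp[1];
    assert(s.argv[s.argc] == 0);      /* argv is NULL-terminated */
    s.envp = &s.argv[s.argc + 1];     /* envp starts right after that NULL */
    s.envc = 0;
    while (s.envp[s.envc] != 0)
        s.envc++;
    return s;
}
```

Nothing here comes from a register argument. Everything is recovered from the stack pointer alone, which is why a regular function prototype is a poor fit for the entry point.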

ubitux, Oct 21 '22 15:10