keystone icon indicating copy to clipboard operation
keystone copied to clipboard

ARM64 instructions with PC-relative offset

Open noword opened this issue 8 years ago • 9 comments

kstool.exe arm64 "adrp x0, #0" 0x123456000 adrp x0, #0 = [ 00 00 00 90 ] cstool.exe arm64 00000090 0x123456000 123456000 00000090 adrp x0, #0x123456000

There are similar problems with in b.cond/cbz/cbnz/tbz/tbnz.

I will push a request.

noword avatar Mar 22 '17 09:03 noword

can you please provide sample for all the other instructions b.cond/cbz/cbnz/tbz/tbnz ?

aquynh avatar Mar 23 '17 05:03 aquynh

The correct results:

kstool.exe arm64 "beq #0x123456780" 0x123456780 beq #0x123456780 = [ 00 00 00 54 ]

kstool.exe arm64 "cbz x0, #0x123456780" 0x123456780 cbz x0, #0x123456780 = [ 00 00 00 b4 ]

kstool.exe arm64 "cbnz x0, #0x123456780" 0x123456780 cbnz x0, #0x123456780 = [ 00 00 00 b5 ]

kstool.exe arm64 "tbz x0, #1, #0x123456780" 0x123456780 tbz x0, #1, #0x123456780 = [ 00 00 08 36 ]

kstool.exe arm64 "tbnz x0, #1, #0x123456780" 0x123456780 tbnz x0, #1, #0x123456780 = [ 00 00 08 37 ]

noword avatar Mar 23 '17 08:03 noword

tbnz x0, #1, #0x123456780 = [ 00 00 08 37 ]

actually cstool decodes this as (x0 vs w0):

$ cstool arm64 00000837 0x123456780
123456780  00000837  tbnz       w0, #1, #0x123456780

aquynh avatar Mar 23 '17 11:03 aquynh

so on the last 2 samples, the output of Keystone (with your patch) is decoded differently by Capstone?

on the 1st sample, Keystone also fails (with your patch applied):

$ kstool arm64 "adrp x0, #0" 0x123456000
ERROR: failed on ks_asm() with count = 0, error = 'Unknow error' (code = 539)

but without address as the last param, kstool can compile:

$ kstool arm64 "adrp x0, #0"
adrp x0, #0 = [ 00 00 00 90 ]

aquynh avatar Mar 23 '17 11:03 aquynh

tbnz w0, #1, #0x123456780 tbnz x0, #1, #0x123456780

I think these two are equivalent. But I'm not very sure about this.

According to the declare of adrp

I think that "adrp x0, #0" at address 0x123456000 is an illegal instruction.

noword avatar Mar 24 '17 01:03 noword

@aquynh I don't think this is a issue.

Your old code is right and this commit 51ea7aff4ecf3a605ea06a1a42c38acc1bb22e2b is definitely wrong. I think before we have doubt for the assemble result, please also go to ODA to check the result first.

I have 1 questions here.

  1. This commit didn't consider adr instruction. it's patched for adrp. So the assembly grammar is inconsistent for adr and adrp.

ARMv8 instruction set do not provide any standard for adr[p] reg, #immediate. So if the second operand is a constant, it depends on what we want to provide here. We have two choices:

  1. provide an offset immediate here. it's actually not user friendly, but it can be filled into machine code directly and don't need any more calculation in assembler.
  2. provide an address immediate here. In this case, assembler should do some calculations and get the offset first.

for me, I prefer the second way. And from old code, i think keystone want user to provide a offset here.

    bool isAdrpLabel() const {
    // Validation was handled during parsing, so we just sanity check that
    // something didn't go haywire.
    if (!isImm())
        return false;

    if (const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(Imm.Val)) {
      int64_t Val = CE->getValue() - Ctx.getBaseAddress();   // here we want address, this is commit 51ea7aff4ecf3a605ea06a1a42c38acc1bb22e2b
      int64_t Min = - (4096 * (1LL << (21 - 1)));
      int64_t Max = 4096 * ((1LL << (21 - 1)) - 1);
      return (Val % 4096) == 0 && Val >= Min && Val <= Max;
    }

    return true;
  }

  bool isAdrLabel() const {
    // Validation was handled during parsing, so we just sanity check that
    // something didn't go haywire.
    if (!isImm())
        return false;

    if (const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(Imm.Val)) {
      int64_t Val = CE->getValue();           // here we want offset. inconsistent with adrp, which one?
      int64_t Min = - (1LL << (21 - 1));
      int64_t Max = ((1LL << (21 - 1)) - 1);
      return Val >= Min && Val <= Max;
    }

    return true;
  }

Or maybe you can make it consistent with ARM adr

xiaohuajiao avatar Apr 20 '17 02:04 xiaohuajiao

any news? this issue is still here....

yufengzjj avatar Sep 12 '20 08:09 yufengzjj

any news? this issue is still here....

rbs-alexr avatar Apr 28 '23 18:04 rbs-alexr

any news? this issue is still here....

I'm surprised this still isn't fixed. Any alternative libraries or non-commited fixes ? Just like the guy above I can only assemble instructions with a smaller address, but I have no choice to go smaller since this address is the virtual address of the string table from the binary.:

adrp x8, #0x1004AA000

DebianArch64 avatar Sep 13 '23 18:09 DebianArch64