reg_read from xzr / wzr on Arm64 / AArch64 broken in Unicorn 2.1: "Invalid argument (UC_ERR_ARG)"
Previously, in Unicorn 2.0.1.post1, you could call reg_read on xzr and wzr on Arm64 / AArch64. In Unicorn 2.1.0 / 2.1.1 you cannot.
I've bisected the code: it works in commit d7a806c026246f69a3223278e3a891b875b6f6be and fails in the next commit, 4055a5ab109c1d8d2da06515f3e117ca57faf179.
This points to #1835 (for #1831).
Looking at the code, I think it's because those registers aren't covered by any of the if statements and the default return value for unhandled registers was also changed from UC_ERR_OK to UC_ERR_ARG, which by itself makes sense.
Now, you may ask why anyone needs to read from xzr / wzr. The point is that it didn't throw an exception before, and now it does. For example, code that steps through an emulation and reads the source registers of any ldr / mov or similar instruction before deciding whether to keep stepping or stop will run into this bug. We could special-case it on the consumer side, but a lot of code relied on the previous behaviour, so this is an unexpected compatibility change that would require lots of special handling if not fixed.
In my opinion these 2 registers should have their own if / case statement and be handled appropriately by returning 0 (width-appropriate depending on which register was requested).
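For reference, the consumer-side special case would look roughly like the sketch below. It is only an illustration of the workaround, not a proposal for the fix itself: `reg_read_compat` and `ZERO_REG_IDS` are hypothetical names, and the register IDs are hard-coded from the constants printed in this report (wzr = 6, xzr = 7) so the snippet runs without Unicorn installed. Real code would more likely use `UC_ARM64_REG_WZR` and `UC_ARM64_REG_XZR` from `unicorn.arm64_const`.

```python
# Hypothetical consumer-side workaround (a sketch, not Unicorn's own fix).
# IDs taken from the values printed in this report:
# UC_ARM64_REG_WZR = 6, UC_ARM64_REG_XZR = 7.
ZERO_REG_IDS = frozenset({6, 7})


def reg_read_compat(mu, reg_id):
    """Read a register, treating the AArch64 zero registers as constant 0.

    `mu` is any object exposing reg_read(reg_id), such as a unicorn.Uc
    instance. xzr / wzr always read as zero architecturally, so the
    emulator is not consulted for them at all.
    """
    if reg_id in ZERO_REG_IDS:
        return 0
    return mu.reg_read(reg_id)
```

Every call site that may see xzr / wzr as an operand would have to go through a wrapper like this, which is the extra special handling I'd like to avoid.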
Minimal testcase based on the existing sample_arm64.py code for the binding:
#!/usr/bin/env python3
# Sample code for ARM64 of Unicorn. Nguyen Anh Quynh <[email protected]>
# Python sample ported by Loi Anh Tuan <[email protected]>
from unicorn import *
from unicorn.arm64_const import *
# code to be emulated
ARM64_CODE = b"\xEF\x03\x1F\xAA" # mov x15, xzr
# memory address where emulation starts
ADDRESS = 0x10000
# Test ARM64 reading from xzr register
def test_arm64():
    try:
        # Initialize ARM64 emulator in ARM mode
        mu = Uc(UC_ARCH_ARM64, UC_MODE_ARM)
        # map 4kB memory for this emulation
        mu.mem_map(ADDRESS, 4 * 1024)
        # write machine code to be emulated to memory
        mu.mem_write(ADDRESS, ARM64_CODE)
        # initialize machine register
        mu.reg_write(UC_ARM64_REG_X15, 0x4141414141414141)
        # emulate machine code in infinite time
        mu.emu_start(ADDRESS, ADDRESS + len(ARM64_CODE))
        print("Emulation done.\n")
        print("UC_ARM64_REG_X15 const = %u" % UC_ARM64_REG_X15)
        x15 = mu.reg_read(UC_ARM64_REG_X15)
        print("x15 = 0x%x\n" % x15)
        print("UC_ARM64_REG_XZR const = %u" % UC_ARM64_REG_XZR)
        xzr_value = mu.reg_read(UC_ARM64_REG_XZR)
        print("xzr = 0x%x" % xzr_value)
    except UcError as e:
        print("ERROR: %s" % e)

if __name__ == '__main__':
    test_arm64()
This used to return:
Emulation done.
UC_ARM64_REG_X15 const = 214
x15 = 0x0
UC_ARM64_REG_XZR const = 7
xzr = 0x0
But now returns:
Emulation done.
UC_ARM64_REG_X15 const = 214
x15 = 0x0
UC_ARM64_REG_XZR const = 7
ERROR: Invalid argument (UC_ERR_ARG)
And the same is true for wzr (const = 6).
Thanks for your hard work, hopefully that makes sense.
I have received similar feedback about this behavior change from various sources, and I admit I should have highlighted it in the changelog.
Generally, I'm thinking of printing a warning to stderr (probably in the upcoming 2.1.2) and changing this to an error in the next minor version, such as 2.2.0.
A warning is now printed as of 9f935f505ecdc93c195d6cb05d47745a514ce175, and this is fixed in 05e29b4507c06655e815f6cc970c9f2eb98f8ead.
That fix looks like it will work for us, thanks.
Looking forward to a point release with that fix included.
Hi, it's been almost two months since this fix was applied. Any chance we can get it released? Otherwise, we have to do an if/else internally as we did in pwndbg/pwndbg#2548 ;).
Thanks for reaching out. The new release is almost ready, but I haven't been feeling very well recently, so unfortunately I can't promise a release date. Personally, I will try to publish a new point release (2.1.2) before mid-December.
Closing as 2.1.2 is out.