AArch64 UMOV instruction fails with Aarch64InvalidInstruction when VAS is ARM64_VAS_INVALID

Open dguido opened this issue 4 months ago • 0 comments

Summary

The AArch64 UMOV instruction handler raises Aarch64InvalidInstruction when processing certain valid UMOV/MOV instructions. This occurs when Capstone decodes instructions like umov w0, v1.s[0] or mov w0, v1.s[0] (MOV is an alias for UMOV) and returns a Vector Arrangement Specifier (VAS) value of ARM64_VAS_INVALID (0x0).

Environment

  • Manticore version: Latest (from ekilmer/use-pyproject-toml branch)
  • Python version: 3.11
  • OS: Linux
  • Architecture: Testing AArch64 emulation
  • Dependencies:
    • Capstone: 5.0.0
    • Keystone: 0.9.2
    • Unicorn: 2.1.3

Steps to Reproduce

Run the AArch64 CPU tests:

uv run pytest tests/native/test_aarch64cpu.py::Aarch64CpuInstructions::test_umov -xvs
uv run pytest tests/native/test_aarch64cpu.py::Aarch64CpuInstructions::test_mov_to_general -xvs

Actual Behavior

Both tests fail with an Aarch64InvalidInstruction exception:

FAILED tests/native/test_aarch64cpu.py::Aarch64CpuInstructions::test_umov - manticore.native.cpu.aarch64.Aarch64InvalidInstruction
FAILED tests/native/test_aarch64cpu.py::Aarch64CpuInstructions::test_mov_to_general - manticore.native.cpu.aarch64.Aarch64InvalidInstruction

The same failure occurs for the symbolic instruction tests:

  • Aarch64SymInstructions::test_umov
  • Aarch64SymInstructions::test_mov_to_general

Expected Behavior

Tests should pass. The UMOV instruction handler should correctly process instructions even when Capstone returns ARM64_VAS_INVALID as the vector arrangement specifier.

Root Cause Analysis

The Problem

The UMOV instruction handler in /root/manticore/manticore/native/cpu/aarch64.py (lines 5131-5145) only handles specific VAS values:

if vas == cs.arm64.ARM64_VAS_1B:    # value = 4
    elem_size = 8
elif vas == cs.arm64.ARM64_VAS_1H:  # value = 8
    elem_size = 16
elif vas == cs.arm64.ARM64_VAS_1S:  # value = 11
    elem_size = 32
elif vas == cs.arm64.ARM64_VAS_1D:  # value = 13
    elem_size = 64
else:
    raise Aarch64InvalidInstruction  # Line 5145

However, Capstone returns ARM64_VAS_INVALID (value = 0) as the VAS when it decodes certain UMOV instructions, in particular:

  • umov w0, v1.s[0]
  • umov x0, v1.d[0]
  • mov w0, v1.s[0] (MOV alias for UMOV)
  • mov x0, v1.d[0] (MOV alias for UMOV)

Since the handler has no case for VAS = 0, it falls through to the else branch and raises the exception.

Verification

When assembling and disassembling these instructions:

| Instruction      | Keystone Encoding | Capstone Decode  | VAS Value             | Result  |
|------------------|-------------------|------------------|-----------------------|---------|
| umov w0, v1.b[0] | 203c010e          | umov w0, v1.b[0] | 4 (ARM64_VAS_1B)      | ✓ Works |
| umov w0, v1.h[0] | 203c020e          | umov w0, v1.h[0] | 8 (ARM64_VAS_1H)      | ✓ Works |
| umov w0, v1.s[0] | 203c040e          | mov w0, v1.s[0]  | 0 (ARM64_VAS_INVALID) | ✗ Fails |
| umov x0, v1.d[0] | 203c084e          | mov x0, v1.d[0]  | 0 (ARM64_VAS_INVALID) | ✗ Fails |

Note how Capstone decodes the last two as mov (the alias form) and returns ARM64_VAS_INVALID.

Proposed Fix

Add handling for ARM64_VAS_INVALID in the UMOV handler. When VAS is 0, the element size needs to be inferred from the instruction operands:

if vas == cs.arm64.ARM64_VAS_INVALID:  # value = 0
    # Handle MOV alias form - infer element size from operand
    # This commonly happens with .s and .d element specifiers
    op_str = insn.op_str  # insn: the decoded Capstone instruction (assumed to be in scope)
    if '.b[' in op_str:
        elem_size = 8
    elif '.h[' in op_str:
        elem_size = 16
    elif '.s[' in op_str:
        elem_size = 32
    elif '.d[' in op_str:
        elem_size = 64
    else:
        raise Aarch64InvalidInstruction
elif vas == cs.arm64.ARM64_VAS_1B:
    elem_size = 8
elif vas == cs.arm64.ARM64_VAS_1H:
    elem_size = 16
elif vas == cs.arm64.ARM64_VAS_1S:
    elem_size = 32
elif vas == cs.arm64.ARM64_VAS_1D:
    elem_size = 64
else:
    raise Aarch64InvalidInstruction
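
The same logic could also live in a small table-driven helper, which keeps the handler short and is easy to unit-test without Capstone installed. The sketch below is illustrative only: umov_elem_size and InvalidInstruction are hypothetical names, the VAS constants are hard-coded to the values reported in this issue for Capstone 5.0.0 rather than imported from capstone, and op_str is assumed to be the operand string Capstone produced for the instruction.

```python
import re

# VAS values as reported in this issue for Capstone 5.0.0. Assumption: they
# match cs.arm64.ARM64_VAS_* in the installed Capstone; real code should use
# those constants instead of literals.
VAS_INVALID, VAS_1B, VAS_1H, VAS_1S, VAS_1D = 0, 4, 8, 11, 13

_VAS_TO_BITS = {VAS_1B: 8, VAS_1H: 16, VAS_1S: 32, VAS_1D: 64}
_SUFFIX_TO_BITS = {"b": 8, "h": 16, "s": 32, "d": 64}

# Matches an indexed vector element such as ".s[0]" or ".d[1]".
_ELEM_RE = re.compile(r"\.([bhsd])\[\d+\]")


class InvalidInstruction(Exception):
    """Hypothetical stand-in for Aarch64InvalidInstruction."""


def umov_elem_size(vas: int, op_str: str) -> int:
    """Return the element size in bits for a UMOV/MOV source operand."""
    # Normal case: Capstone reported a usable arrangement specifier.
    if vas in _VAS_TO_BITS:
        return _VAS_TO_BITS[vas]
    # MOV alias case: fall back to the element suffix in the operand string.
    if vas == VAS_INVALID:
        match = _ELEM_RE.search(op_str.lower())
        if match:
            return _SUFFIX_TO_BITS[match.group(1)]
    raise InvalidInstruction(f"cannot determine element size: vas={vas}, op_str={op_str!r}")
```

With this helper, the handler would reduce to a single call such as elem_size = umov_elem_size(vas, op_str), and any other VAS value still raises as before.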

Alternatively, the element size could be inferred from the destination register size and instruction encoding.
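
A minimal sketch of the encoding-based approach follows. In the A64 encoding of UMOV, the imm5 field occupies bits 20:16 and its lowest set bit selects the element size (xxxx1 = byte, xxx10 = halfword, xx100 = word, x1000 = doubleword). The function name umov_elem_size_from_encoding is hypothetical, and the sketch assumes the caller already knows the four bytes are a UMOV instruction; it does not check the opcode bits or that the Q bit is consistent with the element size.

```python
def umov_elem_size_from_encoding(insn_bytes: bytes) -> int:
    """Return the element size in bits encoded in a UMOV instruction word.

    Assumes insn_bytes holds one little-endian A64 UMOV instruction; the
    opcode itself is not validated here.
    """
    if len(insn_bytes) != 4:
        raise ValueError("A64 instructions are exactly 4 bytes")

    word = int.from_bytes(insn_bytes, "little")
    imm5 = (word >> 16) & 0x1F  # bits 20:16

    # Isolate the lowest set bit of imm5: 1 -> B, 2 -> H, 4 -> S, 8 -> D.
    lowest = imm5 & -imm5
    if lowest not in (1, 2, 4, 8):
        raise ValueError(f"reserved imm5 value: {imm5:#07b}")
    return 8 * lowest


# The four encodings from the verification table above.
for encoding in ("203c010e", "203c020e", "203c040e", "203c084e"):
    print(encoding, umov_elem_size_from_encoding(bytes.fromhex(encoding)))
```

This avoids depending on Capstone's textual output, at the cost of the handler needing access to the raw instruction bytes.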

Impact

  • 4 test methods fail completely (2 in Aarch64CpuInstructions, 2 in Aarch64SymInstructions)
  • Each test method contains multiple test cases, affecting coverage of UMOV/MOV instructions
  • This blocks proper testing of vector-to-general register moves for 32-bit and 64-bit elements

Additional Notes

  • The issue only affects certain forms of UMOV that Capstone decodes as MOV aliases
  • The byte and halfword variants (umov w0, v1.b[i], umov w0, v1.h[i]) work correctly
  • This appears to be a quirk in how Capstone handles the MOV alias form of UMOV instructions

dguido, Aug 22 '25 16:08