binaryninja-api
binaryninja-api copied to clipboard
Cannot change constants to pointers in the code (using O)
Version and Platform (required):
- Binary Ninja Version: 3.5.4286-dev, 58311b01
- OS: macos
- OS Version: 13.3
- CPU Architecture: arm64
I would like to change 0x2004a7a8
, 0x2004a698
, 0x2004a69c
to pointers but I can't.
Note also that 0x2004a69c
is a pointer at 0x1348a
.
00013428 int32_t sub_13428(int32_t arg1, int32_t arg2, int32_t* arg3)
🚫00013434 unimplemented {vpush {d8, d9}}
0001343c int32_t r0_2
0001343c void* r6_1
0001343c if (arg2 u<= 1 && arg3 != 0)
0001344c r6_1 = arg2 * 0x208
00013456 r0_2 = (*(sub_6cdd4() + 0x80))(r6_1 + 0x2004a7a8)
00013458 if (r0_2 != 0)
00013468 (*(setup_log() + 0xc))(0, "[ASSERT]%s @ %s @ %d", "get_fmu_dm()->lock_mutex(&barome…", "write_data_to_barometer", 0x386)
00013458 int32_t r4_2
00013458 if ((arg2 u<= 1 && arg3 != 0 && r0_2 != 0) || arg2 u> 1 || (arg2 u<= 1 && arg3 == 0))
00013470 r4_2 = 0xffffffff
00013458 if (arg2 u<= 1 && arg3 != 0 && r0_2 == 0)
0001347a if (arg1 != 0x20)
0001370c r4_2 = 0xffffffff
00013482 else
00013482 int32_t r1 = arg3[1]
00013482 int32_t r2 = arg3[2]
0001348a *(r6_1 + 0x2004a698) = *arg3
0001348a *(r6_1 + &data_2004a69c) = r1
0001348a *(r6_1 + &data_2004a6a0) = r2
0001349c int128_t q0_1
0001349c int128_t q1_1
0001349c int64_t q2_1
0001349c q0_1, q1_1, q2_1 = sub_7a76c(arg2, r6_1 + 0x2004a698, r6_1 + 0x2004a69c)
000134a0 uint32_t r3_3 = zx.d(*(r6_1 + &data_2004a855))
My first guess was to type r6_1
as an int pointer. This results in weird array references based on r6_1
, though.
🚫00013434 unimplemented {vpush {d8, d9}}
0001343c int32_t r0_2
0001343c int32_t* r6_1
0001343c if (arg2 u<= 1 && arg3 != 0)
0001344c r6_1 = arg2 * 0x208
00013456 r0_2 = (*(sub_6cdd4() + 0x80))(&r6_1[0x80129ea])
00013458 if (r0_2 != 0)
00013468 (*(setup_log() + 0xc))(0, "[ASSERT]%s @ %s @ %d", "get_fmu_dm()->lock_mutex(&barome…", "write_data_to_barometer", 0x386)
00013458 int32_t r4_2
00013458 if ((arg2 u<= 1 && arg3 != 0 && r0_2 != 0) || arg2 u> 1 || (arg2 u<= 1 && arg3 == 0))
00013470 r4_2 = 0xffffffff
00013458 if (arg2 u<= 1 && arg3 != 0 && r0_2 == 0)
0001347a if (arg1 != 0x20)
0001370c r4_2 = 0xffffffff
00013482 else
00013482 int32_t r1 = arg3[1]
00013482 int32_t r2 = arg3[2]
0001347e r6_1[0x80129a6] = *arg3
0001347e r6_1[0x80129a7] = r1
0001347e r6_1[0x80129a8] = r2
0001349c int128_t q0_1
0001349c int128_t q1_1
0001349c int64_t q2_1
0001349c q0_1, q1_1, q2_1 = sub_7a76c(arg2, &r6_1[0x80129a6], &r6_1[0x80129a7])
Same issue here
000137d0 if (zx.d(*((r2_1 << 3) + 0xf49d8)) == zx.d(var_7c))
and here where not all the values in the 0x200xxxxx
range can be changed to pointers
00014ce8 void* sub_14ce8(uint32_t arg1, int32_t arg2, int32_t arg3, int16_t arg4, int64_t arg5 @ q0)
e.g.
00014d44 if (zx.d(*(0x438 * r6 + 0x2004b596)) << 0x1f s< 0)
Anyone from V35 should search for "Encourage Salesman Prompt Delay" to find the database.
Also, see the arbitrary treatment in sub_17d38
below. All the vars are integers.
00018358 *(r5_1 + 0x200500a0) = var_128
00018358 int32_t var_124
00018358 *(r5_1 + &data_200500a4) = var_124
00018358 *(r5_1 + &data_200500a8) = var_120
00018358 *(r5_1 + &data_200500ac) = var_11c
00018360 *(r5_1 + 0x200500b0) = var_118
00018360 *(r5_1 + &data_200500b4) = var_114
00018360 *(r5_1 + &data_200500b8) = var_110
00018360 *(r5_1 + &data_200500bc) = var_10c
00018368 *(r5_1 + 0x200500c0) = var_108
00018368 int32_t var_104
00018368 *(r5_1 + &data_200500c4) = var_104
00018368 *(r5_1 + &data_200500c8) = var_100
00018368 int32_t var_fc
00018368 *(r5_1 + &data_200500cc) = var_fc
00018370 *(r5_1 + 0x200500d0) = var_f8
also a whole bunch of other places in the same function.
I've been trying to track this down, and there are a few things I'd like to document at this point:
- The
o
key does not make something a pointer. It makes the display type a pointer. It's the equivalent of right-clicking and saying Display As -> Pointer, but as a hotkey. - When you hit the
o
key on these values, and they aren't already a data variable, it will create avoid
one for you. This is accurate and what we expect to happen. - If you are not in HLIL, this also changes the pointer display type correctly and displays you the thing you expect.
- If, however, you are in HLIL, it may work, or may not, and this is actually "the bug".
I don't actually know what "the bug" is, but here are my observations:
- If you hit
o
on 0x1348a in HLIL, it goes into the map of integer display types (inBNSetIntegerConstantDisplayType
) with that address. - This does not change the HLIL display.
- If you switch to MLIL and hit
o
on 0x1347e, it goes into the map of integer display types with that address. - This does change the HLIL display.
Also, consider the following:
>>> current_function.get_int_display_type(0x1348a, 0x2004a698, 0xffffffff)
<IntegerDisplayType.DefaultIntegerDisplayType: 0>
>>> current_function.get_int_display_type(0x1347e, 0x2004a698, 0xffffffff)
<IntegerDisplayType.DefaultIntegerDisplayType: 0>
>>> current_function.get_int_display_type(0x1347e, 0x2004a658, 0xffffffff)
<IntegerDisplayType.PointerDisplayType: 9>
>>> current_function.get_int_display_type(0x1348a, 0x2004a658, 0xffffffff)
<IntegerDisplayType.DefaultIntegerDisplayType: 0>
I thought "the bug" was that, when HLIL is determining how to display this value, it looks up the address and value of the thing it came from in MLIL, and uses that display type. But, when you hit o
and try to set this, it uses the address from HLIL instead of the one it will actually look up. But, the problem with that theory is that I would have expected the last line above to also show PointerDisplayType
, and it does not.
So, in conclusion: There is a mismatch between what goes into the integer display type map, and what we look up in that map when displaying values in HLIL. This sometimes works out, and sometimes doesn't, and is likely the root cause of a whole host of HLIL display issues. I just don't entirely understand where this mismatch is occurring or how to fix it.
(Note: The reason HLIL shows 98
and MLIL shows 58
is because of the addition of 0x40 at 0x13486. So, although these two locations appear to be unrelated, HLIL is showing the calculated value of [tmp0_2].d
.)
I have a potential fix for this but its kinda wide reaching and needs a bunch of testing to be sure. Essentially we need a case within our MLIL translator that is able to coerce constants to pointers if they point to Functions (or potentially DataVariables) Though it is important that we ensure that the constant not be a simple intermediate, as this can cause problems in RISK like architectures where you have a load/store that requires multiple instructions to accomplish.
Also, see #4336