Autodetect `asm.cpu` whenever possible
ELF files built for the ARM Cortex-M profile are common, but the profile is not recorded in the ELF header, so the binary info shows no CPU:
iorw false
block 0x100
type EXEC (Executable file)
arch arm
cpu N/A
baddr 0x342a0000
binsz 0x00d01f0b
bintype elf
bits 32
class ELF32
compiler GCC:
dbg_file N/A
endian LE
hdr.csum N/A
guid N/A
intrp N/A
laddr 0x00000000
lang c++
machine ARM
maxopsz 4
minopsz 2
os linux
cc N/A
pcalign 2
rpath NONE
But the CPU profile can drastically affect analysis. For ARM Cortex-M, for example, the profile adds instructions and, being Thumb-only, changes how the instruction stream is disassembled.
We should find a way to detect Cortex-M ELFs whenever possible. Currently you have to specify the CPU on the command line:
$ rizin -A -e asm.cpu=cortexm firmware.elf
It would be nice to autodetect the cortexm/cortexa profiles whenever possible.
Quite often compilers add a special section, .ARM.attributes, that has that information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes):
> readelf -A cortex-a8.out
Attribute Section: aeabi
File Attributes
Tag_conformance: "2.10"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_PCS_config: Bare platform
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: small
Tag_ABI_VFP_args: compatible
Tag_CPU_unaligned_access: v6
Tag_DIV_use: Not allowed
> readelf -A cortex-m33.out
Attribute Section: aeabi
File Attributes
Tag_conformance: "2.10"
Tag_CPU_arch: v8-M.mainline
Tag_CPU_arch_profile: Microcontroller
Tag_THUMB_ISA_use: Yes
Tag_FP_arch: FPv5/FP-D16 for ARMv8
Tag_PCS_config: Bare platform
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: forced to int
Tag_ABI_HardFP_use: SP only
Tag_ABI_VFP_args: compatible
Tag_CPU_unaligned_access: v6
Tag_DIV_use: Not allowed
See https://stackoverflow.com/questions/70071681/how-can-i-know-if-an-elf-file-is-for-cortex-a-or-cortex-m for more information
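For reference, here is a minimal sketch of how the two tags could be pulled out of the raw .ARM.attributes bytes. It is not Rizin code: the ArmAttrs type and the arm_attrs_parse() name are hypothetical, the section is assumed to be little-endian, and the tag numbers and the "odd tags above 32 carry strings" rule are my reading of the Arm build-attributes addenda, so they should be checked against the spec before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag numbers as I understand them from the Arm build-attributes addenda. */
enum {
	TAG_FILE = 1, /* sub-subsection holding file-wide attributes */
	TAG_CPU_RAW_NAME = 4,
	TAG_CPU_NAME = 5,
	TAG_CPU_ARCH = 6,
	TAG_CPU_ARCH_PROFILE = 7,
	TAG_COMPATIBILITY = 32
};

/* Hypothetical result type, not a Rizin structure. */
typedef struct {
	uint64_t cpu_arch; /* Tag_CPU_arch, 0 if absent */
	char profile; /* 'A', 'R', 'M', 'S', or 0 if absent */
} ArmAttrs;

/* Assumption: little-endian ELF, so lengths are little-endian too. */
static uint32_t read_le32(const uint8_t *p) {
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one ULEB128 value from b[*pos..end); returns 0 on truncation. */
static int read_uleb(const uint8_t *b, size_t end, size_t *pos, uint64_t *out) {
	uint64_t v = 0;
	unsigned shift = 0;
	while (*pos < end) {
		uint8_t c = b[(*pos)++];
		if (shift < 64) {
			v |= (uint64_t)(c & 0x7f) << shift;
		}
		shift += 7;
		if (!(c & 0x80)) {
			*out = v;
			return 1;
		}
	}
	return 0;
}

/* Skip a NUL-terminated string; returns 0 if no terminator is found. */
static int skip_ntbs(const uint8_t *b, size_t end, size_t *pos) {
	const uint8_t *nul = memchr(b + *pos, 0, end - *pos);
	if (!nul) {
		return 0;
	}
	*pos = (size_t)(nul - b) + 1;
	return 1;
}

/* Walk the tag/value pairs of a "File Attributes" sub-subsection. */
static int parse_file_attrs(const uint8_t *b, size_t pos, size_t end, ArmAttrs *out) {
	while (pos < end) {
		uint64_t tag, val;
		if (!read_uleb(b, end, &pos, &tag)) {
			return 0;
		}
		/* String-valued tags: 4, 5, and odd tags above 32. */
		if (tag == TAG_CPU_RAW_NAME || tag == TAG_CPU_NAME || (tag > 32 && (tag & 1))) {
			if (!skip_ntbs(b, end, &pos)) {
				return 0;
			}
			continue;
		}
		if (!read_uleb(b, end, &pos, &val)) {
			return 0;
		}
		/* Tag_compatibility carries a flag followed by a vendor string. */
		if (tag == TAG_COMPATIBILITY && !skip_ntbs(b, end, &pos)) {
			return 0;
		}
		if (tag == TAG_CPU_ARCH) {
			out->cpu_arch = val;
		} else if (tag == TAG_CPU_ARCH_PROFILE) {
			out->profile = (char)val;
		}
	}
	return 1;
}

/* Returns 1 if the section was well-formed, 0 otherwise. */
static int arm_attrs_parse(const uint8_t *b, size_t len, ArmAttrs *out) {
	assert(b && out);
	memset(out, 0, sizeof(*out));
	if (len < 1 || b[0] != 'A') { /* format-version byte */
		return 0;
	}
	size_t pos = 1;
	while (len - pos >= 4) {
		uint32_t sub_len = read_le32(b + pos);
		if (sub_len < 4 || sub_len > len - pos) {
			return 0;
		}
		size_t sub_end = pos + sub_len, p = pos + 4;
		int is_aeabi = sub_end - p >= 6 && !memcmp(b + p, "aeabi", 6);
		if (!skip_ntbs(b, sub_end, &p)) { /* vendor name */
			return 0;
		}
		while (is_aeabi && sub_end - p >= 5) {
			uint32_t ss_len = read_le32(b + p + 1);
			if (ss_len < 5 || ss_len > sub_end - p) {
				return 0;
			}
			if (b[p] == TAG_FILE && !parse_file_attrs(b, p + 5, p + ss_len, out)) {
				return 0;
			}
			p += ss_len;
		}
		pos = sub_end;
	}
	return 1;
}
```

Called on the section contents, arm_attrs_parse() leaves profile at 0 when the tag is absent, so a caller can fall back to the current behaviour for ELFs that lack the section.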
The change probably belongs somewhere in librz/bin/format/elf/. See the get_cpu_mips() function in librz/bin/format/elf/elf_info.c as an example.
Hi. I would like to work on this issue. I think I have got an idea on how to resolve this.
Quite often compilers add a special section, .ARM.attributes, that has that information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes)
Hi. Just to be clear, is our intention simply to recognize the CPU profile (e.g. A, M, R) or the specific processor family (e.g. Cortex, Neoverse) that the ELF is expected to run on?
Based on what I have understood after reading through ARM's addenda to their ABI and the Wikipedia page listing ARM processors, it's quite clear that the "M" profile implies the Cortex-M processor family or a similar family (like SecurCore) that shares the same features.
However, the "A" CPU profile could imply either the Cortex-A family or the Neoverse family.
I noticed the following struct in librz/asm/p/asm_arm_cs.c:
RzAsmPlugin rz_asm_plugin_arm_cs = {
.name = "arm",
.desc = "Capstone ARM disassembler",
.cpus = "v8,cortexm,arm1176,cortexA72,cortexA8",
.platforms = "bcm2835,omap3430",
.features = "v8",
.license = "BSD",
.arch = "arm",
.bits = 16 | 32 | 64,
.endian = RZ_SYS_ENDIAN_LITTLE | RZ_SYS_ENDIAN_BIG,
.disassemble = &disassemble,
...
}
The cpus field is hard-coded to a specific processor (e.g. cortexA8) or a family (e.g. cortexm). How do I go about dealing with other families such as Neoverse?
@valdaarhun for now, detecting the profile is enough. But since Rizin's ARM decoding is based on Capstone, only these features make sense for autodetection (https://github.com/capstone-engine/capstone/blob/next/include/capstone/arm.h#L1638):
// Architecture-specific groups
// generated content <ARMGenCSFeatureEnum.inc> begin
// clang-format off
ARM_FEATURE_IsARM = 128,
ARM_FEATURE_HasV5T,
ARM_FEATURE_HasV4T,
ARM_FEATURE_HasVFP2,
ARM_FEATURE_HasV5TE,
ARM_FEATURE_HasV6T2,
ARM_FEATURE_HasMVEInt,
ARM_FEATURE_HasNEON,
ARM_FEATURE_HasFPRegs64,
ARM_FEATURE_HasFPRegs,
ARM_FEATURE_IsThumb2,
ARM_FEATURE_HasV8_1MMainline,
ARM_FEATURE_HasLOB,
ARM_FEATURE_IsThumb,
ARM_FEATURE_HasV8MBaseline,
ARM_FEATURE_Has8MSecExt,
ARM_FEATURE_HasV8,
ARM_FEATURE_HasAES,
ARM_FEATURE_HasBF16,
ARM_FEATURE_HasCDE,
ARM_FEATURE_PreV8,
ARM_FEATURE_HasV6K,
ARM_FEATURE_HasCRC,
ARM_FEATURE_HasV7,
ARM_FEATURE_HasDB,
ARM_FEATURE_HasVirtualization,
ARM_FEATURE_HasVFP3,
ARM_FEATURE_HasDPVFP,
ARM_FEATURE_HasFullFP16,
ARM_FEATURE_HasV6,
ARM_FEATURE_HasAcquireRelease,
ARM_FEATURE_HasV7Clrex,
ARM_FEATURE_HasMVEFloat,
ARM_FEATURE_HasFPRegsV8_1M,
ARM_FEATURE_HasMP,
ARM_FEATURE_HasSB,
ARM_FEATURE_HasDivideInARM,
ARM_FEATURE_HasV8_1a,
ARM_FEATURE_HasSHA2,
ARM_FEATURE_HasTrustZone,
ARM_FEATURE_UseNaClTrap,
ARM_FEATURE_HasV8_4a,
ARM_FEATURE_HasV8_3a,
ARM_FEATURE_HasFPARMv8,
ARM_FEATURE_HasFP16,
ARM_FEATURE_HasVFP4,
ARM_FEATURE_HasFP16FML,
ARM_FEATURE_HasFPRegs16,
ARM_FEATURE_HasV8MMainline,
ARM_FEATURE_HasDotProd,
ARM_FEATURE_HasMatMulInt8,
ARM_FEATURE_IsMClass,
ARM_FEATURE_HasPACBTI,
ARM_FEATURE_IsNotMClass,
ARM_FEATURE_HasDSP,
ARM_FEATURE_HasDivideInThumb,
ARM_FEATURE_HasV6M,
As rizin doesn't have a way to select particular features, only CPUs with sets of particular features are possible for now.
cc @Rot127
@valdaarhun if you check the disassemble() function in librz/asm/p/asm_arm_cs. you will see that only CS_MODE_MCLASS and CS_MODE_V8 are used. Thus, it's fine to detect just those for now.
I see. In that case, I'll just focus on these two classes.
Hi. The functions get_cpu_mips or get_cpu_arm in librz/bin/format/elf/elf_info.c simply print the CPU name. How do I get Rizin to actually make use of it before disassembly?
librz/arch/p/asm_arm_cs:disassemble() checks the value of a->cpu. I am guessing I need to find a way to set a->cpu to "cortexm" or "v8". But where is this actually set?
When Rizin is run with -e asm.cpu=cortexm, it calls rz_config_eval(). I think this sets the value in r->config. Should I use the same or a similar approach in get_cpu_arm()?
Hmm, I thought this value was used somewhere, my bad. OK, you need to pass it to the config somehow, yes. It should probably be done somewhere in librz/core/cbin.c
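One point worth pinning down before touching the config is precedence: a value given with -e asm.cpu=... should presumably win over anything autodetected. The sketch below illustrates only that rule with a mock; MockConfig, its user_set flag and apply_detected_cpu() are hypothetical stand-ins, not Rizin's config API, and how Rizin actually tracks whether a key was set by the user is not something this sketch claims to know.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the asm.cpu config entry; not a Rizin type. */
typedef struct {
	char asm_cpu[32];
	bool user_set; /* true when the value came from -e asm.cpu=... */
} MockConfig;

/* Apply an autodetected CPU name unless the user already chose one.
 * Returns true only when the config was actually changed. */
static bool apply_detected_cpu(MockConfig *cfg, const char *detected) {
	assert(cfg != NULL);
	if (!detected || !*detected || cfg->user_set) {
		return false; /* nothing detected, or explicit user choice wins */
	}
	size_t n = strlen(detected);
	if (n >= sizeof(cfg->asm_cpu)) {
		return false; /* name would not fit, leave config untouched */
	}
	memcpy(cfg->asm_cpu, detected, n + 1); /* copy including the NUL */
	return true;
}
```

With this rule, `rizin -A -e asm.cpu=cortexm firmware.elf` would keep behaving exactly as it does today, and detection would only fill in the value when none was given.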
Thank you for your response. I'll take a look at cbin.c.
@valdaarhun Sorry, I missed the mention above from @XVilka. It's fine if, for now, it can only check for armv8 or the M-profile. But please ensure it is easily extensible, so that when we add toggles for all the other CPU features (e.g. see the list above), it takes only minimal effort.
Ideally, implement your solution only for armv8 first and add the cortex-m toggle afterwards. That way you can check whether it is actually easy to add a feature.
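One way to keep this extensible is a small rule table, so that adding a CPU is a one-line change. The sketch below takes the decoded Tag_CPU_arch_profile and Tag_CPU_arch values as plain inputs. CpuRule and arm_cpu_from_attrs() are hypothetical names, and the value 14 for ARMv8-A is my reading of the Tag_CPU_arch table in the Arm ABI addenda, so it should be verified before use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One detection rule: hypothetical type, not part of Rizin. */
typedef struct {
	char profile; /* required Tag_CPU_arch_profile value */
	uint64_t min_arch; /* minimum Tag_CPU_arch value */
	const char *cpu; /* asm.cpu name to report */
} CpuRule;

/* Start with armv8 only; the M-profile rule is the one-line addition. */
static const CpuRule cpu_rules[] = {
	{ 'A', 14, "v8" }, /* assumption: 14 == v8-A in Tag_CPU_arch */
	{ 'M', 0, "cortexm" }, /* any M-profile architecture */
};

/* Returns the asm.cpu name of the first matching rule, or NULL. */
static const char *arm_cpu_from_attrs(char profile, uint64_t arch) {
	for (size_t i = 0; i < sizeof(cpu_rules) / sizeof(cpu_rules[0]); i++) {
		const CpuRule *r = &cpu_rules[i];
		assert(r->cpu != NULL); /* table sanity check */
		if (r->profile == profile && arch >= r->min_arch) {
			return r->cpu;
		}
	}
	return NULL; /* unknown combination: leave asm.cpu alone */
}
```

Returning NULL for anything the table does not cover keeps the current behaviour for unrecognized ELFs, and later feature toggles could extend CpuRule without touching the callers.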