Add access support for RISC-V
Your checklist for this pull request
- [x] I've documented or updated the documentation of every API function and struct this PR changes.
- [x] I've added tests that prove my fix is effective or that my feature works (if possible)
Detailed description
As indicated in PR #2003, RISC-V registers and memory operands had no read/write information. In other words, there were no access fields in cs_riscv_op. This PR added read/write information support for risc-v operands, aka (cs_riscv_op.access). This PR also fixed some RISC-V C (compressed) instruction bugs on the way.
Operand read/write details are located in arch/RISCV/RISCVMappingInsnOp.inc. Since the current LLVM support for RISC-V is not ideal, and at the mean time the RISC-V instruction set is relatively small, instruction operand details were manually added to arch/RISCV/RISCVMappingInsnOp.inc.
[!NOTE] I was recently acknowledged that the RISC-V generator will be switched to a SAIL-based one. Therefore, instruction details in
arch/RISCV/RISCVMappingInsnOp.incis merely a temporary solution. This file may be replaced once the SAIL generator is implemented.
Instruction details are added in RISCV_add_cs_detail and invoked in printOperand. Memory address and offset adjustments were rewritten in fixDetailOfEffectiveAddr to support RISC-V C (compressed) instructions. RISCV_get_detail_op, RISCV_inc_detail_op_count, RISCV_dec_detail_op_count, and RISCV_get_arch_detail are added to simplify the code.
A new function markCLSInsn marks RISC-V C (compressed) load/store instructions. Similar to markLSInsn, markCLSInsn is invoked in RISCVDisassembler_getInstruction.
Other bug fixes / new features include
- Fix a RISC-V C (compressed) instruction bug (#2351). That is,
cstool -d riscv64 88c5now returns the correct operand type:
0 88 c5 c.sw a0, 8(a1)
ID: 131 (c.sw)
op_count: 2
operands[0].type: REG = a0
operands[0].access: READ
operands[1].type: MEM
operands[1].mem.base: REG = a1
operands[1].mem.disp: 0x8
operands[1].access: WRITE
Groups: hasStdExtC
- Memory base & displacement are supported now for load/store RISC-V C (compressed) instructions. For example,
cstool -d riscv64 8865now should return
0 88 65 c.ld a0, 8(a1)
ID: 115 (c.ld)
op_count: 2
operands[0].type: REG = a0
operands[0].access: WRITE
operands[1].type: MEM
operands[1].mem.base: REG = a1
operands[1].mem.disp: 0x8
operands[1].access: READ
Groups: hasStdExtC isrv64
Instead of something like
0 88 65 c.ld a0, 8(a1)
ID: 115 (c.ld)
op_count: 2
operands[0].type: REG = a0
operands[1].type: IMM = 8
operands[2].type: REG = a1
Groups: hasStdExtC isrv64
...
Test plan
Test for cs_riscv_op->access is added in tests/test_riscv.c. After running tests/test_riscv, you should see operands[i].access: READ / WRITE.
More instructions can be teseted through cstool.
...
Closing issues
"closes #2003" "closes #2351"
...
Also, I would advice you to build with:
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCAPSTONE_BUILD_CSTEST=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=1 -DENABLE_ASAN=1
With ASAN and building cstest you get the most coverage
Very nice! Thanks a lot!
How did you generate the
InsnOp.inctable? Did you use the legacy scripts or Auto-Sync? In the last case: are there any changes you made which we could merge as well?Also, please see my reply in the Telegram community channel.
Hi:
-
So the instruction details in
RISCVMappingInsnOp.incis written by hand. I did not use any legacy scripts or auto-sync (and that's part of the reason I use the nameRISCVMappingInsnOp.incnotRISCVGenCSMappingInsnOp.inc). In my opinion, the generator may not deal with some subtleties in RISC-V. For example, RISC-V compression instructions are handled inconsistently (e.g.,c.additakes 2 operands, butRISCVGenDisassemblerTables.incwants 3), which makes manual editing unavoidable. -
I've just saw your reply on Telegram. Thanks for the detailed reply! Although it may take years to add SAILS support, the current Capstone RISC-V disassembler works like charm. There may be some edge cases, but overall it works pretty well. We'll be working on fixing these bugs and contributing to Capstone while developing my project.
Also, I would advice you to build with:
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCAPSTONE_BUILD_CSTEST=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=1 -DENABLE_ASAN=1With ASAN and building cstest you get the most coverage
Thanks for the command! I will try to build it with the above command, fix some bugs, and make the checks work.
We'll be working on fixing these bugs and contributing to Capstone while developing my project.
Very much appreciate this! Especially since the v6 release is coming up, this is a really nice addition.
We have to trigger the CI every time when you push (because we want to stay within the CI ranges, which are free). So I'd suggest you try to build everything locally first. I'll be online for the next few hours and trigger it when I see it, but probably forget it sometimes.
We have to trigger the CI every time when you push (because we want to stay within the CI ranges, which are free). So I'd suggest you try to build everything locally first. I'll be online for the next few hours and trigger it when I see it, but probably forget it sometimes.
All right. I can build it locally now.
New change:
- Fix incorrect operands for the csr instruction and the cs.w (and its variants) instruction.
- Add Python binding
- Add new test cases for M,A,F,D,C instructions
=================================================================
==62==ERROR: AddressSanitizer: ABRT on unknown address 0x00000000003e (pc 0x7f882003500b bp 0x7f88201aa588 sp 0x7ffebc63ac00 T0)
SCARINESS: 10 (signal)
#0 0x7f882003500b in raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
#1 0x7f8820014858 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x22858) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
#2 0x7f8820014728 (/lib/x86_64-linux-gnu/libc.so.6+0x22728) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
#3 0x7f8820025fd5 in __assert_fail (/lib/x86_64-linux-gnu/libc.so.6+0x33fd5) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
#4 0x562e678fd14b in RISCV_add_cs_detail /src/capstonenext/arch/RISCV/RISCVMapping.c:171:3
#5 0x562e678fba44 in printOperand /src/capstonenext/arch/RISCV/RISCVInstPrinter.c:274:4
#6 0x562e678f2a82 in printInstruction /src/capstonenext/arch/RISCV/RISCVGenAsmWriter.inc:1208:5
#7 0x562e678f2a82 in RISCV_printInst /src/capstonenext/arch/RISCV/RISCVInstPrinter.c:250:7
#8 0x562e677b51fd in cs_disasm /src/capstonenext/cs.c:1241:4
#9 0x562e677b2d41 in LLVMFuzzerTestOneInput /src/capstonenext/suite/fuzz/fuzz_disasm.c:56:20
#10 0x562e67665470 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:614:13
#11 0x562e67664c95 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:516:7
#12 0x562e67666c22 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:829:7
#13 0x562e67666f57 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:867:3
#14 0x562e67655566 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:914:6
#15 0x562e67681a92 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#16 0x7f8820016082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
#17 0x562e676466dd in _start (build-out/fuzz_disasmnext+0x5276dd)
DEDUP_TOKEN: raise--abort--
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: ABRT (/lib/x86_64-linux-gnu/libc.so.6+0x4300b) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e) in raise
==62==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000
0x2d,0x73,0x0,0x15,0x13,0x0,0xd3,0x0,0x0,0xa6,0xef,
-s\000\025\023\000\323\000\000\246\357
This error is caused by an insufficient number of operands in sfence.vma.
Similar bug occurs in fence.
Both should be fixed now.
Hi,
May I ask what is the purpose of test/cs_detail/issue.cs?
What test cases should I add to it?
Thank you!
It tests the details. In your case the READ/WRITE primitives.
The reason I ask you is because I currently modernize testing and will translate those tests to yaml. The tests in test_ARCH.... as well at some point. But they are more annoying to copy.
Also, the tests you added only ensure decoding. Not that the decoded values are correct. If it prints something wrong, the CI doesn't care.
Added tests intest/cs_detail/issue.cs
Also rebased to next
@kabeor @aquynh please take a look.
@moste00 FYI
It looks nice, but plz solve the conflict file:)
@kabor Resolved the conflict