capstone
capstone copied to clipboard
Increased size of cs_isn.bytes to 33 to be able to hold big EVM instructions
The EVM arch has a couple of really big instructions, with the biggest one being PUSH32 (with 33 bytes size). Currently cs_inst can only hold 16 bytes of raw instruction data, which is not enough for EVM. This results in the bytes being truncated to 16 bytes. This didn't really affect the disassembly or the op_str, but if you use cs_inst.size or cs_inst.bytes afterwards, it would return wrong data for those big instructions (PUSH15 up to PUSH32).
33 looks weird though, should we round the size up to some other number?
I don't know. It already feels like a lot of wasted memory for all the other archs.
Maybe it should also be surrounded by #ifdef HAS_EVM and fall back to the 16 byte size?
I don't know. It already feels like a lot of wasted memory for all the other archs. Maybe it should also be surrounded by
#ifdef HAS_EVMand fall back to the 16 byte size?
it's a good idea, can you do this continue?
masterbranch is not really used, new development should go intonextbranch (#1759)- Changing the size of a field in
cs_insnis an ABI break, meaning a change like this should only go intonextbranch (and not backported to older branches)- As a consequence, I don't think we want the memory layout of
cs_insnto change as a result of something likeHAS_EVM. Otherwise, the ABI used in alibcapstone.sowould depend on whether the library was compiled withHAS_EVM. Any headers distributed headers and and all language bindings would need to know how the library was compiled.
- As a consequence, I don't think we want the memory layout of
masterbranch is not really used, new development should go intonextbranch (unable to disassemble f3 48 0f 1e c8 (rdsspq rax) in Ubuntu 20.04 #1759)- Changing the size of a field in
cs_insnis an ABI break, meaning a change like this should only go intonextbranch (and not backported to older branches)
yes, plz rebase your branch to 'next'. 'master' is not accept to PR
- As a consequence, I don't think we want the memory layout of
cs_insnto change as a result of something likeHAS_EVM. Otherwise, the ABI used in alibcapstone.sowould depend on whether the library was compiled withHAS_EVM. Any headers distributed headers and and all language bindings would need to know how the library was compiled.
@aquynh any thoughts?
I rebased to next.
i hesitate on this PR since it would break the API :-(
cs_isn.bytes size is internal only no ? why it would break the API ?
Alternatively: Would it make sense to move the opcode constant to the cs_evm struct?
cs_isn.bytessize is internal only no ? why it would break the API ?
cs_insn is in capstone.h, which is a public header. Changing the bytes field changes the size of cs_insn which is an ABI break.
Will it make more sense to declare the field bytes to the very end of struct cs_insn as a dynamic-length array? Something like:
struct cs_insn {
// other fields..
uint32_t len_bytes; // length of "bytes"
uint8_t bytes[];
};
Or alternatively declaring bytes as a pointer just like the detail field.
The length changing of bytes was the reason that Capstone has to bump from 4 to 5 (see https://github.com/capstone-engine/capstone/issues/1315#issuecomment-458441705), maybe it's better to consider a solution instead of bumping major versions because of "minor" fixes.
The length changing of
byteswas the reason that Capstone has to bump from 4 to 5 (see #1315 (comment)), maybe it's better to consider a solution instead of bumping major versions because of "minor" fixes.
Adding an extra pointer indirection/allocation is less efficient but could help maintain ABI compatibility. It depends on what we want to do.
Thank you for the PR! I closed it because it is out of date. With the new auto-sync update for v6 we made many changes to some main architectures and will do also to others. This also changed the requirements we have now for new PRs.
If you still want to merge the changes, please rebase your fix onto the newest next branch and open a new PR.