decoded instruction size is bigger than essential
[1] The fully decoded instruction byte count is reported 1128 bytes on my build environment [2] This big decoded instruction size makes infeasible to fully decompile some real world EXE files into memory.,
For example, Unreal Engine 4 sample game has more than 10,000,000 instructions. If we fully decode those with Zydis and store it in memory, we end up spending 11280000000 bytes, which is more than 11 GB.
[3] The decoded instruction structure can have some weight loses. It might be a low hanging fruit. [4] Workaround
I was forced to discard decoded instructions, and then only keep Block Entry addresses and whenever I need decoded instruction info, I re-decoded the block fully.
However this caused significant dev cost and performance losses.
[5] For reference: Iced claims that they have 40 bytes of decoded instruction size https://github.com/icedland/iced/tree/master/src/rust/iced-x86
{
constexpr unsigned long total_size_of_fully_decoded_instruction = sizeof(ZydisDecodedInstruction) + sizeof(ZydisDecodedOperand) * ZYDIS_MAX_OPERAND_COUNT;
printf("total_size_of_fully_decoded_instruction: %d bytes\n", total_size_of_fully_decoded_instruction);
// prints 1128 bytes
}
It's better to copy off only the information you need into your own structure which gives you full control over what and how you store it. Also iced does not provide additional information such as regs write or read out of the box and has some extra API for this which is how its instruction structure is relatively small.
Hello, thanks for the amazing project!
[1] I believe it is reasonable to follow that structure in future versions of Zydis. It seems to be a better approach to have an external database or API structured this way - adhering to separation of responsibilities.
[2] "It's better to copy off only the information you need into your own structure which gives you full control over what and how you store it" makes sense. I actually tried that approach as well. However, I think this could motivate a revamp of this component in future versions. For example, there exists a concept of "invisible" operands, which can be considered as "auxiliary" information. This concept could be separated from operands for a cleaner structure. ZydisOperandActions is another piece of auxiliary information that is very useful, yet not always needed even for full re-encoding.
[3] Improving this would be universally beneficial to all users, so I'd like to keep this thread open! Thanks! :D
Hi @laie,
You are right that we could remove all meta-data from the ZydisDecodedInstruction/Operand structs and provide separate APIs to retrieve these, but reducing the struct sizes while still keeping a nice to use API is a complex task.
In the past, we already made some steps in this direction with the removal of ZydisDecodedOperands from the ZydisDecodedInstruction main struct.
In terms of low hanging fruits, there is really only the raw struct which we could move to ZydisDecoderContext. We just have to check a few places that are relying on some of the raw data (formatter, helper functions, etc.).
there exists a concept of "invisible" operands
You can chain these methods to decode only the instruction and all visible operands:
https://github.com/zyantific/zydis/blob/bffbb610cfea643b98e87658b9058382f7522807/include/Zydis/Decoder.h#L278-L280
https://github.com/zyantific/zydis/blob/bffbb610cfea643b98e87658b9058382f7522807/include/Zydis/Decoder.h#L308-L310
Simply set operand_count to decoded_inst.operand_count_visible.
Thanks! Didn't know I could decode visible operands only. Yes, making a long lasting public api with increasing use case demands is hard and foreseeing task.
Thank you for making this great library again!
tried this thing for reference:
[1] only using encoder-essential information [2] custom bit pack struct
//struct ZydisPackedEncoderOperand
//{
// ZydisOperandType type : ZYDIS_OPERAND_TYPE_REQUIRED_BITS;
// ZyanU8 imm_intended_maxsize; // intended size in byte
// union
// {
// struct ZydisEncoderOperandReg_
// {
// ZydisRegister value : ZYDIS_REGISTER_REQUIRED_BITS;
// ZyanBool is4 : 1;
// } reg;
// struct ZydisEncoderOperandMem_
// {
// ZydisRegister base : ZYDIS_REGISTER_REQUIRED_BITS;
// ZydisRegister index : ZYDIS_REGISTER_REQUIRED_BITS;
// ZyanU8 scale;
// ZyanI64 displacement;
// ZyanU16 size;
// } mem;
// struct ZydisEncoderOperandPtr_
// {
// ZyanU16 segment;
// ZyanU32 offset;
// } ptr;
// union ZydisEncoderOperandImm_
// {
// ZyanU64 u;
// ZyanI64 s;
// } imm;
// };
//};
//struct EncodableInstruction
//{
// ZydisMachineMode machine_mode : ZYDIS_MACHINE_MODE_REQUIRED_BITS = ZYDIS_MACHINE_MODE_LONG_64;
// ZydisEncodableEncoding allowed_encodings : ZYDIS_ENCODABLE_ENCODING_REQUIRED_BITS = ZYDIS_ENCODABLE_ENCODING_DEFAULT;
// ZydisMnemonic mnemonic : ZYDIS_MNEMONIC_REQUIRED_BITS = ZYDIS_MNEMONIC_INVALID;
// ZydisInstructionAttributes prefixes = 0;
// ZydisBranchType branch_type : ZYDIS_BRANCH_TYPE_REQUIRED_BITS;
// ZydisBranchWidth branch_width : ZYDIS_BRANCH_WIDTH_REQUIRED_BITS;
// ZydisAddressSizeHint address_size_hint : ZYDIS_ADDRESS_SIZE_HINT_REQUIRED_BITS;
// ZydisOperandSizeHint operand_size_hint : ZYDIS_OPERAND_SIZE_HINT_REQUIRED_BITS;
// ZyanU8 operand_count : ZYAN_BITS_TO_REPRESENT(ZYDIS_ENCODER_MAX_OPERANDS);
// ZydisPackedEncoderOperand operands[ZYDIS_ENCODER_MAX_OPERANDS];
// ZydisBroadcastMode evex_broadcast : ZYDIS_BROADCAST_MODE_REQUIRED_BITS;
// ZydisRoundingMode evex_rounding : ZYDIS_ROUNDING_MODE_REQUIRED_BITS;
// ZyanBool evex_sae : 1;
// ZyanBool evex_zeroing_mask : 1;
// ZydisBroadcastMode mvex_broadcast : ZYDIS_BROADCAST_MODE_REQUIRED_BITS;
// ZydisConversionMode mvex_conversion : ZYDIS_CONVERSION_MODE_REQUIRED_BITS;
// ZydisRoundingMode mvex_rounding : ZYDIS_ROUNDING_MODE_REQUIRED_BITS;
// ZydisSwizzleMode mvex_swizzle : ZYDIS_SWIZZLE_MODE_REQUIRED_BITS;
// ZyanBool mvex_sae : 1;
// ZyanBool mvex_eviction_hint : 1;
// static EncodableInstruction From(const FullyDecodedInstruction& insn);
//};
//constexpr unsigned long size = sizeof(EncodableInstruction); // = 200 which is still substantially big..
I could reduce the struct size to 200 bytes, which is quite significant improvement from 1128 bytes, but I think some architectural work is needed to make < 100 size struct.
adding #pragma pack(push, 1) #pragma pack(pop) reduces it further down to 127 bytes which is pretty good
Yes, there is definitely some room for improvement.
The discussion about packed structs, bitfields etc. as well came up a few time already over the past years.
We haven't yet made a decision here. The difficulty is with bindings for languages that do not support custom alignment.
maybe one idea: ZydisDecoderDecodedInstructionToPackedVersion I see it is harder than it looks because zydis is written in C, to provide altered APIs like ZydisDecoderDecodeFullPacked
ps) maybe just use Packed version of the struct as main, and then provide non-packed version for compatibility and other lanauges