ghidra
Data structures allowing variable sized member arrays
Is your feature request related to a problem? Please describe.
When a structure is defined with its last element(s) as variable-sized array(s), as in the simple case of the struct type_info used to represent a C++ class, assigning that type to memory in the data segment gives an incomplete result, because the datatype manager only allows a static definition of the type. Taking type_info as an illustration, the structure is defined as:
struct type_info {
type_info_vtable _m_vtable;
void * _m_data;
char _m_d_name[1];
};
meaning that when it is assigned to some data where the name contains 17 elements (chars in this case), you see something akin to:
1000b030 5c a0 00 type_info
10 00 00
00 00 2e
1000b030 5c a0 00 10 type_inf type_info_vftable _vftable XREF[1]: 1000acb4(*)
1000b034 00 00 00 00 void * 00000000 _m_data
1000b038 2e 3f 41 56 char[1] "." _m_d_name
1000b038 [0] '.'
1000b03c 62 ?? 62h b
1000b03d 61 ?? 61h a
1000b03e 64 ?? 64h d
1000b03f 5f ?? 5Fh _
1000b040 74 ?? 74h t
1000b041 79 ?? 79h y
1000b042 70 ?? 70h p
1000b043 65 ?? 65h e
1000b044 69 ?? 69h i
1000b045 64 ?? 64h d
1000b046 40 ?? 40h @
1000b047 40 ?? 40h @
1000b048 00 ?? 00h
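The size mismatch in the listing above can be reproduced in plain C. The following is a minimal sketch, not Ghidra code: `type_info32` and `type_info32_instance_size` are hypothetical names, and the two pointers are modelled as `uint32_t` on the assumption that the target is 32-bit, as the addresses in the listing suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 32-bit layout shown in the listing.
   Pointers are modelled as uint32_t so the offsets match on any host. */
struct type_info32 {
    uint32_t vftable;  /* 1000b030: pointer to the vftable */
    uint32_t data;     /* 1000b034: void * _m_data */
    char     name[1];  /* 1000b038: declared with 1 element, longer in memory */
};

/* Bytes one instance really occupies: the fixed header plus the
   NUL-terminated name that follows it. */
static size_t type_info32_instance_size(const char *name)
{
    assert(name != NULL);
    return offsetof(struct type_info32, name) + strlen(name) + 1;
}
```

For the name ".?AVbad_typeid@@" (16 characters plus the terminating NUL, i.e. 17 elements) this gives 8 + 17 = 25 bytes, which covers 1000b030 through 1000b048, whereas the static definition only accounts for a single name element.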
Describe the solution you'd like
With the ability to parameterise the number of elements in each array within the datatype, the same type could be reused in multiple places, which is more appropriate. For the example above you would then see:
1000b030 5c a0 00 type_info
10 00 00
00 00 2e
1000b030 5c a0 00 10 type_inf type_info_vftable _vftable XREF[1]: 1000acb4(*)
1000b034 00 00 00 00 void * 00000000 _m_data
1000b038 2e 3f 41 56 62 char[17] ".?AVbad_typeid@@" _m_d_name
61 64 5f 74 79
70 65 69 64 40...
Notice that the 3 characters ?AV are completely missing from the initial data display, because only a single character is shown and the data at 1000b039-1000b03b is ignored.
Describe alternatives you've considered
Using a size of 0 for the member: this breaks other disassembly, because references to the zero-sized member become references to the next element of an array of (in this case) type_info structures, which does not exist as there is only a single instance. You can then have the correctly sized array of elements, but it becomes disjoint from its owner.
Using a size of 1 for the member: this solves the disassembly issue above, but you are left with the problem illustrated in this request.
Using multiple copies of the datatype: this has consistency problems, etc.
Changing the packing of the structure solved the problem of the 3 missing characters ?AV!
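One plausible reading of why packing matters is ordinary tail padding: with default alignment, a struct ending in char[1] is rounded up to a multiple of 4 bytes, so three bytes after the first name character belong to the struct as padding. The sketch below illustrates this in C under the assumption of 4-byte alignment for `uint32_t`; the struct and function names are hypothetical, and `#pragma pack` is the compiler extension accepted by GCC, Clang and MSVC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default alignment: the struct size is rounded up to a multiple of 4. */
struct ti_aligned {
    uint32_t vftable;
    uint32_t data;
    char     name[1];
};

/* Packed: no padding is added after the single name byte. */
#pragma pack(push, 1)
struct ti_packed {
    uint32_t vftable;
    uint32_t data;
    char     name[1];
};
#pragma pack(pop)

/* Number of padding bytes the unpacked layout adds after name[0]. */
static size_t ti_tail_padding(void)
{
    assert(sizeof(struct ti_aligned) >= sizeof(struct ti_packed));
    return sizeof(struct ti_aligned) - sizeof(struct ti_packed);
}
```

On a typical ABI the unpacked struct is 12 bytes and the packed one is 9, so the difference of 3 bytes matches the range 1000b039-1000b03b that was hidden in the original listing.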
A trailing flexible array member is expected to be declared with a 0-element count. In addition, it is best to enable packing on the structure when this is done. Any references to the member will be treated as a reference beyond the structure bounds. Below is an example which shows both the listing and structure editor for a similar case:
The decompiler will not render as a reference the last zero-length structure member (e.g., name) since its offset falls outside the bounds of the structure. It would require special logic within the decompiler to recognize as a structure member access.
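The offset argument can be illustrated with a C99 flexible array member, which is the C counterpart of a 0-element trailing array. This is only a sketch of the layout arithmetic, not of the decompiler's logic; `ti_flex`, `ti_flex_name_offset` and `offset_in_bounds` are hypothetical names, and pointers are again modelled as `uint32_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct ti_flex {
    uint32_t vftable;
    uint32_t data;
    char     name[];  /* C99 flexible array member: contributes no size */
};
#pragma pack(pop)

/* Offset of the trailing member; with packing it equals the struct size. */
static size_t ti_flex_name_offset(void)
{
    size_t off = offsetof(struct ti_flex, name);
    assert(off == sizeof(struct ti_flex));
    return off;
}

/* Returns 1 when a byte offset lies inside the struct's fixed extent. */
static int offset_in_bounds(size_t offset)
{
    return offset < sizeof(struct ti_flex);
}
```

Because the member starts exactly at sizeof(struct ti_flex), a bounds check of the form offset < size rejects it, which is consistent with a reference to name being treated as pointing past the end of the structure.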
That is the point (for this simple case). I believe some kind of special handling could be used to ensure that the decompiler sees that an array is specified and (perhaps via a user option) references that member.
Alternatively, use a size of 1 in the type definition so that the decompiler can see the member and reference it, and add a per-instance attribute to provide the actual size of the array member based upon its deployed location.
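The per-instance attribute idea can be sketched as data kept alongside each place the type is applied. Everything here is hypothetical and only models the proposal: `ti_decl` is the shared type with a 1-element array, and `ti_placement` records an address plus the real element count for that location. None of these names correspond to an existing Ghidra API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared type definition: the trailing array is declared with 1 element. */
struct ti_decl {
    uint32_t vftable;
    uint32_t data;
    char     name[1];
};

/* Hypothetical per-instance record: where the type is applied and how many
   elements the trailing array really has at that location. */
struct ti_placement {
    uint32_t address;
    size_t   name_count;
};

/* First address past the instance, using the per-instance count instead
   of the declared count of 1. */
static uint32_t ti_placement_end(const struct ti_placement *p)
{
    assert(p != NULL && p->name_count >= 1);
    return p->address
         + (uint32_t)(offsetof(struct ti_decl, name) + p->name_count);
}
```

For the instance at 1000b030 with a count of 17, the extent ends at 1000b049, so the whole name through the terminating 00 at 1000b048 would be shown as part of the structure while the type definition itself stays shared.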