tms320c64x: use more descriptive types in public header
The new types:
- Fix issues where the field was unsigned but was assigned negative values
- Make it easier to determine the semantics of fields
- Make it easier to maintain language bindings
Also flatten anonymous typed structs.
Size savings:
- cs_tms320c64x: 284 -> 240
- tms320c64x_op_mem: 28 -> 24
The following were unsigned ints but assigned negative values:
cs_tms320c64x.funit.crosspathcs_tms320c64x.parallel
I still need help determining the intended semantics, particularly on:
- [ ]
cs_tms320c64x.funit_crosspath- Given values -1, 0, 1, 2 but explicitly checked to have value 1 in
test_tms320c64x.c
- Given values -1, 0, 1, 2 but explicitly checked to have value 1 in
- [ ]
tms320c64x_op_mem.disp- Should this be signed?
- Should this be a tag value & union since
TMS320C64xInstPrinter.c:printMemOperand()switches onmode?
- [ ]
cs_tms320c64x.parallel- treated as a
boolintest_tms320c64x.c/cstool_tms320c64x.c
- treated as a
this will break all bindings.
@fotisl please ack.
this will break all bindings.
Yes. I didn't clarify, but this should not be merged to any other branch. Since the API needed to be updated, I figured I'd go all the way.
I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct. I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either. Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.
@fotisl thanks for the comments, my responses are inline:
I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct.
How does leaving the funit information in one struct make things more clear? This adds an unnecessary struct that we need to handle in the bindings. Also, the C compilers potentially need to add extra alignment bytes, which increases size.
I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either.
I personally find needing to describe fewer types is simpler/less error-prone.
Also, the anonymous structs declared in the header, have no name, which means
- C code cannot instantiate types like
cs_tms320c64x.conditionorcs_tms320c64x.funit. - Binding authors need to "invent" names for these types, which means the bindings types do not map directly to the C header.
Python example: https://github.com/aquynh/capstone/blob/10f5ecc92f9d23bd4372f9dce413456dadd9a082/bindings/python/capstone/tms320c64x.py#L25-L29
Rust example: https://github.com/capstone-rust/capstone-rs/blob/5044ace8022b1651d1b7c93d41ad56993e0225f5/capstone-sys/pre_generated/capstone.rs#L9333-L9348
Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.
I agree that "flattening" the structs is less important than assigning more descriptive types to the existing unsigned int types.
If there is consensus, then I'll revert the "flattening" changes and just keep the new types for the existing fields. However, I still need someone more familiar with the code/tms320c64x to determine if the types I used are correct (see initial comment in PR).
@fotisl thanks for the comments, my responses are inline:
I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct.
How does leaving the funit information in one struct make things more clear? This adds an unnecessary struct that we need to handle in the bindings. Also, the C compilers potentially need to add extra alignment bytes, which increases size.
Please check the MCInst struct. All operands are different MCOperand structs in this struct. I consider it the same thing, keep all functional unit information together.
I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either.
I personally find needing to describe fewer types is simpler/less error-prone.
Well, note my example above :) Some of the most integral structs of capstone already hold information like this together and not split it in a flat structure.
Also, the anonymous structs declared in the header, have no name, which means
- C code cannot instantiate types like
cs_tms320c64x.conditionorcs_tms320c64x.funit.
Where do you need to instantiate these types?
- Binding authors need to "invent" names for these types, which means the bindings types do not map directly to the C header.
Python example: capstone/bindings/python/capstone/tms320c64x.py
Lines 25 to 29 in 10f5ecc
class TMS320C64xCondition(ctypes.Structure): fields = ( ('reg', ctypes.c_uint), ('zero', ctypes.c_uint), ) Rust example: https://github.com/capstone-rust/capstone-rs/blob/5044ace8022b1651d1b7c93d41ad56993e0225f5/capstone-sys/pre_generated/capstone.rs#L9333-L9348
I don't get these arguments. For example, cs_mips has cs_mips_op structures in it, which in turn have mips_op_type, mips_reg and mips_op_mem structures. Same applies to many other structures. Why does this specific structure has to be flattened?
Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.
I agree that "flattening" the structs is less important than assigning more descriptive types to the existing
unsigned inttypes.If there is consensus, then I'll revert the "flattening" changes and just keep the new types for the existing fields. However, I still need someone more familiar with the code/tms320c64x to determine if the types I used are correct (see initial comment in PR).
Can you please make a table with all the proposed changes so that I can review it? Old type -> new type -> reason (e.g. possible values in the code)
Can you please make a table with all the proposed changes so that I can review it? Old type -> new type -> reason (e.g. possible values in the code)
Sure thing (a list was easier).
struct tms320c64x_op_mem
- all types were
unsigned int tms320c64x_reg base- printed as register in
test_tms320c64x.c
- printed as register in
uint16_t disp- Perhaps this should be another union type:
// Depends on value of tms320c64x_op_mem.disptype field union tms320c64x_mem_disp { uint16_t constant, tms320c64x_reg reg, } // Rename enum tms320c64x_mem_disp to tms320c64x_mem_disp_type- Printed as unsigned or as register in
test_tms320c64x.cdepending ondisptypeneighbor field
uint8_t unit- Assigned values like
TMS320C64X_FUNIT_DinTMS320C64xInstPrinter.c
- Assigned values like
bool scaledTMS320C64xDisassembler.c: only assigns 0 or 1 (because of& 1)TMS320C64xInstPrinter.c: checksif (scaled)
tms320c64x_mem_disp disptypeTMS320C64xInstPrinter.cassigns valuesTMS320C64X_MEM_DISP_*
tms320c64x_mem_dir directionTMS320C64xInstPrinter.cassigns valuesTMS320C64X_MEM_DIR_*
tms320c64x_mem_mod modifyTMS320C64xInstPrinter.cassigns valuesTMS320C64X_MEM_MOD_*
struct cs_tms320c64x_op
- all types were
unsigned int tms320c64x_reg reg- Register type
- Printed as register in
TMS320C64xInstPrinter.c
struct cs_tms320c64x
- all types were
unsigned int tms320c64x_reg condition.reg- Register type
- Printed as register in
TMS320C64xInstPrinter.c
bool condition.zeroTMS320C64xDisassembler.c: only assigned zero or one- Tested for whether it is equal to 1 with
TMS320C64xInstPrinter.c/test_tms320c64x.c
tms320c64x_funit funit.unitTMS320C64xInstPrinter.cassigns valuesTMS320C64X_FUNIT_*
uint8_t funit.sideTMS320C64xInstPrinter.conly assigns values 0, 1, 2int8_t funit.crosspathTMS320C64xDisassembler.conly assigns values -1, 0, 1, 2- However,
test_tms320c64x.cchecks iftms320c64x->funit.crosspath == 1- Is this a bug in the test program?
int8_t funit.paralleltest_tms320c64x.cchecks if parallel is 1, printing "true" or "false"TMS320C64xDisassembler.conly assigns -1, 0, 1- Is there a bug in
test_tms320c64x.c? Should it check if!= 0instead?
I'll update the PR to only change the types as listed above.
It will be easier to "flatten" types later.
@fotisl I changed the PR to only change the types. Does that look okay to you? There are still the unresolved questions in the original comment. Thanks!
@fotisl Any update?
@aquynh if this looks good to you, could you merge?
@fotisl, can you ack?
Thank you for the PR! I closed it because it is out of date. With the new auto-sync update for v6 we made many changes to some main architectures and will do also to others. This also changed the requirements we have now for new PRs.
If you still want to merge the changes, please rebase your fix onto the newest next branch and open a new PR.