capstone icon indicating copy to clipboard operation
capstone copied to clipboard

tms320c64x: use more descriptive types in public header

Open tmfink opened this issue 6 years ago • 11 comments

The new types:

  • Fix issues where the field was unsigned but was assigned negative values
  • Make it easier to determine the semantics of fields
  • Make it easier to maintain language bindings

Also flatten anonymous typed structs.

Size savings:

  • cs_tms320c64x: 284 -> 240
  • tms320c64x_op_mem: 28 -> 24

The following were unsigned ints but assigned negative values:

  • cs_tms320c64x.funit.crosspath
  • cs_tms320c64x.parallel

I still need help determining the intended semantics, particularly on:

  • [ ] cs_tms320c64x.funit_crosspath
    • Given values -1, 0, 1, 2 but explicitly checked to have value 1 in test_tms320c64x.c
  • [ ] tms320c64x_op_mem.disp
    • Should this be signed?
    • Should this be a tag value & union since TMS320C64xInstPrinter.c:printMemOperand() switches on mode?
  • [ ] cs_tms320c64x.parallel
    • treated as a bool in test_tms320c64x.c/cstool_tms320c64x.c

tmfink avatar Mar 17 '19 20:03 tmfink

this will break all bindings.

@fotisl please ack.

aquynh avatar Mar 18 '19 03:03 aquynh

this will break all bindings.

Yes. I didn't clarify, but this should not be merged to any other branch. Since the API needed to be updated, I figured I'd go all the way.

tmfink avatar Mar 18 '19 04:03 tmfink

I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct. I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either. Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.

fotisl avatar Mar 19 '19 10:03 fotisl

@fotisl thanks for the comments, my responses are inline:

I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct.

How does leaving the funit information in one struct make things more clear? This adds an unnecessary struct that we need to handle in the bindings. Also, the C compilers potentially need to add extra alignment bytes, which increases size.

I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either.

I personally find needing to describe fewer types is simpler/less error-prone.

Also, the anonymous structs declared in the header, have no name, which means

  1. C code cannot instantiate types like cs_tms320c64x.condition or cs_tms320c64x.funit.
  2. Binding authors need to "invent" names for these types, which means the bindings types do not map directly to the C header.

Python example: https://github.com/aquynh/capstone/blob/10f5ecc92f9d23bd4372f9dce413456dadd9a082/bindings/python/capstone/tms320c64x.py#L25-L29

Rust example: https://github.com/capstone-rust/capstone-rs/blob/5044ace8022b1651d1b7c93d41ad56993e0225f5/capstone-sys/pre_generated/capstone.rs#L9333-L9348

Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.

I agree that "flattening" the structs is less important than assigning more descriptive types to the existing unsigned int types.

If there is consensus, then I'll revert the "flattening" changes and just keep the new types for the existing fields. However, I still need someone more familiar with the code/tms320c64x to determine if the types I used are correct (see initial comment in PR).

tmfink avatar Mar 24 '19 17:03 tmfink

@fotisl thanks for the comments, my responses are inline:

I don't agree with some of the changes. For example, all information on the functional unit where the instruction is run and if it uses the crosspath, or the information on whether the instruction is conditional or not, should be in one struct.

How does leaving the funit information in one struct make things more clear? This adds an unnecessary struct that we need to handle in the bindings. Also, the C compilers potentially need to add extra alignment bytes, which increases size.

Please check the MCInst struct. All operands are different MCOperand structs in this struct. I consider it the same thing, keep all functional unit information together.

I don't think that it actually makes it easier to determine the semantics of the fields, but in contrast it is more difficult. Furthermore, are language bindings easier to maintain because some structs where removed? I don't think that this applies either.

I personally find needing to describe fewer types is simpler/less error-prone.

Well, note my example above :) Some of the most integral structs of capstone already hold information like this together and not split it in a flat structure.

Also, the anonymous structs declared in the header, have no name, which means

  1. C code cannot instantiate types like cs_tms320c64x.condition or cs_tms320c64x.funit.

Where do you need to instantiate these types?

  1. Binding authors need to "invent" names for these types, which means the bindings types do not map directly to the C header.

Python example: capstone/bindings/python/capstone/tms320c64x.py

Lines 25 to 29 in 10f5ecc

class TMS320C64xCondition(ctypes.Structure): fields = ( ('reg', ctypes.c_uint), ('zero', ctypes.c_uint), ) Rust example: https://github.com/capstone-rust/capstone-rs/blob/5044ace8022b1651d1b7c93d41ad56993e0225f5/capstone-sys/pre_generated/capstone.rs#L9333-L9348

I don't get these arguments. For example, cs_mips has cs_mips_op structures in it, which in turn have mips_op_type, mips_reg and mips_op_mem structures. Same applies to many other structures. Why does this specific structure has to be flattened?

Maybe just a type review would be better. In general, I don't think that making those changes is worth breaking the API.

I agree that "flattening" the structs is less important than assigning more descriptive types to the existing unsigned int types.

If there is consensus, then I'll revert the "flattening" changes and just keep the new types for the existing fields. However, I still need someone more familiar with the code/tms320c64x to determine if the types I used are correct (see initial comment in PR).

Can you please make a table with all the proposed changes so that I can review it? Old type -> new type -> reason (e.g. possible values in the code)

fotisl avatar Mar 26 '19 11:03 fotisl

Can you please make a table with all the proposed changes so that I can review it? Old type -> new type -> reason (e.g. possible values in the code)

Sure thing (a list was easier).

struct tms320c64x_op_mem

  • all types were unsigned int
  • tms320c64x_reg base
    • printed as register in test_tms320c64x.c
  • uint16_t disp
    • Perhaps this should be another union type:
    // Depends on value of tms320c64x_op_mem.disptype field
    union tms320c64x_mem_disp {
        uint16_t constant,
        tms320c64x_reg reg,
    }
    
    // Rename enum tms320c64x_mem_disp to tms320c64x_mem_disp_type
    
    • Printed as unsigned or as register in test_tms320c64x.c depending on disptype neighbor field
  • uint8_t unit
    • Assigned values like TMS320C64X_FUNIT_D in TMS320C64xInstPrinter.c
  • bool scaled
    • TMS320C64xDisassembler.c: only assigns 0 or 1 (because of & 1)
    • TMS320C64xInstPrinter.c: checks if (scaled)
  • tms320c64x_mem_disp disptype
    • TMS320C64xInstPrinter.c assigns values TMS320C64X_MEM_DISP_*
  • tms320c64x_mem_dir direction
    • TMS320C64xInstPrinter.c assigns values TMS320C64X_MEM_DIR_*
  • tms320c64x_mem_mod modify
    • TMS320C64xInstPrinter.c assigns values TMS320C64X_MEM_MOD_*

struct cs_tms320c64x_op

  • all types were unsigned int
  • tms320c64x_reg reg
    • Register type
    • Printed as register in TMS320C64xInstPrinter.c

struct cs_tms320c64x

  • all types were unsigned int
  • tms320c64x_reg condition.reg
    • Register type
    • Printed as register in TMS320C64xInstPrinter.c
  • bool condition.zero
    • TMS320C64xDisassembler.c: only assigned zero or one
    • Tested for whether it is equal to 1 with TMS320C64xInstPrinter.c/test_tms320c64x.c
  • tms320c64x_funit funit.unit
    • TMS320C64xInstPrinter.c assigns values TMS320C64X_FUNIT_*
  • uint8_t funit.side TMS320C64xInstPrinter.c only assigns values 0, 1, 2
  • int8_t funit.crosspath
    • TMS320C64xDisassembler.c only assigns values -1, 0, 1, 2
    • However, test_tms320c64x.c checks if tms320c64x->funit.crosspath == 1
      • Is this a bug in the test program?
  • int8_t funit.parallel
    • test_tms320c64x.c checks if parallel is 1, printing "true" or "false"
    • TMS320C64xDisassembler.c only assigns -1, 0, 1
    • Is there a bug in test_tms320c64x.c? Should it check if != 0 instead?

tmfink avatar Mar 30 '19 17:03 tmfink

I'll update the PR to only change the types as listed above.

It will be easier to "flatten" types later.

tmfink avatar Mar 30 '19 17:03 tmfink

@fotisl I changed the PR to only change the types. Does that look okay to you? There are still the unresolved questions in the original comment. Thanks!

tmfink avatar Mar 31 '19 02:03 tmfink

@fotisl Any update?

tmfink avatar Apr 17 '19 02:04 tmfink

@aquynh if this looks good to you, could you merge?

tmfink avatar May 02 '19 00:05 tmfink

@fotisl, can you ack?

aquynh avatar May 03 '19 15:05 aquynh

Thank you for the PR! I closed it because it is out of date. With the new auto-sync update for v6 we made many changes to some main architectures and will do also to others. This also changed the requirements we have now for new PRs.

If you still want to merge the changes, please rebase your fix onto the newest next branch and open a new PR.

Rot127 avatar Mar 20 '24 09:03 Rot127