kaitai_struct_cpp_stl_runtime

Override uint64_t generated type for bit types

Open cmilhaupt opened this issue 4 years ago • 2 comments

Given the following KSY definition:

meta:
    id: bits_example
    endian: be
    bit-endian: be
seq:
- id: first_bit
  type: b2
- id: second_bit
  type: b2
- id: third_bit
  type: b2
- id: fourth_bit
  type: b2

The following C++ header is generated:

# kaitai-struct-compiler -t cpp_stl --cpp-standard 11 bit_example.ksy
# cat bits_example.h | grep uint64
    uint64_t m_first_bit;
    uint64_t m_second_bit;
    uint64_t m_third_bit;
    uint64_t m_fourth_bit;
    uint64_t first_bit() const { return m_first_bit; }
    uint64_t second_bit() const { return m_second_bit; }
    uint64_t third_bit() const { return m_third_bit; }
    uint64_t fourth_bit() const { return m_fourth_bit; }

Is there any way in the KSY file definition to make these generate as uint8_t to save some space? I tried type: b2.as<u1>, having seen something similar for arrays, but this gives me the following error:

bit_example.ksy: /seq/0: error: parsing expression 'b2.as<[]u1>' failed on 1:3, expected "::" | CharsWhile(Set( , n)) | "\\\n" | End

Any help or clarifications would be appreciated.

cmilhaupt avatar Oct 24 '21 00:10 cmilhaupt

@cmilhaupt:

Is there any way in the KSY file definition to make these generate as uint8_t to save some space?

You have roughly the following options:

  1. Instead of using bX types, parse a u1 int and then unpack it with value instances manually - but you'll have to cast the values like .as<u1> again, because the compiler will by default assign type int32_t to any instance with a non-trivial (i.e. other than identity) integer expression:

    seq:
      - id: packed
        type: u1
    instances:
      a:
        value: ((packed & 0b1100_0000) >> 6).as<u1>
      b:
        value: ((packed & 0b0011_0000) >> 4).as<u1>
      c:
        value: ((packed & 0b0000_1100) >> 2).as<u1>
      d:
        value: ((packed & 0b0000_0011) >> 0).as<u1>
    
  2. Use the bX types, fork the compiler and adapt its behavior for your needs.

    I understand that it may sound intimidating at first, but it is by far the most elegant option. It should be easy in this case. First, change the target type (CppCompiler.scala:1108): you should choose the smallest type from uint{8,16,32,64}_t that the value fits into according to the width attribute (see the definition of BitsType). Second, cast the result of the read_bits_int_*() call (which returns uint64_t - see kaitai_struct_cpp_stl_runtime / kaitai/kaitaistream.h:151) to the target type, i.e. change the line CppCompiler.scala:773 - here is an example line doing just that for inspiration. Changing these two lines should be enough, I think. Then build the modified compiler from source, which the sbt tool makes really straightforward: you simply download it and run these commands to build it for the JVM/JavaScript environment respectively.

  3. Compile the spec using the bX types normally and then patch the generated code manually, or write a script that loops over regex matches and does the modifications for you. This may work, but it is quite error-prone and chances are you'll end up with invalid C++ code. The previous option is definitely more reliable and almost certainly easier if you ask me.
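
To make options 1 and 2 more concrete, here is a minimal C++11 sketch of what each amounts to on the C++ side. It is not output of the compiler and not part of the runtime: `Unpacked`, `unpack_b2x4`, `smallest_uint` and `narrow_bits` are hypothetical names made up for illustration. The first part mirrors the masks and shifts of the value instances in option 1; the second part shows the "smallest type that fits the width" rule and the narrowing cast described in option 2.

```cpp
#include <cstdint>
#include <type_traits>

// Option 1, done by hand: split one byte into four 2-bit fields,
// most significant bits first (the same order as bit-endian: be).
// Field names a..d follow the value instances in the KSY snippet.
struct Unpacked {
    std::uint8_t a, b, c, d;
};

inline Unpacked unpack_b2x4(std::uint8_t packed) {
    return Unpacked{
        static_cast<std::uint8_t>((packed & 0xC0u) >> 6),
        static_cast<std::uint8_t>((packed & 0x30u) >> 4),
        static_cast<std::uint8_t>((packed & 0x0Cu) >> 2),
        static_cast<std::uint8_t>((packed & 0x03u) >> 0),
    };
}

// Option 2, the type-selection rule: the smallest of uint{8,16,32,64}_t
// that can hold Width bits. Hypothetical helper, not a Kaitai API.
template <unsigned Width>
struct smallest_uint {
    static_assert(Width >= 1 && Width <= 64, "bit width must be in 1..64");
    using type = typename std::conditional<
        (Width <= 8), std::uint8_t,
        typename std::conditional<
            (Width <= 16), std::uint16_t,
            typename std::conditional<
                (Width <= 32), std::uint32_t, std::uint64_t
            >::type
        >::type
    >::type;
};

// Stand-in for the cast a patched generator would wrap around the
// uint64_t value returned by a read_bits_int_*() call.
template <unsigned Width>
typename smallest_uint<Width>::type narrow_bits(std::uint64_t raw) {
    return static_cast<typename smallest_uint<Width>::type>(raw);
}
```

For example, `unpack_b2x4(0xB4)` splits the bits `10 11 01 00` into the fields 2, 3, 1 and 0, and `narrow_bits<2>(raw)` yields a `uint8_t`. The cast is lossless only as long as `raw` really fits into `Width` bits, which is what a bit-sized read is expected to return.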

generalmimon avatar Oct 24 '21 09:10 generalmimon

@generalmimon thanks for the quick reply! Option 2 is certainly most elegant. I've never worked with Scala before, but I'll take a stab at it when I find some time. Thanks for linking to the lines that would need to change and for the example as well.

Follow-up question: would that approach adapt to arrays as well? I.e. if I have

- id: ex_arr
  type: b2
  repeat: expr
  repeat-expr: 4

would a std::vector<uint8_t> be generated automatically or would I need to adapt CppCompiler.scala elsewhere? Also could this fix be merged into the mainline? Thanks again.
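
For a sense of what is at stake with arrays, here is a small sketch comparing element storage for the two vector types. It assumes the unmodified compiler stores repeated b2 fields as 64-bit elements, in line with the scalar fields above; `element_bytes` and `narrow_to_u8` are hypothetical helpers, not generated code, and the sketch says nothing about what a patched compiler would actually emit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes occupied by the elements themselves, ignoring the vector
// object and any allocator overhead.
template <typename T>
std::size_t element_bytes(const std::vector<T>& v) {
    return v.size() * sizeof(T);
}

// Hypothetical narrowing copy: each element is assumed to fit in 8 bits,
// which holds for values read as b2.
inline std::vector<std::uint8_t> narrow_to_u8(const std::vector<std::uint64_t>& wide) {
    std::vector<std::uint8_t> out;
    out.reserve(wide.size());
    for (std::uint64_t v : wide) {
        out.push_back(static_cast<std::uint8_t>(v));
    }
    return out;
}
```

With the four-element ex_arr above, the 64-bit version needs 32 bytes of element storage and the 8-bit version needs 4.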

cmilhaupt avatar Oct 24 '21 23:10 cmilhaupt