terminator: support multi-byte termination bytes

Open davidhicks opened this issue 8 years ago • 5 comments

In JPEG Interchange Format (including JFIF and SPIFF), the scan segment contains compressed data whose length is not known until the compressed data has been fully read from the file. It is possible, however, to look for a 0xFF byte in the compressed data, which is followed by 0x00 if it is to be ignored (escaped), or by another byte (which can take multiple values) that denotes the next segment of the file.
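To make the scan rule above concrete, here is a minimal Python sketch of the search. The function name `find_scan_end` is hypothetical and not part of any Kaitai Struct runtime. It only follows the rule as stated here (0xFF followed by 0x00 is escaped, 0xFF followed by anything else ends the data) and makes no attempt to treat any particular marker value specially.

```python
def find_scan_end(data: bytes, start: int = 0) -> int:
    """Return the index of the first 0xFF not followed by 0x00, or -1.

    Hypothetical helper: illustrates the rule described in this issue only.
    """
    pos = start
    while True:
        pos = data.find(b"\xff", pos)
        # No 0xFF left, or 0xFF is the very last byte (nothing follows it).
        if pos == -1 or pos + 1 >= len(data):
            return -1
        if data[pos + 1] != 0x00:
            return pos  # unescaped marker: compressed data ends here
        pos += 2  # skip the escaped pair 0xFF 0x00 and keep looking


# Example: the first 0xFF is escaped, the second one starts a marker.
sample = b"\x12\xff\x00\x34\xff\xd9"
end = find_scan_end(sample)
compressed = sample[:end]
```

The returned index points at the 0xFF itself, so the marker bytes stay available to whatever parses the next segment.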

Ideally there would be a construct similar to:

- id: compressed_data
  terminator:
    - [0xFF, 0xAA] #next_marker_1
    - [0xFF, 0xBB] #next_marker_2
  consume: false
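As a rough illustration of what such a construct could mean at runtime, the Python sketch below reads until any one of several byte sequences appears and, when `consume` is false, rewinds so the terminator is left in the stream. `read_until_any` is a hypothetical helper written for this comment, not an existing runtime function, and it assumes a seekable stream.

```python
import io


def read_until_any(stream, terminators, consume=False):
    """Read until any terminator sequence is seen (hypothetical helper).

    Returns (body, matched_terminator); matched_terminator is None on EOF.
    """
    buf = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            return bytes(buf), None  # EOF reached without a terminator
        buf += b
        for term in terminators:
            if buf.endswith(term):
                body = bytes(buf[:-len(term)])
                if not consume:
                    # Leave the terminator in the stream for the next field.
                    stream.seek(-len(term), io.SEEK_CUR)
                return body, term


stream = io.BytesIO(b"abc\xff\xbbrest")
body, term = read_until_any(stream, [b"\xff\xaa", b"\xff\xbb"])
```

Checking every terminator after every byte is simple but not efficient; it is only meant to pin down the intended behaviour.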

Wildcard bytes, regular expressions, number ranges and other helpers could also be of assistance in defining terminators in other file formats.

davidhicks avatar Apr 24 '17 02:04 davidhicks

Assign this to me

rodmartin30 avatar Mar 14 '19 13:03 rodmartin30

Instead of supporting only multi-byte sequences as a terminator, I suggest generalizing by supporting a rule as a terminator, so that a constant multi-byte sequence would be a particular case.

dgutson avatar Mar 14 '19 14:03 dgutson

@GreyCat Can this be a temporary implementation until #538 is specified? If so, please assign this to me, since we need to finish the JPEG.

rodmartin30 avatar Mar 19 '19 14:03 rodmartin30

I have been working on handling multi-byte terminators.

Let's suppose that the Scala changes to switch the type of terminator from int to Array[Byte] have been made. (I just replaced the int type with Array[Byte] and made some minor changes, but I would like to write a separate issue about that.)

One useful thing to know is the KMP (Knuth-Morris-Pratt) algorithm, which finds matches in O(N + M), where N is the length of the pattern and M is the length of the text. Because of this, the complexity of 'read_bytes_term()' doesn't change.
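For readers unfamiliar with KMP, here is a minimal, self-contained Python sketch of the search over bytes. It is an illustration only and is not the code from the commit mentioned below; the names `build_failure` and `kmp_find` are made up for this example.

```python
def build_failure(pattern: bytes) -> list:
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(text: bytes, pattern: bytes) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = build_failure(pattern)
    k = 0  # number of pattern bytes matched so far
    for i, byte in enumerate(text):
        while k > 0 and byte != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading earlier text
        if byte == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


offset = kmp_find(b"abc\xff\xbbrest", b"\xff\xbb")
```

Because the text is consumed one byte at a time and never revisited, the same matching loop can in principle be driven from a stream, which is what makes it a plausible fit for a terminator search.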

Here is the python-runtime commit with the changes: python-runtime

rodmartin30 avatar Mar 25 '19 15:03 rodmartin30

Any progress on this?

StefanRickli avatar May 16 '22 17:05 StefanRickli