kaitai_struct
Infinite recursion via `size` expression, speculative peeking(?)
I wanted to describe a "file list"-like file format: an unaligned sequence of subfiles, each of which starts with its own size.
I wrote it so that the size is looked up inside the `file` type itself:
```yaml
# ...
seq:
  - id: files
    type: file
    size: files[_index].header.file_size
    repeat: eos
types:
  file_header:
    seq:
      - id: file_size
        type: u4
  file:
    seq:
      - id: header
        type: file_header
      - id: content
        type: u4
        repeat: eos
```
But this does not compile. After some experimenting, I found that it does compile with `instances`!
```yaml
# ...
instances:
  files:
    pos: 0
    type: file
    size: files[_index].header.file_size
    repeat: eos
types:
  file_header:
    seq:
      - id: file_size
        type: u4
  file:
    seq:
      - id: header
        type: file_header
      - id: content
        type: u4
        repeat: eos
```
But the generated code is broken. When `files` is evaluated in the Web IDE, `files[_index]` is `undefined` in the `size` expression. Another odd point: `_` doesn't work as a special variable in `size` expressions, maybe because there has been no reason for it yet.
It gets worse:
```yaml
# ...
instances:
  first:
    pos: 0
    type: block0
    size: first.header.block_size
  others:
    pos: first.header.block_size
    type: block1
    size: others[_index].header.block_size
    repeat: eos
types:
  block_header:
    seq:
      - id: block_size
        type: u4
  block0:
    seq:
      - id: header
        type: block_header
      - id: body
        type: u4
        repeat: eos
  block1:
    seq:
      - id: header
        type: block_header
      - id: body
        type: f4
        repeat: eos
```
When evaluated, this recurses infinitely, because `_root.first` accesses itself to obtain the size of its KaitaiStream.
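The self-reference can be reproduced outside of Kaitai Struct. The following is a minimal Python sketch of the problem, not the code the compiler actually generates: `Root`, `Block`, and the little-endian `u4` layout are assumptions made for illustration. The lazy getter for `first` needs `first` itself to learn how large its substream should be, so it re-enters itself until the interpreter gives up.

```python
import struct


class Block:
    """Hypothetical stand-in for a parsed block: a u4 size, then the body."""

    def __init__(self, data: bytes):
        # Assumption: the size field is a little-endian u4.
        self.block_size = struct.unpack_from("<I", data, 0)[0]
        self.body = data[4:]


class Root:
    """Hypothetical stand-in for the root type with a lazy `first` instance."""

    def __init__(self, data: bytes):
        self._data = data

    @property
    def first(self) -> Block:
        # The size expression `first.header.block_size` must be evaluated
        # before the substream for `first` exists, so it calls this getter again.
        size = self.first.block_size
        return Block(self._data[:size])


def demo() -> str:
    root = Root(struct.pack("<I", 8) + b"abcd")
    try:
        root.first
    except RecursionError:
        return "infinite recursion"
    return "parsed"


print(demo())
```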
The only practical solution I know of is to strictly separate the header part from the body part in a `seq` and subtract the header size to obtain the body size. A `valid` expression would be needed to enforce a size constraint on the header.
The optimal solution in theory would be to "speculatively peek" parts of the field (just as much as is needed to evaluate the size expression) right before constructing the KaitaiStream (because conceptually, KaitaiStreams have a fixed size). At construction, if the already peeked part turns out to be larger than the computed size, a runtime error would be raised. (This is a convenient, non-deterministic way of describing the computation.)
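As a rough illustration of the idea, here is a minimal Python sketch of speculative peeking over a plain byte stream. It is not a Kaitai Struct feature or API: `peek_size`, `read_files`, and the little-endian `u4` size field are assumptions. The size is read without consuming it, the stream is rewound, and only then is the fixed-size chunk for the subfile cut out. If the peeked part is larger than the computed size, an error is raised.

```python
import io
import struct
from typing import List

HEADER = struct.Struct("<I")  # assumption: little-endian u4 size field


def peek_size(stream: io.BytesIO) -> int:
    """Read the size field, then rewind so the stream position is unchanged."""
    start = stream.tell()
    raw = stream.read(HEADER.size)
    stream.seek(start)
    if len(raw) < HEADER.size:
        raise EOFError("truncated size field")
    return HEADER.unpack(raw)[0]


def read_files(data: bytes) -> List[bytes]:
    """Split data into subfiles; each size field counts the whole subfile."""
    stream = io.BytesIO(data)
    files = []
    while stream.tell() < len(data):
        size = peek_size(stream)
        if size < HEADER.size:
            # The part that was peeked is larger than the computed size.
            raise ValueError(f"size {size} is smaller than the peeked header")
        chunk = stream.read(size)
        if len(chunk) < size:
            raise EOFError("subfile extends past the end of the stream")
        # A fixed-size substream for the subfile would be built from chunk.
        files.append(chunk)
    return files
```

Each returned chunk still starts with its own header, which matches the original intent of letting the `file` type parse its size field itself.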
I believe this behaviour would have to be requested explicitly, because of the increased computational overhead compared to the standard case.
=> An attribute like `speculative-size`, whose expression is evaluated before the KaitaiStream is created. A more general alternative is theoretically possible, like `speculative-value` for `instances`, which could then be used normally in an expression.
Depending on your goals, this could be overkill, and an existing, less convenient solution might be enough. In that case, I'd recommend adding this as an example in the documentation to warn against the problem.
At least, speculative peeking is how I implemented message parsing for binary protocols in C for an introductory network programming class.
Happy new year, all!
You can try to move `size` to your `content` attribute:
```yaml
seq:
  - id: files
    type: file
    repeat: eos
types:
  file_header:
    seq:
      - id: file_size
        type: u4
  file:
    seq:
      - id: header
        type: file_header
      - id: content
        # type: u4
        size: header.file_size - sizeof<file_header>
        repeat: eos
```
Unfortunately, due to #788 you cannot use built-in types, such as `u4`, as your `content` type.
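To make the arithmetic concrete, here is a minimal Python sketch of what this layout amounts to, under the assumptions that `file_size` is a little-endian `u4` and counts the header as well (as the subtraction implies); `parse_file` and `parse_files` are hypothetical names. The header is read first, the body size is derived from it, and the raw body is decoded into `u4` values in a separate step, since `content` itself stays untyped.

```python
import io
import struct
from typing import List

U4 = struct.Struct("<I")  # assumption: little-endian u4
SIZEOF_FILE_HEADER = U4.size  # file_header is a single u4


def parse_file(stream: io.BytesIO) -> List[int]:
    """Parse one subfile: header first, then a body of the remaining size."""
    raw = stream.read(SIZEOF_FILE_HEADER)
    if len(raw) < SIZEOF_FILE_HEADER:
        raise EOFError("truncated file_header")
    (file_size,) = U4.unpack(raw)
    # Mirrors `header.file_size - sizeof<file_header>`.
    body_size = file_size - SIZEOF_FILE_HEADER
    if body_size < 0 or body_size % U4.size != 0:
        raise ValueError(f"invalid file_size {file_size}")
    content = stream.read(body_size)
    if len(content) < body_size:
        raise EOFError("content extends past the end of the stream")
    # The body is raw bytes; decoding it into u4 values is a separate step.
    return [value for (value,) in U4.iter_unpack(content)]


def parse_files(data: bytes) -> List[List[int]]:
    """Parse subfiles back to back until the end of the data."""
    stream = io.BytesIO(data)
    files = []
    while stream.tell() < len(data):
        files.append(parse_file(stream))
    return files
```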
Thank you! Slightly off-topic: is `sizeof<file_header>` the type-specific counterpart to `header._sizeof`?
Yes.